Unsupervised learning of multi-word verbs

Don Blaheta, Mark Johnson

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review


Collocation is a linguistic phenomenon that is difficult to define and harder to explain; it has been largely overlooked in the field of computational linguistics due to its difficulty. Although standard techniques exist for finding collocations, they tend to be rather noisy and suffer from sparse data problems. In this paper, we demonstrate that by utilising parsed input to concentrate on one very specific type of collocation (in this case, verbs with particles, a subset of the so-called “multi-word” verbs) and applying an algorithm to promote those collocations in which we have more confidence, the problems with statistically learning collocations can be overcome.
Original language: English
Title of host publication: Proceedings of the ACL 2001 workshop on collocation
Subtitle of host publication: computational extraction, analysis and exploitation
Place of Publication: Toulouse, France
Publisher: Association for Computational Linguistics (ACL)
Number of pages: 7
Publication status: Published - 2001
Externally published: Yes
Event: Association for Computational Linguistics Workshop on Collocation (2001) - Toulouse, France
Duration: 9 Jul 2001 to 11 Jul 2001


Conference: Association for Computational Linguistics Workshop on Collocation (2001)

