Abstract
We present a novel Bayesian model for semi-supervised part-of-speech tagging. Our model extends the Latent Dirichlet Allocation model and incorporates the intuition that words' distributions over tags, p(t|w), are sparse. In addition, we introduce a model for determining the set of possible tags of a word, which captures important dependencies in the ambiguity classes of words. Our model outperforms the best previously proposed model for this task on a standard dataset.
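To make the sparsity intuition concrete, the following minimal sketch draws per-word tag distributions p(t|w) from a symmetric Dirichlet prior and compares how concentrated they are for a small versus a large concentration parameter. It is an illustration only, not the model described in the abstract: the choice of a symmetric Dirichlet prior, the tag-set size of 10, and the helper names `sample_dirichlet` and `mean_max_prob` are all assumptions introduced here.

```python
import random


def sample_dirichlet(alpha, num_tags, rng):
    """Draw one p(t|w) vector from a symmetric Dirichlet(alpha) over num_tags tags.

    Hypothetical helper: normalizes independent Gamma(alpha, 1) draws.
    """
    draws = [rng.gammavariate(alpha, 1.0) for _ in range(num_tags)]
    total = sum(draws)
    if total == 0.0:
        # Guard against floating-point underflow when alpha is very small.
        return [1.0 / num_tags] * num_tags
    return [d / total for d in draws]


def mean_max_prob(alpha, num_tags=10, num_words=500, seed=0):
    """Average probability of the most likely tag across simulated words.

    Values near 1 mean most words are dominated by a single tag (sparse);
    values near 1 / num_tags mean mass is spread evenly (dense).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_words):
        total += max(sample_dirichlet(alpha, num_tags, rng))
    return total / num_words


# Assumed concentration values, chosen only to contrast the two regimes.
sparse = mean_max_prob(alpha=0.1)
dense = mean_max_prob(alpha=10.0)
print(f"mean max p(t|w), alpha=0.1:  {sparse:.3f}")
print(f"mean max p(t|w), alpha=10.0: {dense:.3f}")
```

With a concentration parameter below 1, most simulated words put nearly all of their probability on one or two tags, which is the kind of sparsity the abstract refers to. A larger value spreads the mass across many tags.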
Original language | English |
---|---|
Title of host publication | Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference |
Editors | John C. Platt, Daphne Koller, Yoram Singer, Sam T. Roweis |
Place of Publication | La Jolla, California |
Pages | 1464-1471 |
Number of pages | 8 |
Publication status | Published - 2009 |
Externally published | Yes |
Event | 21st Annual Conference on Neural Information Processing Systems, NIPS 2007 - Vancouver, BC, Canada; Duration: 3 Dec 2007 → 6 Dec 2007 |