Abstract
We present a novel Bayesian model for semi-supervised part-of-speech tagging. Our model extends the Latent Dirichlet Allocation model and incorporates the intuition that words' distributions over tags, p(t|w), are sparse. In addition, we introduce a model for determining the set of possible tags of a word, which captures important dependencies in the ambiguity classes of words. Our model outperforms the best previously proposed model for this task on a standard dataset.
| Original language | English |
|---|---|
| Title of host publication | Advances in Neural Information Processing Systems 20 - Proceedings of the 2007 Conference |
| Editors | John C. Platt, Daphne Koller, Yoram Singer, Sam T. Roweis |
| Place of Publication | La Jolla, California |
| Pages | 1464-1471 |
| Number of pages | 8 |
| Publication status | Published - 2009 |
| Externally published | Yes |
| Event | 21st Annual Conference on Neural Information Processing Systems, NIPS 2007 - Vancouver, BC, Canada. Duration: 3 Dec 2007 → 6 Dec 2007 |