Abstract
Stress is a useful cue for English word segmentation. A wide range of computational models have found that stress cues enable a 2-10% improvement in segmentation accuracy, depending on the kind of model, by using input that has been annotated with stress using a pronouncing dictionary. However, stress is neither invariably produced nor unambiguously identifiable in real speech. Heavy syllables, i.e. those with long vowels or syllable codas, attract stress in English. We devise Adaptor Grammar word segmentation models that exploit either stress, or syllable weight, or both, and evaluate the utility of syllable weight as a cue to word boundaries. Our results suggest that syllable weight encodes largely the same information for word segmentation in English that annotated dictionary stress does.
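The definition of a heavy syllable given in the abstract (a long vowel or a syllable coda) can be illustrated with a minimal sketch. This is not the paper's Adaptor Grammar model. The `Syllable` type, the `is_heavy` function, and the ARPAbet-style `LONG_VOWELS` set are hypothetical names chosen for illustration, and which vowels count as long is an assumption.

```python
from dataclasses import dataclass
from typing import Tuple

# Assumption: ARPAbet-style vowel symbols. Which vowels count as "long" is an
# illustrative choice here, not the inventory used in the paper.
LONG_VOWELS = frozenset({"IY", "EY", "AY", "OW", "UW", "AW", "OY"})


@dataclass(frozen=True)
class Syllable:
    """Hypothetical syllable representation: onset, nucleus vowel, coda."""
    onset: Tuple[str, ...]
    nucleus: str
    coda: Tuple[str, ...]


def is_heavy(syl: Syllable) -> bool:
    """A syllable is heavy if its vowel is long or its coda is non-empty."""
    return syl.nucleus in LONG_VOWELS or len(syl.coda) > 0


# Three example syllables: closed, open with a short vowel, open with a long vowel.
closed = Syllable(("D",), "AO", ("G",))    # has a coda
open_short = Syllable(("DH",), "AH", ())   # short vowel, no coda
open_long = Syllable(("S",), "IY", ())     # long vowel, no coda

weights = [is_heavy(s) for s in (closed, open_short, open_long)]
```

Under these assumptions, a segmentation model could consume the resulting heavy/light labels in place of dictionary stress marks, which is the substitution the abstract evaluates.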
Original language | English |
---|---|
Title of host publication | EMNLP 2014 - 2014 Conference on Empirical Methods in Natural Language Processing, Proceedings of the Conference |
Place of Publication | Stroudsburg, PA |
Publisher | Association for Computational Linguistics (ACL) |
Pages | 844-853 |
Number of pages | 10 |
ISBN (Electronic) | 9781937284961 |
Publication status | Published - 2014 |
Event | 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014 - Doha, Qatar. Duration: 25 Oct 2014 → 29 Oct 2014 |
Other
Other | 2014 Conference on Empirical Methods in Natural Language Processing, EMNLP 2014 |
---|---|
Country/Territory | Qatar |
City | Doha |
Period | 25/10/14 → 29/10/14 |