Abstract
Learning to group words into phrases without supervision is a hard task for NLP systems, but infants routinely accomplish it. We hypothesize that infants use acoustic cues to prosody, which NLP systems typically ignore. To evaluate the utility of prosodic information for phrase discovery, we present an HMM-based unsupervised chunker that learns from only transcribed words and raw acoustic correlates to prosody. Unlike previous work on unsupervised parsing and chunking, we use neither gold standard part-of-speech tags nor punctuation in the input. Evaluated on the Switchboard corpus, our model outperforms several baselines that exploit either lexical or prosodic information alone, and, despite producing a flat structure, performs competitively with a state-of-the-art unsupervised lexicalized parser, with a substantial advantage in precision. Our results support the hypothesis that acoustic-prosodic cues provide useful evidence about syntactic phrases for language-learning infants.
Original language | English |
---|---|
Title of host publication | Proceedings of the 2nd workshop on cognitive modeling and computational linguistics |
Editors | Frank Keller, David Reitter |
Place of Publication | Madison, WI, USA |
Publisher | Association for Computational Linguistics |
Pages | 20-29 |
Number of pages | 10 |
ISBN (Print) | 9781932432954 |
Publication status | Published - 2011 |
Externally published | Yes |
Event | Workshop on cognitive modeling and computational linguistics, Portland, Oregon, USA (duration: 23 Jun 2011 → 23 Jun 2011) |