Abstract
We present a simple architecture for parsing transcribed speech in which an edited-word detector first removes such words from the sentence string, and then a standard statistical parser trained on transcribed speech parses the remaining words. The edit detector achieves a misclassification rate on edited words of 2.2%. (The NULL-model, which marks everything as not edited, has an error rate of 5.9%.) To evaluate our parsing results we introduce a new evaluation metric, the purpose of which is to make evaluation of a parse tree relatively indifferent to the exact tree position of EDITED nodes. By this metric the parser achieves 85.3% precision and 86.5% recall.
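The two-stage architecture and the NULL-model baseline described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`remove_edited`, `null_model`, `misclassification_rate`, `parse_transcript`), the toy sentence, and its gold labels are all hypothetical, and the detector and parser are passed in as placeholder callables rather than real models.

```python
from typing import Callable, List, Sequence


def remove_edited(words: Sequence[str], is_edited: Sequence[bool]) -> List[str]:
    """Drop every word that the detector flagged as edited."""
    if len(words) != len(is_edited):
        raise ValueError("words and flags must have the same length")
    return [w for w, flagged in zip(words, is_edited) if not flagged]


def null_model(words: Sequence[str]) -> List[bool]:
    """Baseline detector: mark every word as not edited."""
    return [False] * len(words)


def misclassification_rate(predicted: Sequence[bool], gold: Sequence[bool]) -> float:
    """Fraction of words whose edited / not-edited label is wrong."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    if not gold:
        raise ValueError("cannot score an empty sequence")
    return sum(p != g for p, g in zip(predicted, gold)) / len(gold)


def parse_transcript(
    words: Sequence[str],
    detector: Callable[[Sequence[str]], Sequence[bool]],
    parser: Callable[[List[str]], object],
) -> object:
    """Stage 1: detect and remove edited words. Stage 2: parse what remains."""
    flags = detector(words)
    return parser(remove_edited(words, flags))


# Toy example (invented for illustration): "to Boston" is the edited material.
words = "I want a flight to Boston uh to Denver".split()
gold = [False, False, False, False, True, True, False, False, False]

cleaned = remove_edited(words, gold)
baseline_error = misclassification_rate(null_model(words), gold)
```

Because the NULL-model never predicts an edit, its error rate equals the proportion of words that are actually edited, which is why it serves as the natural baseline for the detector's misclassification rate.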
| Original language | English |
|---|---|
| Title of host publication | Proceedings of the 2nd Meeting of the North American Chapter of the Association for Computational Linguistics |
| Subtitle of host publication | NAACL 2001 |
| Place of Publication | Stroudsburg, PA |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 118-126 |
| Number of pages | 9 |
| Publication status | Published - 2001 |
| Externally published | Yes |
| Event | Meeting of the North American Chapter of the Association for Computational Linguistics (2nd : 2001) - Pittsburgh, United States. Duration: 1 Jun 2001 → 7 Jun 2001 |
Conference
| Conference | Meeting of the North American Chapter of the Association for Computational Linguistics (2nd : 2001) |
|---|---|
| Abbreviated title | NAACL '01 |
| Country/Territory | United States |
| City | Pittsburgh |
| Period | 1 Jun 2001 → 7 Jun 2001 |
Bibliographical note
Copyright the Publisher 2001. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
Paper title: Edit detection and parsing for transcribed speech