We present a simple architecture for parsing transcribed speech in which an edited-word detector first removes such words from the sentence string, and then a standard statistical parser trained on transcribed speech parses the remaining words. The edit detector achieves a misclassification rate on edited words of 2.2%. (The NULL-model, which marks everything as not edited, has an error rate of 5.9%.) To evaluate our parsing results we introduce a new evaluation metric, the purpose of which is to make evaluation of a parse tree relatively indifferent to the exact tree position of EDITED nodes. By this metric the parser achieves 85.3% precision and 86.5% recall.
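The two-stage architecture and the error-rate comparison described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' implementation: `toy_detector` is a hypothetical repetition heuristic standing in for the actual edit detector, the `parser` argument is a stand-in for any statistical parser, and the example sentence and its gold labels are invented for demonstration.

```python
from typing import Callable, List, Sequence


def remove_edited(words: Sequence[str], is_edited: Sequence[bool]) -> List[str]:
    """Drop every word flagged as edited, keeping the rest in order."""
    return [w for w, flagged in zip(words, is_edited) if not flagged]


def misclassification_rate(predicted: Sequence[bool], gold: Sequence[bool]) -> float:
    """Fraction of words whose edited / not-edited label is wrong."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    wrong = sum(p != g for p, g in zip(predicted, gold))
    return wrong / len(gold)


def null_model(words: Sequence[str]) -> List[bool]:
    """Baseline from the abstract: mark every word as not edited."""
    return [False] * len(words)


def toy_detector(words: Sequence[str]) -> List[bool]:
    """Hypothetical stand-in detector (an assumption, not the paper's method):
    flag a word if the same word recurs within the next two positions."""
    return [words[i] in words[i + 1:i + 3] for i in range(len(words))]


def parse_transcript(
    words: Sequence[str],
    detector: Callable[[Sequence[str]], List[bool]],
    parser: Callable[[List[str]], object],
) -> object:
    """Stage 1: detect and remove edited words. Stage 2: parse what remains."""
    flags = detector(words)
    return parser(remove_edited(words, flags))


# Invented example: the speaker restarts "to Boston" as "to Denver".
words = "I want a flight to Boston to Denver".split()
gold = [False, False, False, False, True, True, False, False]

null_error = misclassification_rate(null_model(words), gold)   # 2 of 8 words wrong
toy_error = misclassification_rate(toy_detector(words), gold)  # 1 of 8 words wrong
```

The NULL model is wrong exactly on the edited words, so its error rate equals the proportion of edited words in the data. That is why the 5.9% baseline is the natural reference point for the detector's 2.2%.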
|Title of host publication||Proceedings of the 2nd Meeting of the North American Chapter of the Association for Computational Linguistics|
|Subtitle of host publication||NAACL 2001|
|Place of Publication||Stroudsburg, PA|
|Publisher||Association for Computational Linguistics (ACL)|
|Number of pages||9|
|Publication status||Published - 2001|
|Event||Meeting of the North American Chapter of the Association for Computational Linguistics (2nd : 2001) - Pittsburgh, United States (Duration: 1 Jun 2001 → 7 Jun 2001)|
|Conference||Meeting of the North American Chapter of the Association for Computational Linguistics (2nd : 2001)|
|Abbreviated title||NAACL '01|
|Period||1 Jun 2001 → 7 Jun 2001|
|Bibliographical note||Copyright the Publisher 2001. Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.|
Charniak, E., & Johnson, M. (2001). Edit detection and parsing for transcribed speech. In Proceedings of the 2nd Meeting of the North American Chapter of the Association for Computational Linguistics: NAACL 2001 (pp. 118-126). Stroudsburg, PA: Association for Computational Linguistics (ACL). https://doi.org/10.3115/1073336.1073352