Abstract
Self-attentive neural syntactic parsers using contextualized word embeddings (e.g. ELMo or BERT) currently produce state-of-the-art results in joint parsing and disfluency detection in speech transcripts. Since the contextualized word embeddings are pre-trained on a large amount of unlabeled data, using additional unlabeled data to train a neural model might seem redundant. However, we show that self-training - a semi-supervised technique for incorporating unlabeled data - sets a new state-of-the-art for the self-attentive parser on disfluency detection, demonstrating that self-training provides benefits orthogonal to the pre-trained contextualized word representations. We also show that ensembling self-trained parsers provides further gains for disfluency detection.
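The self-training procedure the abstract refers to can be illustrated with a minimal sketch: train a base model on labeled data, predict labels for unlabeled data, add only the high-confidence pseudo-labels back into the training set, and retrain. The toy 1-D nearest-centroid classifier, the confidence `threshold`, and the data below are illustrative assumptions standing in for the paper's self-attentive parser, not its actual setup.

```python
# Self-training sketch: a toy nearest-centroid classifier stands in
# for the paper's self-attentive parser (illustrative assumption).

def train(examples):
    """Fit one centroid per class from (x, label) pairs."""
    sums, counts = {}, {}
    for x, y in examples:
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict(model, x):
    """Return (label, confidence): confidence is the margin between
    the nearest and second-nearest class centroid."""
    dists = sorted((abs(x - c), y) for y, c in model.items())
    (d1, y1), (d2, _) = dists[0], dists[1]
    return y1, d2 - d1

def self_train(labeled, unlabeled, threshold=1.0, rounds=3):
    """Iteratively pseudo-label confident unlabeled points and retrain."""
    model = train(labeled)
    pool = list(unlabeled)
    for _ in range(rounds):
        confident, remaining = [], []
        for x in pool:
            y, conf = predict(model, x)
            (confident if conf >= threshold else remaining).append((x, y))
        if not confident:          # nothing crossed the threshold: stop
            break
        labeled = labeled + confident
        pool = [x for x, _ in remaining]
        model = train(labeled)     # retrain on labeled + pseudo-labeled
    return model

labeled = [(0.0, "fluent"), (10.0, "disfluent")]
unlabeled = [1.0, 2.0, 9.0, 8.5, 5.1]
model = self_train(labeled, unlabeled)
```

After one round, the four confidently classified points shift the centroids, while the ambiguous point (5.1) is left unlabeled rather than risk a noisy pseudo-label; that selectivity is what makes self-training complementary to pre-training rather than redundant.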
| Original language | English |
|---|---|
| Title of host publication | The 58th Annual Meeting of the Association for Computational Linguistics |
| Subtitle of host publication | Proceedings of the Conference |
| Place of Publication | Stroudsburg, PA |
| Publisher | Association for Computational Linguistics (ACL) |
| Pages | 3754-3763 |
| Number of pages | 10 |
| ISBN (Print) | 9781952148255 |
| Publication status | Published - 2020 |
| Event | 58th Annual Meeting of the Association for Computational Linguistics (ACL) - Duration: 5 Jul 2020 → 10 Jul 2020 |
Publication series
| Name | Proceedings of the Annual Meeting of the Association for Computational Linguistics |
|---|---|
| ISSN (Print) | 0736-587X |
Conference
| Conference | 58th Annual Meeting of the Association for Computational Linguistics (ACL) |
|---|---|
| Period | 5/07/20 → 10/07/20 |
Bibliographical note
Version archived for private and non-commercial use with the permission of the author/s and according to publisher conditions. For further rights please contact the publisher.
Projects
Improved syntactic and semantic analysis for natural language processing
Johnson, M. (Primary Chief Investigator) & Steedman, M. (Chief Investigator)
30/06/16 → 31/12/21
Project: Research