Neural constituency parsing of speech transcripts

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › Research › peer-review

Abstract

This paper studies the performance of a neural self-attentive parser on transcribed speech. Speech presents parsing challenges that do not appear in written text, such as the lack of punctuation and the presence of speech disfluencies (including filled pauses, repetitions, corrections, etc.). Disfluencies are especially problematic for conventional syntactic parsers, which typically fail to find any EDITED disfluency nodes at all. This motivated the development of special disfluency detection systems, and special mechanisms added to parsers specifically to handle disfluencies. However, we show here that neural parsers can find EDITED disfluency nodes, and the best neural parsers find them with an accuracy surpassing that of specialized disfluency detection systems, thus making these specialized mechanisms unnecessary. This paper also investigates a modified loss function that puts more weight on EDITED nodes. It also describes tree-transformations that simplify the disfluency detection task by providing alternative encodings of disfluencies and syntactic information.
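As an illustration of what an EDITED disfluency node marks, the following is a minimal sketch, not the paper's implementation. It assumes Penn Treebank-style bracketed trees; the function names `tokenize` and `edited_flags` and the example sentence are hypothetical and chosen only for this illustration. It reads a bracketed parse tree and flags every word dominated by an EDITED node, which is how a parser's output can be read as a word-level disfluency labeling.

```python
from typing import List, Tuple


def tokenize(tree: str) -> List[str]:
    """Split a bracketed tree string into '(', ')' and symbol tokens."""
    return tree.replace("(", " ( ").replace(")", " ) ").split()


def edited_flags(tree: str) -> List[Tuple[str, bool]]:
    """Return (word, is_edited) pairs; is_edited is True for every word
    that is dominated by an EDITED node somewhere above it."""
    tokens = tokenize(tree)
    labels: List[str] = []  # labels of the nodes that are currently open
    result: List[Tuple[str, bool]] = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok == "(":
            labels.append(tokens[i + 1])  # the node label follows the bracket
            i += 2
        elif tok == ")":
            labels.pop()
            i += 1
        else:
            # Any other symbol is a word sitting under a part-of-speech tag.
            result.append((tok, "EDITED" in labels))
            i += 1
    return result


# Hypothetical example: the first "I" is a repetition marked as EDITED.
example = "(S (EDITED (NP (PRP I))) (NP (PRP I)) (VP (VBP think) (ADVP (RB so))))"
for word, is_edited in edited_flags(example):
    print(word, "E" if is_edited else "_")
```

Under these assumptions, word-level disfluency detection scores can be computed by comparing the flags from a predicted tree with those from the gold tree.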
Language: English
Title of host publication: The 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
Subtitle of host publication: Proceedings of the Conference Vol. 1 (Long and Short Papers)
Editors: Jill Burstein, Christy Doran, Thamar Solorio
Publisher: Association for Computational Linguistics (ACL)
Pages: 2756-2765
Number of pages: 10
ISBN (Electronic): 9781950737130
Publication status: Published - Jun 2019
Event: 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics - Minneapolis, US
Duration: 2 Jun 2019 → …

Cite this

Jamshid Lou, P., Wang, Y., & Johnson, M. (2019). Neural constituency parsing of speech transcripts. In J. Burstein, C. Doran, & T. Solorio (Eds.), The 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies: Proceedings of the Conference Vol. 1 (Long and Short Papers) (pp. 2756-2765). Association for Computational Linguistics (ACL).
@inproceedings{7a1414d19f91445b8f90153292c4852e,
title = "Neural constituency parsing of speech transcripts",
author = "{Jamshid Lou}, Paria and Yufei Wang and Mark Johnson",
year = "2019",
month = "6",
language = "English",
pages = "2756--2765",
editor = "Jill Burstein and Christy Doran and Thamar Solorio",
booktitle = "The 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
publisher = "Association for Computational Linguistics (ACL)",

}
