TY - GEN
T1 - Neural constituency parsing of speech transcripts
AU - Jamshid Lou, Paria
AU - Wang, Yufei
AU - Johnson, Mark
PY - 2019/6
Y1 - 2019/6
N2 - This paper studies the performance of a neural self-attentive parser on transcribed speech. Speech presents parsing challenges that do not appear in written text, such as the lack of punctuation and the presence of speech disfluencies (including filled pauses, repetitions, corrections, etc.). Disfluencies are especially problematic for conventional syntactic parsers, which typically fail to find any EDITED disfluency nodes at all. This motivated the development of special disfluency detection systems, and special mechanisms added to parsers specifically to handle disfluencies. However, we show here that neural parsers can find EDITED disfluency nodes, and the best neural parsers find them with an accuracy surpassing that of specialized disfluency detection systems, thus making these specialized mechanisms unnecessary. This paper also investigates a modified loss function that puts more weight on EDITED nodes. It also describes tree-transformations that simplify the disfluency detection task by providing alternative encodings of disfluencies and syntactic information.
UR - http://www.scopus.com/inward/record.url?scp=85085584640&partnerID=8YFLogxK
M3 - Conference proceeding contribution
T3 - NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference
SP - 2756
EP - 2765
BT - The 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies
A2 - Burstein, Jill
A2 - Doran, Christy
A2 - Solorio, Thamar
PB - Association for Computational Linguistics (ACL)
CY - Stroudsburg, PA
T2 - 2019 Annual Conference of the North American Chapter of the Association for Computational Linguistics
Y2 - 2 June 2019 through 7 June 2019
ER -