Abstract
This paper presents a novel metadata extraction (MDE) system for automatically detecting edited words, fillers, and self-interruption points in conversational speech. Our edited word detection sub-system combines a Tree Adjoining Grammar (TAG) noisy channel model, a statistical syntactic language model, and a MaxEnt reranker. Fillers are detected with hand-built, deterministic rules. Self-interruption points are then derived explicitly from the detected fillers and edited words. We evaluated our system on these three tasks with two types of input: manually annotated words and automatically recognized speech-to-text tokens. In all six cases (three tasks on two input types), our system improved on the state of the art, as measured in a recent blind evaluation.
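The last two stages described in the abstract (rule-based filler detection, then interruption points derived from fillers and edited words) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the filler lexicon, the function names `detect_fillers` and `interruption_points`, and the boundary convention (an interruption point at the end of each edited run and at the start of each filler run) are not taken from the paper, and the edited-word flags are supplied by hand rather than by the TAG noisy channel model and reranker.

```python
from typing import List, Set

# Hypothetical toy lexicon, for illustration only; not the paper's rule set.
FILLER_WORDS = {"uh", "um"}
FILLER_PHRASES = {("you", "know"), ("i", "mean")}


def detect_fillers(tokens: List[str]) -> List[bool]:
    """Flag each token that matches the toy filler lexicon (deterministic rules)."""
    lowered = [t.lower() for t in tokens]
    flags = [False] * len(tokens)
    i = 0
    while i < len(lowered):
        if tuple(lowered[i:i + 2]) in FILLER_PHRASES:
            # Two-word filler phrase: flag both tokens and skip past them.
            flags[i] = flags[i + 1] = True
            i += 2
        else:
            if lowered[i] in FILLER_WORDS:
                flags[i] = True
            i += 1
    return flags


def interruption_points(edited: List[bool], fillers: List[bool]) -> List[int]:
    """Return boundary indices b, where b means "between token b-1 and token b".

    Assumed convention: one boundary after the last token of each edited run
    and one boundary before the first token of each filler run.
    """
    points: Set[int] = set()
    n = len(edited)
    for i in range(n):
        # End of an edited run: boundary right after token i.
        if edited[i] and (i + 1 == n or not edited[i + 1]):
            points.add(i + 1)
        # Start of a filler run: boundary right before token i.
        if fillers[i] and (i == 0 or not fillers[i - 1]):
            points.add(i)
    return sorted(points)


tokens = "I want uh I need a flight".split()
# Edited-word flags would come from the edit detector; hand-set here.
edited = [True, True, False, False, False, False, False]
fillers = detect_fillers(tokens)
print(interruption_points(edited, fillers))
```

In this toy utterance the edited run "I want" ends exactly where the filler "uh" begins, so both rules point at the same boundary and it is reported once.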
Original language | English |
---|---|
Title of host publication | Rich Transcription 2004 Fall workshop (RT-04F) |
Place of Publication | Palisades, NY |
Number of pages | 7 |
Publication status | Published - 2004 |
Externally published | Yes |
Event | Rich Transcription 2004 Fall workshop - Palisades, United States. Duration: 7 Nov 2004 → 10 Nov 2004 |
Conference
Conference | Rich Transcription 2004 Fall workshop |
---|---|
Abbreviated title | RT-04F |
Country/Territory | United States |
City | Palisades |
Period | 7/11/04 → 10/11/04 |