TY - JOUR
T1 - Word boundary detection in broad class and phoneme strings
AU - Harrington, Jonathan
AU - Watson, Gordon
AU - Cooper, Maggie
PY - 1989
Y1 - 1989
N2 - This paper explores the number of word boundaries which can be detected from sequences of phonemes and broad classes in continuous speech transcriptions. In the first part of the paper, word boundaries are detected from sequences of three phonemes which occur across word boundaries but which are excluded word-internally. When such sequences are matched against phonemic transcriptions of 145 utterances, it is shown that around 37% of all word boundaries can be correctly identified. When the same transcriptions are represented by broad classes rather than phonemes, knowledge of sequences which span word boundaries but which do not occur word-internally is almost completely ineffective for the purpose of word boundary detection. Instead, it is shown that a version of the model discussed in Cutler & Norris (1988), based on the distinction between "strong" and "weak" vowels, enables over 40% of word boundaries to be correctly located at the broad class level, although many word boundaries are also inserted at inappropriate points. The implications of these kinds of word boundary detection strategies for models of lexical access in a continuous speech recognizer are also discussed.
AB - This paper explores the number of word boundaries which can be detected from sequences of phonemes and broad classes in continuous speech transcriptions. In the first part of the paper, word boundaries are detected from sequences of three phonemes which occur across word boundaries but which are excluded word-internally. When such sequences are matched against phonemic transcriptions of 145 utterances, it is shown that around 37% of all word boundaries can be correctly identified. When the same transcriptions are represented by broad classes rather than phonemes, knowledge of sequences which span word boundaries but which do not occur word-internally is almost completely ineffective for the purpose of word boundary detection. Instead, it is shown that a version of the model discussed in Cutler & Norris (1988), based on the distinction between "strong" and "weak" vowels, enables over 40% of word boundaries to be correctly located at the broad class level, although many word boundaries are also inserted at inappropriate points. The implications of these kinds of word boundary detection strategies for models of lexical access in a continuous speech recognizer are also discussed.
UR - http://www.scopus.com/inward/record.url?scp=0008014138&partnerID=8YFLogxK
U2 - 10.1016/0885-2308(89)90004-1
DO - 10.1016/0885-2308(89)90004-1
M3 - Article
AN - SCOPUS:0008014138
SN - 0885-2308
VL - 3
SP - 367
EP - 382
JO - Computer Speech and Language
JF - Computer Speech and Language
IS - 4
ER -