Word boundary detection in broad class and phoneme strings

Jonathan Harrington*, Gordon Watson, Maggie Cooper

*Corresponding author for this work

Research output: Contribution to journal › Article

21 Citations (Scopus)

Abstract

This paper explores the number of word boundaries that can be detected from sequences of phonemes and broad classes in continuous speech transcriptions. In the first part of the paper, word boundaries are detected from sequences of three phonemes that occur across word boundaries but are excluded word-internally. When such sequences are matched against phonemic transcriptions of 145 utterances, around 37% of all word boundaries can be correctly identified. When the same transcriptions are represented by broad classes rather than phonemes, knowledge of sequences that span word boundaries but do not occur word-internally is almost completely ineffective for word boundary detection. Instead, it is shown that a version of the model discussed in Cutler & Norris (1988), based on the distinction between "strong" and "weak" vowels, enables over 40% of word boundaries to be correctly located at the broad class level, although many word boundaries are also inserted at inappropriate points. The implications of these kinds of word boundary detection strategies for models of lexical access in a continuous speech recognizer are also discussed.
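
As an illustration only, the three-phoneme strategy described in the abstract can be sketched as follows. The abstract does not give the authors' actual procedure, so everything here is an assumption: the function names (word_internal_trigrams, boundary_trigrams, detect_boundaries), the toy lexicon and its phoneme labels are hypothetical, and the simplification that a trigram spans at most one boundary is ours. The sketch tabulates trigrams that can arise across a boundary between two lexicon words but never occur inside a word, then posits a boundary wherever such a trigram appears in an unsegmented phoneme string.

```python
from typing import Dict, List, Set, Tuple

Trigram = Tuple[str, str, str]


def word_internal_trigrams(lexicon: List[List[str]]) -> Set[Trigram]:
    """Collect every three-phoneme sequence that occurs inside a single word."""
    seen: Set[Trigram] = set()
    for word in lexicon:
        for i in range(len(word) - 2):
            seen.add((word[i], word[i + 1], word[i + 2]))
    return seen


def boundary_trigrams(lexicon: List[List[str]]) -> Dict[Trigram, Set[int]]:
    """Map each boundary-only trigram to its possible boundary offsets.

    Offset 1 means the boundary falls after the first phoneme of the trigram,
    offset 2 after the second. Assumption: a trigram spans one boundary only.
    """
    internal = word_internal_trigrams(lexicon)
    final1 = {w[-1] for w in lexicon}
    final2 = {(w[-2], w[-1]) for w in lexicon if len(w) >= 2}
    initial1 = {w[0] for w in lexicon}
    initial2 = {(w[0], w[1]) for w in lexicon if len(w) >= 2}

    table: Dict[Trigram, Set[int]] = {}
    # One word-final phoneme followed by two word-initial phonemes: a | b c
    for a in final1:
        for b, c in initial2:
            if (a, b, c) not in internal:
                table.setdefault((a, b, c), set()).add(1)
    # Two word-final phonemes followed by one word-initial phoneme: a b | c
    for a, b in final2:
        for c in initial1:
            if (a, b, c) not in internal:
                table.setdefault((a, b, c), set()).add(2)
    return table


def detect_boundaries(phonemes: List[str],
                      table: Dict[Trigram, Set[int]]) -> List[int]:
    """Return indices i such that a word boundary is posited before phonemes[i].

    Only trigrams with a single possible offset are used; ambiguous ones
    are skipped rather than guessed.
    """
    found: Set[int] = set()
    for i in range(len(phonemes) - 2):
        offsets = table.get((phonemes[i], phonemes[i + 1], phonemes[i + 2]))
        if offsets is not None and len(offsets) == 1:
            found.add(i + next(iter(offsets)))
    return sorted(found)


# Toy lexicon (hypothetical phoneme labels): "the", "cat", "sat"
lexicon = [["dh", "ax"], ["k", "ae", "t"], ["s", "ae", "t"]]
table = boundary_trigrams(lexicon)
utterance = ["dh", "ax", "k", "ae", "t", "s", "ae", "t"]  # "the cat sat"
print(detect_boundaries(utterance, table))  # [2, 5]
```

In this toy case both boundaries are found because the lexicon is tiny. With a realistic lexicon many cross-boundary trigrams also occur word-internally, which is consistent with only a proportion of boundaries being recoverable this way.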

Original language: English
Pages (from-to): 367-382
Number of pages: 16
Journal: Computer Speech and Language
Volume: 3
Issue number: 4
Publication status: Published - 1989
