Word boundary detection in broad class and phoneme strings

Jonathan Harrington*, Gordon Watson, Maggie Cooper

*Corresponding author for this work

    Research output: Contribution to journal (Article, peer-reviewed)

    23 Citations (Scopus)

    Abstract

    This paper explores the number of word boundaries which can be detected from sequences of phonemes and broad classes in continuous speech transcriptions. In the first part of the paper, word boundaries are detected from sequences of three phonemes which occur across word boundaries but which are excluded word internally. When such sequences are matched against phonemic transcriptions of 145 utterances, it is shown that around 37% of all word boundaries can be correctly identified. When the same transcriptions are represented by broad classes rather than phonemes, a knowledge of sequences which span word boundaries but which do not occur word internally is almost completely ineffective for the purpose of word boundary detection. Instead, it is shown that a version of the model discussed in Cutler & Norris (1988), based on the distinction between "strong" and "weak" vowels, enables over 40% of word boundaries to be correctly located at the broad class level, although many word boundaries are also inserted at inappropriate points. The implications of these kinds of word boundary detection strategies for models of lexical access in a continuous speech recognizer are also discussed.
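
The three-phoneme strategy described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy lexicon, the phoneme symbols, and the function names `word_internal_trigrams` and `boundary_trigram_positions` are all hypothetical. It also makes a simplifying assumption, namely that any three-phoneme sequence absent from the lexicon's word-internal set must contain a word boundary, which only holds if the lexicon is taken to be exhaustive.

```python
from typing import Dict, List, Set, Tuple

Trigram = Tuple[str, ...]


def word_internal_trigrams(lexicon: Dict[str, List[str]]) -> Set[Trigram]:
    """Collect every three-phoneme sequence found inside a single word."""
    internal: Set[Trigram] = set()
    for phones in lexicon.values():
        for i in range(len(phones) - 2):
            internal.add(tuple(phones[i:i + 3]))
    return internal


def boundary_trigram_positions(phones: List[str], internal: Set[Trigram]) -> List[int]:
    """Return start indices of trigrams that never occur word internally.

    Under the exhaustive-lexicon assumption, each such trigram contains at
    least one word boundary: after its first or after its second phoneme.
    """
    return [
        i
        for i in range(len(phones) - 2)
        if tuple(phones[i:i + 3]) not in internal
    ]


# Hypothetical toy lexicon with made-up phoneme labels.
lexicon = {
    "this": ["dh", "ih", "s"],
    "ship": ["sh", "ih", "p"],
    "sinks": ["s", "ih", "ng", "k", "s"],
}

# Unsegmented transcription of "this ship sinks".
utterance = ["dh", "ih", "s", "sh", "ih", "p", "s", "ih", "ng", "k", "s"]

internal = word_internal_trigrams(lexicon)
flagged = boundary_trigram_positions(utterance, internal)
print(flagged)
```

In this toy case the flagged trigrams are exactly those straddling the two true boundaries. Each flagged trigram only narrows a boundary down to one of two positions, so a full detector would need a further step to choose between them.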

    Original language: English
    Pages (from-to): 367-382
    Number of pages: 16
    Journal: Computer Speech and Language
    Volume: 3
    Issue number: 4
    Publication status: Published - 1989
