Audio-visual influences on speech perception: a comparison of sung and spoken conditions

Lena Quinto, William F. Thompson, Frank A. Russo

    Research output: Contribution to conference (Abstract)


    Visual cues inform speech perception and may also influence the perception of lyrics in song. The importance of visual cues to speech perception is seen in the McGurk effect, in which pairing incongruent auditory and visual stimuli can produce the percept of an intermediate syllable; for example, a visual ga paired with an auditory ba can be perceived as da. Music and language share common features such as prosodic and rhythmic cues, but speech is specialized for verbal communication whereas music is not. It was therefore unclear whether a McGurk effect would occur for sung materials. Twenty-nine participants were presented with sequences of syllables (la-la-la-ba, la-la-la-ga) that were spoken or sung to a steady beat. Sung stimuli were ascending triads that returned either to the original tonic or to a semitone above the tonic. Incongruent stimuli were created by pairing an auditory ba with a visual ga. Signal-to-noise ratio (SNR) was manipulated to assess a possible trade-off between auditory and visual signals. The signal level was held at 60 dB SPL in all conditions, and SNR was set to 60 dB (easy), 0 dB (moderate), or -12 dB (difficult). Participants identified the final syllable they heard from six alternatives: ba, da, ga, la, tha, and va. Syllable judgments relied more heavily on visual cues as SNR decreased. When data were analyzed for spoken and sung stimuli separately, a congruency (McGurk) effect was observed in both domains. This was qualified by a domain × congruency interaction, indicating that the nature of the McGurk effect differed between the spoken and sung domains. In particular, the influence of visual cues was greater when syllables were sung than when they were spoken, especially if the sung sequence ended on the (unexpected) raised tonic. The findings confirm that cross-modal integration of syllables occurs for sung materials, and suggest that visual cues may be especially important for deciphering lyrics.
    Original language: English
    Number of pages: 1
    Publication status: Published - 2007
    Event: International Conference on Music Communication Science (ICOMCS) - Sydney
    Duration: 5 Dec 2007 to 7 Dec 2007



