Sociophonetics of popular music: insights from corpus analysis and speech perception experiments

Research output: Thesis › Doctoral Thesis


This thesis examines the flexibility and context-sensitivity of speech perception by looking at a domain not often explored in the study of language cognition: popular music. Three empirical studies are presented. The first examines the current state of sociolinguistic variation in commercial popular music, while the second and third explore everyday listeners’ perception of language in musical and non-musical contexts. The foundational assumption of the thesis is that the use of ‘American English’ in song is automatic for New Zealand singers, and constitutes a responsive style that is both accurate and consistent. The use of New Zealand English in song, by contrast, is stylised, involving an initiative act of identity and requiring effort and awareness. This will be discussed in Chapter 1, where I also introduce the term Standard Popular Music Singing Style (SPMSS) to refer to the US English-derived phonetic style dominant in popular song.
The first empirical study will be presented in Chapter 2. Based on a systematically selected corpus of commercial pop and hip hop from NZ and the USA, an analysis of non-prevocalic and linking /r/ and of the vowels of the BATH, LOT and GOAT lexical sets confirms that SPMSS is highly normative in NZ music. Most pop singers closely follow US patterns, while several hip hop artists display elements of New Zealand English. This reflects the value placed on authenticity in hip hop, and also interacts with ethnicity, showing the use of different authentication practices by Pākehā (NZ European) and Māori/Pasifika artists. By looking at co-variation amongst the variables, I explore both the apparent identity goals of the artists, and the relative salience of the variables. Chapters 3 and 4 use the results of the corpus analysis to explore how the dominance of SPMSS affects speech processing.
The first of the two perception experiments is a phonetic categorisation task. Listeners decide whether they hear the word bed or bad, with the stimuli either set to music or presented in one of two non-musical control conditions. The stimuli are on a resynthesised continuum between the DRESS and TRAP vowels, passing through an F1 space where the vowel is ambiguous and could be perceived either as a spoken NZE TRAP or as a sung DRESS. When the stimuli are set to music, the NZ listeners perceive the vowel according to expectations of SPMSS (i.e. expecting US-derived vowel qualities). The second perception experiment is a lexical decision task that uses the natural speech of an NZ and a US speaker, once again in musical and non-musical conditions. Participants’ processing of the US voice is facilitated in the music condition, with reaction times becoming faster than those to their native dialect.
Bringing the results of the corpus and perception studies together, this thesis shows that SPMSS is highly normative in NZ popular music not just for performers, but also in the minds of the general music-listening public. I argue that many New Zealanders are bidialectal, with native-like knowledge of SPMSS. Speech and song are two highly distinct and perceptually contrastive contexts of language use. By differing from conversational language across a range of perceptual and cognitive dimensions, language heard or produced in song is likely to encode and activate a distinct subset of auditory memories. The contextual specificity of such networks may then allow for the abstraction of an independent sub-system of sociophonetic knowledge specific to the musical context.
Original language: English
Qualification: Doctor of Philosophy
Awarding Institution:
  • University of Canterbury
Supervisors:
  • Hay, Jennifer, Supervisor, External person
  • Clark, Lynn, Supervisor, External person
  • Theys, Catherine, Supervisor, External person
Award date: 9 Apr 2020
Publication status: Unpublished - 2019
Externally published: Yes

