Building an audio-visual corpus of Australian English

large corpus collection with an economical portable and replicable Black Box

Denis Burnham*, Dominique Estival, Steven Fazio, Jette Viethen, Felicity Cox, Robert Dale, Steve Cassidy, Julien Epps, Roberto Togneri, Michael Wagner, Yuko Kinoshita, Roland Goecke, Joanne Arciuli, Marc Onslow, Trent Lewis, Andy Butcher, John Hajek

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution

26 Citations (Scopus)

Abstract

The Big Australian Speech Corpus project incorporates the strategic goals of 30 Chief Investigators from various speech science areas. Speech from 1000 geographically and socially diverse speakers is being recorded using a uniform and automated protocol plus standardized hardware and software to produce a widely applicable and extensible database - AusTalk. Here we describe the project's major components and organization; share the lessons learnt from difficulties and challenges; and present the results achieved so far.

Original language: English
Title of host publication: 12th annual conference of the international speech communication association 2011 (interspeech 2011), Vols 1-5
Editors: Piero Cosi, Renato De Mori, Giuseppe Di Fabbrizio, Roberto Pieraccini
Place of Publication: Baixas, France
Publisher: ISCA-INT SPEECH COMMUNICATION ASSOC
Pages: 848-851
Number of pages: 4
ISBN (Print): 9781618392701
Publication status: Published - 2011
Event: 12th Annual Conference of the International-Speech-Communication-Association 2011 (INTERSPEECH 2011) - Florence, Italy
Duration: 27 Aug 2011 to 31 Aug 2011

Conference

Conference: 12th Annual Conference of the International-Speech-Communication-Association 2011 (INTERSPEECH 2011)
Country: Italy
City: Florence
Period: 27/08/11 to 31/08/11

Keywords

  • speech corpus
  • AV data
  • Australian English

Cite this

Burnham, D., Estival, D., Fazio, S., Viethen, J., Cox, F., Dale, R., ... Hajek, J. (2011). Building an audio-visual corpus of Australian English: large corpus collection with an economical portable and replicable Black Box. In P. Cosi, R. De Mori, G. Di Fabbrizio, & R. Pieraccini (Eds.), 12th annual conference of the international speech communication association 2011 (interspeech 2011), Vols 1-5 (pp. 848-851). Baixas, France: ISCA-INT SPEECH COMMUNICATION ASSOC.