The Australian National Corpus: national infrastructure for language resources

Steve Cassidy, Michael Haugh, Pam Peters, Mark Fallu

Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution > peer-review

6 Citations (Scopus)

Abstract

The Australian National Corpus has been established in an effort to make currently scattered and relatively inaccessible data available to researchers through an online portal. In contrast to other national corpora, it is conceptualised as a linked collection of many existing and future language resources representing language use in Australia, unified through common technical standards. This approach allows us to bootstrap a significant collection and add value to existing resources by providing a unified, online tool-set to support research in a number of disciplines. This paper provides an outline of the technical platform being developed to support the corpus and a brief overview of some of the collections that form part of the initial version of the Australian National Corpus.
Original language: English
Title of host publication: Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Editors: Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis
Publisher: European Language Resources Association (ELRA)
Pages: 3295-3299
Number of pages: 5
ISBN (Print): 9782951740877
Publication status: Published - 2012
Event: International Conference on Language Resources and Evaluation (8th : 2012) - Istanbul, Turkey
Duration: 23 May 2012 to 25 May 2012

Keywords

  • national corpus
  • annotation
  • meta-data
