From archive to corpus: transcription and annotation in the creation of signed language corpora

Trevor Johnston*

*Corresponding author for this work

    Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

    1 Citation (Scopus)

    Abstract

    The essential characteristic of a signed language corpus is that it has been annotated, and not, contrary to the practice of many signed language researchers, that it has been transcribed. Annotations are necessary for corpus-based investigations of signed or spoken languages. Multi-media annotation software can now be used to transform a recording into a machine-readable text without it first being necessary to transcribe the text, provided that linguistic units are uniquely identified and annotations subsequently appended to these units. These unique identifiers are here referred to as ID-glosses. The use of ID-glosses is only possible if a reference lexical database (i.e., dictionary) exists as the result of prior foundation research into the lexicon. In short, the creators of signed language corpora should prioritize annotation above transcription, and ensure that signs are identified using unique gloss-based annotations. Without this the whole rationale for corpus-creation is undermined.

    Original language: English
    Title of host publication: Proceedings of the 22nd Pacific Asia Conference on Language, Information and Computation, PACLIC 22
    Place of Publication: Manila
    Publisher: De la Salle University
    Pages: 16-29
    Number of pages: 14
    Publication status: Published - 2008
    Event: 22nd Pacific Asia Conference on Language, Information and Computation, PACLIC 22 - Cebu, Philippines
    Duration: 20 Nov 2008 to 22 Nov 2008

    Other

    Other: 22nd Pacific Asia Conference on Language, Information and Computation, PACLIC 22
    Country/Territory: Philippines
    City: Cebu
    Period: 20/11/08 to 22/11/08

    Keywords

    • Annotation
    • Auslan (Australian Sign Language)
    • Corpora
    • Corpus linguistics
    • Language documentation
    • Sign language
