Abstract
A basic signed language (SL) corpus is created through primary processing of video recordings using multimedia annotation software. Primary processing entails the tokenization and identification of SL units. For the purposes of linguistic research a corpus also needs secondary processing. Secondary processing entails appending tags for specific linguistic features to primary annotations. I draw on the experience from the Auslan corpus project to describe how primary and secondary processing can be used in corpus-based SL research. In particular, I show how the tier structure of ELAN can be used to tag SL units in a variety of ways, and how this information can be used to glean new information from the corpus which can then be added as new annotations to the corpus. Value-adding by principled and systematic primary and secondary processing of digital recordings is thus not only essential for corpus creation (‘machine-readability’), it also enables further enriching of the corpus so that even more value can be extracted. I conclude by discussing the implications for annotation software and standardized annotation schemas used in the creation of SL corpora.
| Original language | English |
|---|---|
| Title of host publication | Seventh International Conference on Language Resources and Evaluation (LREC 2010) |
| Subtitle of host publication | proceedings |
| Editors | Nicoletta Calzolari, Khalid Choukri, Bente Maegaard, Joseph Mariani, Jan Odijk, Stelios Piperidis, Mike Rosner, Daniel Tapias |
| Place of Publication | Luxemburg |
| Publisher | European Language Resources Association |
| Pages | 137-142 |
| Number of pages | 6 |
| ISBN (Print) | 9782951740860 |
| Publication status | Published - 2010 |
| Event | International Conference on Language Resources and Evaluation (7th : 2010) - Valletta, Malta; Duration: 17 May 2010 → 23 May 2010 |
Conference
| Conference | International Conference on Language Resources and Evaluation (7th : 2010) |
|---|---|
| City | Valletta, Malta |
| Period | 17/05/10 → 23/05/10 |