Active learning and the Irish treebank

Teresa Lynn, Jennifer Foster, Mark Dras, Elaine Uí Dhonnchadha

Research output: Contribution to journal › Conference paper › peer-review

Abstract

We report on our ongoing work in developing the Irish Dependency Treebank, describe the results of two Inter-annotator Agreement (IAA) studies, demonstrate improvements in annotation consistency which have a knock-on effect on parsing accuracy, and present the final set of dependency labels. We then go on to investigate the extent to which active learning can play a role in treebank and parser development by comparing an active learning bootstrapping approach to a passive approach in which sentences are chosen at random for manual revision. We show that active learning outperforms passive learning, but when annotation effort is taken into account, it is not clear how much of an advantage the active learning approach has. Finally, we present results which suggest that adding automatic parses to the training data along with manually revised parses in an active learning setup does not greatly affect parsing accuracy.
Original language: English
Pages (from-to): 23-32
Number of pages: 10
Journal: Proceedings of the Australasian Language Technology Association Workshop 2012 : ALTA 2012
Volume: 10
Publication status: Published - 2012
Event: Australasian Language Technology Workshop (10th : 2012) - Dunedin, New Zealand
Duration: 4 Dec 2012 to 6 Dec 2012
