From treebank conversion to automatic dependency parsing for Vietnamese

Dat Quoc Nguyen, Dai Quoc Nguyen, Son Bao Pham, Phuong-Thai Nguyen, Minh Le Nguyen

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution

17 Citations (Scopus)

Abstract

This paper presents a new conversion method that automatically transforms a constituent-based Vietnamese treebank into dependency trees. On a dependency treebank created with this approach, we evaluate two state-of-the-art dependency parsers: the MSTParser and the MaltParser. Experiments show that the MSTParser outperforms the MaltParser. To the best of our knowledge, we report the highest performance published to date for Vietnamese dependency parsing. In particular, with gold-standard POS tags, we obtain an unlabeled attachment score of 79.08% and a labeled attachment score of 71.66%.
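For readers unfamiliar with the two metrics quoted in the abstract, the following is a minimal sketch of how unlabeled and labeled attachment scores are commonly computed. It is not the authors' evaluation script: the function name `attachment_scores`, the `(head, label)` token representation, and the dependency labels in the toy sentence are all illustrative assumptions, and details such as punctuation handling may differ from the paper's setup.

```python
from typing import List, Tuple

# One token's analysis: (index of its head, dependency label).
# Head index 0 conventionally denotes the artificial root.
Arc = Tuple[int, str]


def attachment_scores(gold: List[Arc], predicted: List[Arc]) -> Tuple[float, float]:
    """Return (UAS, LAS) as percentages over all tokens.

    UAS counts tokens whose predicted head matches the gold head.
    LAS counts tokens whose predicted head AND label both match.
    """
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must cover the same tokens")
    if not gold:
        raise ValueError("cannot score an empty token sequence")

    total = len(gold)
    correct_heads = sum(g[0] == p[0] for g, p in zip(gold, predicted))
    correct_arcs = sum(g == p for g, p in zip(gold, predicted))
    return 100.0 * correct_heads / total, 100.0 * correct_arcs / total


# Toy four-token sentence with hypothetical labels (not the paper's label set).
gold = [(2, "nsubj"), (0, "root"), (2, "dobj"), (3, "nmod")]
pred = [(2, "nsubj"), (0, "root"), (2, "nmod"), (2, "nmod")]

uas, las = attachment_scores(gold, pred)
print(f"UAS = {uas:.2f}%, LAS = {las:.2f}%")
```

In the toy sentence three of four heads are correct but only two tokens have both the right head and the right label, which illustrates why the labeled score can never exceed the unlabeled one.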

Original language: English
Title of host publication: Natural language processing and information systems
Subtitle of host publication: 19th International Conference on Applications of Natural Language to Information Systems, NLDB 2014, proceedings
Editors: Elisabeth Métais, Mathieu Roche, Maguelonne Teisseire
Place of Publication: Switzerland
Publisher: Springer, Springer Nature
Pages: 196-207
Number of pages: 12
ISBN (Electronic): 9783319079837
ISBN (Print): 9783319079820
Publication status: Published - 2014
Externally published: Yes
Event: 19th International Conference on Applications of Natural Language to Information Systems, NLDB 2014 - Montpellier, France
Duration: 18 Jun 2014 → 20 Jun 2014

Publication series

Name: Lecture Notes in Computer Science
Publisher: Springer International Publishing
Volume: 8455
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 19th International Conference on Applications of Natural Language to Information Systems, NLDB 2014
Country: France
City: Montpellier
Period: 18/06/14 → 20/06/14
