GLEU: automatic evaluation of sentence-level fluency

Andrew Mutton*, Mark Dras, Stephen Wan, Robert Dale

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

45 Citations (Scopus)


In evaluating the output of language technology applications (MT, natural language generation, summarisation), automatic evaluation techniques generally conflate the measurement of faithfulness to source content with the fluency of the resulting text. In this paper we develop an automatic evaluation metric to estimate fluency alone, by examining the use of parser outputs as metrics, and show that these correlate with human judgements of generated text fluency. We then develop a machine learner based on these metrics, and show that it performs better than the individual parser metrics, approaching a lower bound on human performance. We finally look at different language models for generating sentences, and show that while individual parser metrics can be 'fooled' depending on the generation method, the machine learner provides a consistent estimator of fluency.
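The approach the abstract describes, combining several parser-derived scores into a single learned fluency estimator, can be sketched roughly as follows. This is an illustrative toy only: the feature values, the `fluency` function, and the use of hand-rolled logistic regression are assumptions for demonstration, not the paper's actual parsers, features, or learning algorithm.

```python
# Hypothetical sketch: fusing per-sentence scores from several parsers into one
# fluency estimate via logistic regression (NOT the paper's actual method).
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(X, y, epochs=2000, lr=0.5):
    """Stochastic gradient descent for logistic regression.

    X: list of feature rows (one row of parser scores per sentence).
    y: list of labels, 1 = human-judged fluent, 0 = disfluent.
    Returns learned weights and bias.
    """
    w = [0.0] * len(X[0])
    b = 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            p = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b)
            err = p - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

# Toy data: each row holds invented, normalised scores from three parsers
# for one sentence; labels are invented human fluency judgements.
X = [[0.90, 0.80, 0.85], [0.88, 0.90, 0.80],
     [0.20, 0.30, 0.25], [0.10, 0.20, 0.30]]
y = [1, 1, 0, 0]
w, b = train(X, y)

def fluency(parser_scores):
    """Combined fluency estimate in (0, 1) for one sentence's parser scores."""
    return sigmoid(sum(wj * s for wj, s in zip(w, parser_scores)) + b)
```

The point of the combination, as the abstract notes, is robustness: a single parser metric can be fooled by a particular generation method, whereas a learner trained over several metrics weights them against human judgements.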

Original language: English
Title of host publication: ACL 2007 - Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics
Place of Publication: Stroudsburg, PA
Publisher: Association for Computational Linguistics (ACL)
Number of pages: 8
ISBN (Print): 9781932432862
Publication status: Published - 2007
Event: 45th Annual Meeting of the Association for Computational Linguistics, ACL 2007 - Prague, Czech Republic
Duration: 23 Jun 2007 – 30 Jun 2007


Other: 45th Annual Meeting of the Association for Computational Linguistics, ACL 2007
Country/Territory: Czech Republic


