What role should evaluation play in the development of natural language generation (NLG) techniques and systems? In this paper we describe what is involved in natural language generation, and survey how evaluation has figured in work in this area to date. We comment on the issues raised by this existing work and on how the problems of NLG evaluation differ from the problems of evaluating work in natural language understanding. We conclude by suggesting a way forward that looks more closely at the component problems addressed in natural language generation research: we examine a particular text generation application and the issues raised in assessing its performance along a variety of dimensions.
|Number of pages||25|
|Journal||Computer Speech and Language|
|Publication status||Published - Oct 1998|