In this paper we describe machine learning experiments that aim to characterise the content selection process for distinguishing descriptions. Our experiments are based on two large corpora of human-produced descriptions of objects in relatively small visual scenes; the referring expressions are annotated with their semantic content. The visual context of reference is widely considered to be a primary determinant of content in referring expression generation, so we explore whether a model can be trained to predict the collection of descriptive attributes that should be used in a given situation. Our experiments demonstrate that speaker-specific preferences play a much more important role than existing approaches to referring expression generation acknowledge.
Australasian Language Technology Workshop 2010 : proceedings of the workshop
Published - 2010
Australasian Language Technology Workshop (8th : 2010) - Melbourne, Australia
Duration: 9 Dec 2010 → 10 Dec 2010