Abstract
Near-synonyms are words that mean approximately the same thing, and which tend to be assigned to the same leaf in ontologies such as WordNet. However, they can differ from each other subtly in both meaning and usage (consider the pair of near-synonyms frugal and stingy), and therefore choosing the appropriate near-synonym for a given context is not a trivial problem. Early work on near-synonyms was that of Edmonds (1997), who reported an experiment attempting to predict which of a set of near-synonyms would be used in a given context using lexical co-occurrence networks. The conclusion of this work was that corpus statistics approaches did not appear to work well for this type of problem, and it led instead to the development of machine learning approaches over lexical resources such as Choose the Right Word (Hayakawa, 1994). Our hypothesis is that some kind of corpus statistics approach may still be effective in some situations, particularly if the near-synonyms differ in sentiment from each other. Intuition based on work in sentiment analysis suggests that if the distribution of words embodying some characteristic of sentiment can predict the overall sentiment or attitude of a document, perhaps these same words can predict the choice of an individual ‘attitudinal’ near-synonym given its context, while this is not necessarily true for other types of near-synonym. This would again open up problems involving this type of near-synonym to corpus statistics methods. As a first step, then, we investigate whether attitudinal near-synonyms are more likely to be successfully predicted by a corpus statistics method than other types. In this paper we present a larger-scale experiment based on Edmonds (1997), and show that attitudinal near-synonyms can in fact be predicted more accurately using corpus statistics methods.
Original language | English |
---|---|
Title of host publication | PACLING '07 |
Subtitle of host publication | proceedings of the conference Pacific Association for Computational Linguistics ; 19-21 September 2007 University of Melbourne, Melbourne, Australia |
Place of Publication | Melbourne |
Publisher | Pacific Association for Computational Linguistics |
Pages | 31-39 |
Number of pages | 9 |
Publication status | Published - 2007 |
Event | Conference of the Pacific Association for Computational Linguistics (10th : 2007) - Melbourne; Duration: 19 Sept 2007 → 21 Sept 2007 |
Conference
Conference | Conference of the Pacific Association for Computational Linguistics (10th : 2007) |
---|---|
City | Melbourne |
Period | 19/09/07 → 21/09/07 |