Abstract
This paper describes a stemming technique that depends principally on a target language's lexicon, organised as an automaton of word strings. The clear separation between the lexicon and the procedure itself allows the stemmer to be customised for any language with few or even no changes to the program's source code. An implementation of the stemmer with a medium-sized Portuguese lexicon is evaluated using Paice's [16] evaluation method.
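To make the idea of a lexicon held apart from the stemming procedure concrete, the following is a minimal sketch and not the paper's implementation. It assumes the lexicon can be approximated by a character trie (a simple acyclic automaton) and that stemming can be approximated by a longest-prefix lookup; the abstract does not specify either detail. The names `LexiconTrie`, `longest_prefix`, and `stem`, and the toy Portuguese entries in the usage note, are hypothetical.

```python
from typing import Dict, Iterable, Optional

# Sentinel key marking "a lexicon entry ends here". It is longer than one
# character, so it cannot collide with a single-character transition key.
END = "$end"


class LexiconTrie:
    """Toy lexicon stored as a character trie (an acyclic automaton).

    Assumption: the paper's automaton is richer than this; the trie only
    illustrates keeping language data apart from the stemming procedure.
    """

    def __init__(self, stems: Iterable[str]) -> None:
        self.root: Dict = {}
        for entry in stems:
            node = self.root
            for ch in entry:
                node = node.setdefault(ch, {})
            node[END] = True

    def longest_prefix(self, word: str) -> Optional[str]:
        """Return the longest lexicon entry that is a prefix of `word`, if any."""
        node = self.root
        best = None
        for i, ch in enumerate(word):
            if ch not in node:
                break
            node = node[ch]
            if END in node:
                best = word[: i + 1]
        return best


def stem(word: str, lexicon: LexiconTrie) -> str:
    """Language-independent procedure: all language knowledge lives in `lexicon`.

    Falls back to the lowercased word when no lexicon entry matches.
    """
    normalised = word.lower()
    found = lexicon.longest_prefix(normalised)
    return found if found is not None else normalised
```

With a toy lexicon such as `LexiconTrie(["cant", "livr"])`, calling `stem("cantando", lexicon)` yields `"cant"` and `stem("livros", lexicon)` yields `"livr"`, while a word with no matching entry is returned unchanged in lowercase. Switching to another language in this sketch means supplying a different list of entries; the `stem` function itself is untouched.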
Original language | English |
---|---|
Pages (from-to) | 266-276 |
Number of pages | 11 |
Journal | Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics) |
Volume | 2857 |
Publication status | Published - 2003 |