Abstract
The resolution of lexical ambiguity in machine translation systems often involves the automated, on-line selection of the correct sense of polysemous target words in the context of a clause, phrase or sentence. However, machine translation systems have not yet emulated this aspect of human language processing well enough for the resolution of entities and terms in natural language to be automated for open source intelligence analysis. Whilst some systems confine themselves, with some success, to processing domain-specific knowledge (e.g., medical terminology), this study investigates the popular general-purpose direct translation systems now freely available on the World Wide Web (WWW) for characteristic semantic processing errors. A ubiquitous sentence ("The quick brown fox jumps over the lazy dog"), an equative metaphor, and a simile are translated into four Romance languages and one Germanic language, and each translation is then translated back into English using the same translation system. It is found that, in addition to expected differences in mapping shades of meaning (e.g., "quick" is mapped to "fast"), some spatial meanings are incorrectly transformed, especially for verbs (e.g., "jumps over" becomes "branches over" or "jumps on"). The most serious error is the addition of extra semantic features to individual words, particularly features associated with nouns (e.g., the gender-neutral "fox" becomes the female "vixen"). The implications of these types of errors for the automatic translation of human language, with respect to semantic representation in open source intelligence, are discussed.
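The round-trip comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' procedure: the helper name `word_substitutions` is hypothetical, no translation service is called, and the round-trip sentence is an illustrative composite assembled from the substitutions quoted in the abstract rather than an output reported in the paper. The sketch only shows how word-level differences between an original sentence and its back-translation might be listed, using Python's standard `difflib` module.

```python
from difflib import SequenceMatcher


def word_substitutions(original, round_trip):
    """List (original_words, round_trip_words) pairs where the sentences differ.

    Hypothetical helper: a plain word-level diff, not a semantic analysis.
    """
    src = original.lower().split()
    out = round_trip.lower().split()
    matcher = SequenceMatcher(a=src, b=out, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep replacements, insertions and deletions
            changes.append((" ".join(src[i1:i2]), " ".join(out[j1:j2])))
    return changes


original = "The quick brown fox jumps over the lazy dog"
# Assumption: an illustrative composite built from the examples quoted in the
# abstract ("quick" -> "fast", "fox" -> "vixen", "jumps over" -> "jumps on").
round_trip = "The fast brown vixen jumps on the lazy dog"

changes = word_substitutions(original, round_trip)
print(changes)
# [('quick', 'fast'), ('fox', 'vixen'), ('over', 'on')]
```

A diff of this kind only flags where the surface wording changed. Deciding whether a change is a harmless shade of meaning ("quick" to "fast") or an added semantic feature ("fox" to "vixen") still requires the kind of manual analysis the study reports.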
Original language | English |
---|---|
Title of host publication | Proceedings - 2012 3rd Cybercrime and Trustworthy Computing Workshop, CTC 2012 |
Publisher | Institute of Electrical and Electronics Engineers (IEEE) |
Pages | 14-18 |
Number of pages | 5 |
ISBN (Print) | 9780769549408 |
Publication status | Published - 2013 |
Externally published | Yes |
Event | 2012 3rd Cybercrime and Trustworthy Computing Workshop, CTC 2012 - Ballarat, VIC, Australia; Duration: 29 Oct 2012 → 30 Oct 2012 |
Keywords
- metaphor
- open source intelligence
- polysemy