Abstract
This paper presents a novel two-stage information filtering model that combines the merits of term-based and pattern-based approaches to effectively filter the sheer volume of available information. In particular, the first filtering stage is supported by a novel rough analysis model which efficiently removes a large number of irrelevant documents, thereby addressing the overload problem. The second filtering stage is empowered by a semantically rich pattern taxonomy mining model which effectively retrieves incoming documents according to the specific information needs of a user, thereby addressing the mismatch problem. Experimental results on the RCV1 corpus show that the proposed two-stage filtering model significantly outperforms other types of "two-stage" information filtering models.
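The two-stage structure described in the abstract can be illustrated with a minimal sketch. This is not the paper's method: the rough analysis model and the pattern taxonomy mining model are not specified here, so `term_score` and `pattern_score` are hypothetical toy stand-ins, and the weights, patterns, and threshold are made-up values. The sketch only shows the control flow, in which a cheap term-based score discards documents first and a pattern-based score then ranks the survivors.

```python
from typing import Dict, FrozenSet, List, Tuple


def term_score(doc: str, weights: Dict[str, float]) -> float:
    """Toy stage-1 score (assumption): sum the weights of distinct terms in the doc."""
    terms = set(doc.lower().split())
    return sum(w for t, w in weights.items() if t in terms)


def pattern_score(doc: str, patterns: Dict[FrozenSet[str], float]) -> float:
    """Toy stage-2 score (assumption): sum the weights of term sets fully contained in the doc."""
    terms = set(doc.lower().split())
    return sum(w for p, w in patterns.items() if p <= terms)


def two_stage_filter(
    docs: List[str],
    weights: Dict[str, float],
    patterns: Dict[FrozenSet[str], float],
    threshold: float,
) -> List[Tuple[str, float]]:
    # Stage 1: drop documents whose term-based score is below the threshold.
    survivors = [d for d in docs if term_score(d, weights) >= threshold]
    # Stage 2: rank the remaining documents by their pattern-based score.
    scored = [(d, pattern_score(d, patterns)) for d in survivors]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    docs = [
        "stock market crash fears",
        "football match result",
        "market report stock prices rise",
    ]
    weights = {"stock": 1.0, "market": 1.0}
    patterns = {frozenset({"stock", "prices"}): 2.0, frozenset({"market"}): 0.5}
    for doc, score in two_stage_filter(docs, weights, patterns, threshold=1.0):
        print(f"{score:.1f}  {doc}")
```

In this toy run the sports document is removed in the first stage because it contains no weighted terms, and the two remaining documents are ordered by how many weighted patterns they fully contain.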
Original language | English |
---|---|
Title of host publication | Proceedings of the 19th ACM international conference on Information and knowledge management |
Place of Publication | New York, N.Y. |
Publisher | ACM |
Pages | 1429-1432 |
Number of pages | 4 |
ISBN (Print) | 9781450300995 |
Publication status | Published - 2010 |
Externally published | Yes |
Event | CIKM '10 : International Conference on Information and Knowledge Management (19th : 2010) - Toronto, ON, Canada Duration: 26 Oct 2010 → 30 Oct 2010 |
Conference
Conference | CIKM '10 : International Conference on Information and Knowledge Management (19th : 2010) |
---|---|
City | Toronto, ON, Canada |
Period | 26/10/10 → 30/10/10 |
Keywords
- decision
- experimentation
- theory