BayesTH-MCRDR algorithm for automatic classification of Web document

Woo Chul Cho*, Debbie Richards

*Corresponding author for this work

Research output: Contribution to journal (Article)

1 Citation (Scopus)

Abstract

Automated Web document classification is an important means of managing and processing the enormous and constantly growing volume of Web documents in digital form. Document classification has recently been addressed with various classification techniques such as naïve Bayesian, TFIDF (Term Frequency Inverse Document Frequency), FCA (Formal Concept Analysis) and MCRDR (Multiple Classification Ripple Down Rules). In this paper we propose the BayesTH-MCRDR algorithm for Web document classification, a composite algorithm that combines a naïve Bayesian algorithm using a threshold with the MCRDR algorithm. The prominent feature of the BayesTH-MCRDR algorithm is that it optimises the initial relationship between keywords before final assignment to a category, in order to achieve higher document classification accuracy. We also present the system we have developed to demonstrate and compare a number of classification techniques.
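For orientation only, here is a minimal sketch of a naïve Bayesian text classifier with a confidence threshold. It is not the BayesTH-MCRDR algorithm: the abstract does not specify how the threshold is applied or how the result is combined with MCRDR. The sketch assumes the threshold is a cut-off on the normalised posterior probability, and that documents falling below it are left unassigned (for example, to be handled by a later rule-based stage). The function names, the toy training data and the default value 0.7 are all hypothetical.

```python
import math
from collections import Counter, defaultdict

def train(docs):
    """Collect counts for multinomial naive Bayes from (tokens, category) pairs."""
    cat_docs = Counter()                 # number of documents per category
    word_counts = defaultdict(Counter)   # word frequencies per category
    vocab = set()
    for tokens, cat in docs:
        cat_docs[cat] += 1
        word_counts[cat].update(tokens)
        vocab.update(tokens)
    return cat_docs, word_counts, vocab

def posteriors(model, tokens):
    """Return normalised posterior probabilities P(category | tokens)."""
    cat_docs, word_counts, vocab = model
    total_docs = sum(cat_docs.values())
    log_scores = {}
    for cat, n_docs in cat_docs.items():
        total_words = sum(word_counts[cat].values())
        score = math.log(n_docs / total_docs)          # log prior
        for t in tokens:
            if t in vocab:                             # skip words never seen in training
                # Laplace (add-one) smoothing avoids zero probabilities
                score += math.log((word_counts[cat][t] + 1) / (total_words + len(vocab)))
        log_scores[cat] = score
    top = max(log_scores.values())
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    z = sum(unnorm.values())
    return {c: v / z for c, v in unnorm.items()}

def classify_with_threshold(model, tokens, threshold=0.7):
    """Assumed threshold rule: return the best category only if its
    posterior reaches the threshold, otherwise None (unassigned)."""
    post = posteriors(model, tokens)
    best = max(post, key=post.get)
    return best if post[best] >= threshold else None

# Toy, made-up training data for illustration only.
model = train([
    (["football", "goal", "match"], "sport"),
    (["goal", "team", "league"], "sport"),
    (["stock", "market", "shares"], "finance"),
    (["market", "bank", "shares"], "finance"),
])

confident = classify_with_threshold(model, ["goal", "match"])   # "sport"
ambiguous = classify_with_threshold(model, ["goal", "market"])  # None
```

In this toy example the document containing "goal" and "match" is assigned to "sport", while the document mixing "goal" and "market" has equal posteriors for both categories and is left unassigned.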

Original language: English
Pages (from-to): 344-356
Number of pages: 13
Journal: Lecture Notes in Artificial Intelligence (Subseries of Lecture Notes in Computer Science)
Volume: 3339
Publication status: Published - 2004
