A correlation-based feature weighting filter for naive Bayes

Liangxiao Jiang, Lungan Zhang, Chaoqun Li, Jia Wu

Research output: Contribution to journal, Article, peer-review

160 Citations (Scopus)


Due to its simplicity, efficiency, and efficacy, naive Bayes (NB) remains one of the top 10 algorithms in the data mining and machine learning community. Among the numerous approaches to alleviating its conditional independence assumption, feature weighting places more emphasis on highly predictive features than on less predictive ones. In this paper, we argue that for NB, highly predictive features should be highly correlated with the class (maximum mutual relevance), yet uncorrelated with other features (minimum mutual redundancy). Based on this premise, we propose a correlation-based feature weighting (CFW) filter for NB. In CFW, the weight for a feature is a sigmoid transformation of the difference between the feature-class correlation (mutual relevance) and the average feature-feature intercorrelation (average mutual redundancy). Experimental results show that NB with CFW significantly outperforms NB and all the existing state-of-the-art feature weighting filters used for comparison. Compared to feature weighting wrappers for improving NB, the main advantages of CFW are its low computational complexity (no search involved) and the fact that it maintains the simplicity of the final model. In addition, we apply CFW to text classification and achieve remarkable improvements.
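The weighting rule described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes discrete features, uses plain empirical mutual information as the correlation measure (the abstract does not specify any normalization), and assumes the common weighted-NB form in which each weight is an exponent on the corresponding conditional probability, with Laplace smoothing. The function names `mutual_information`, `cfw_weights`, and `weighted_nb_predict` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def cfw_weights(columns, labels):
    """Sigmoid of (feature-class relevance - average feature-feature redundancy).

    Assumption: mutual information stands in for "correlation", unnormalized.
    """
    m = len(columns)
    weights = []
    for i in range(m):
        relevance = mutual_information(columns[i], labels)
        others = [mutual_information(columns[i], columns[j])
                  for j in range(m) if j != i]
        redundancy = sum(others) / len(others) if others else 0.0
        weights.append(1.0 / (1.0 + math.exp(-(relevance - redundancy))))
    return weights


def weighted_nb_predict(columns, labels, weights, instance):
    """Pick the class maximizing log P(c) + sum_i w_i * log P(a_i | c).

    Assumption: weights act as exponents; probabilities are Laplace-smoothed.
    """
    n = len(labels)
    classes = sorted(set(labels))
    best, best_score = None, -math.inf
    for c in classes:
        n_c = sum(1 for y in labels if y == c)
        score = math.log((n_c + 1) / (n + len(classes)))
        for col, w, value in zip(columns, weights, instance):
            n_vc = sum(1 for x, y in zip(col, labels) if x == value and y == c)
            score += w * math.log((n_vc + 1) / (n_c + len(set(col))))
        if score > best_score:
            best, best_score = c, score
    return best


# Toy data: f0 copies the class, f1 is independent of both the class and f0.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
f0 = [0, 0, 0, 0, 1, 1, 1, 1]
f1 = [0, 1, 0, 1, 0, 1, 0, 1]
weights = cfw_weights([f0, f1], labels)
prediction = weighted_nb_predict([f0, f1], labels, weights, [1, 0])
```

On this toy data the informative feature `f0` receives a larger weight than the uninformative `f1`, and no search over weight values is needed, which reflects the low computational cost claimed for the filter relative to wrapper methods.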

Original language: English
Article number: 8359364
Pages (from-to): 201-213
Number of pages: 13
Journal: IEEE Transactions on Knowledge and Data Engineering
Issue number: 2
Publication status: Published - 1 Feb 2019


  • Correlation
  • Decision trees
  • Electronic mail
  • Feature extraction
  • feature weighting
  • Mathematical model
  • mutual information
  • mutual redundancy
  • mutual relevance
  • naive Bayes
  • Redundancy
  • Training

