A novel consistent random forest framework

Bernoulli random forests

Yisen Wang, Shu-Tao Xia, Qingtao Tang, Jia Wu, Xingquan Zhu

Research output: Contribution to journal › Article

13 Citations (Scopus)

Abstract

Random forests (RFs) are recognized as a type of ensemble learning method and are effective for most classification and regression tasks. Despite their impressive empirical performance, the theory of RFs has not yet been fully established. Several theoretically guaranteed RF variants have been presented, but their poor practical performance has been criticized. In this paper, a novel RF framework named Bernoulli RFs (BRFs) is proposed, with the aim of resolving the RF dilemma between theoretical consistency and empirical performance. In contrast to the RFs proposed by Breiman, BRF uses two independent Bernoulli distributions to simplify tree construction. The two Bernoulli distributions separately control the splitting-feature and splitting-point selection processes of tree construction. Consequently, theoretical consistency is ensured in BRF, i.e., convergence of the learning performance to the optimum is guaranteed as the amount of data grows to infinity. Importantly, the proposed BRF is consistent for both classification and regression. BRF achieves the best empirical performance when compared with state-of-the-art theoretically consistent RFs. The theoretical and experimental studies in this paper verify this advance in RF research toward closing the gap between theory and practice.
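The split-selection mechanism described in the abstract can be illustrated with a minimal sketch. Note that this is an assumption-laden toy illustration, not the paper's actual algorithm: the function names, the Gini criterion, and the probabilities `p1` and `p2` are all placeholders chosen for the example. The idea shown is only the high-level one stated in the abstract: one Bernoulli trial decides whether the splitting feature is chosen at random or by impurity, and a second, independent Bernoulli trial does the same for the splitting point.

```python
import random

def gini(labels):
    # Gini impurity of a list of class labels.
    n = len(labels)
    if n == 0:
        return 0.0
    counts = {}
    for label in labels:
        counts[label] = counts.get(label, 0) + 1
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

def split_impurity(X, y, f, t):
    # Size-weighted Gini impurity after splitting on feature f at threshold t.
    left = [yi for xi, yi in zip(X, y) if xi[f] <= t]
    right = [yi for xi, yi in zip(X, y) if xi[f] > t]
    n = len(y)
    return len(left) / n * gini(left) + len(right) / n * gini(right)

def bernoulli_split(X, y, p1=0.05, p2=0.05, rng=random):
    """Toy Bernoulli-controlled split selection (p1, p2 are illustrative).

    With probability p1 the splitting feature is drawn uniformly at random,
    otherwise it is the impurity-minimizing feature; independently, with
    probability p2 the splitting point is drawn at random, otherwise it is
    the impurity-minimizing threshold.
    """
    d = len(X[0])

    def candidates(f):
        # Candidate thresholds: midpoints between consecutive unique values.
        vals = sorted(set(row[f] for row in X))
        return [(a + b) / 2 for a, b in zip(vals, vals[1:])]

    if rng.random() < p1:
        feature = rng.randrange(d)  # Bernoulli trial 1: random feature
    else:
        feature = min(
            range(d),
            key=lambda f: min(
                (split_impurity(X, y, f, t) for t in candidates(f)),
                default=float("inf"),
            ),
        )

    cand = candidates(feature)
    if not cand:
        return feature, X[0][feature]
    if rng.random() < p2:
        threshold = rng.choice(cand)  # Bernoulli trial 2: random threshold
    else:
        threshold = min(cand, key=lambda t: split_impurity(X, y, feature, t))
    return feature, threshold
```

Because the random choices occur with nonzero probability at every node, every feature and split point retains a chance of being selected, which is the kind of property consistency arguments for RF variants typically rely on; with small `p1` and `p2`, most splits remain impurity-driven, preserving empirical performance.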

Original language: English
Pages (from-to): 3510-3523
Number of pages: 14
Journal: IEEE Transactions on Neural Networks and Learning Systems
Volume: 29
Issue number: 8
Early online date: 15 Aug 2017
Publication status: Published - Aug 2018

Keywords

  • random forests (RFs)
  • classification
  • consistency
  • regression
