Data-dependent Rectangular Bounding Processes

Xuhui Fan, Bin Li*, Prosha A. Rahman, Scott A. Sisson

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Stochastic partition processes divide a multi-dimensional space into a number of regions, such that the data within each region exhibit some form of homogeneity. Due to the nature of their partition strategies, partition processes can often create many unnecessary divisions in sparse regions when trying to describe data in dense regions. To avoid this problem we introduce a parsimonious partition model – the Rectangular Bounding Process (RBP) – to efficiently partition multi-dimensional spaces, by employing a bounding strategy to enclose data points within rectangular bounding boxes. The RBP is self-consistent and as such can be directly extended from a finite hypercube to an infinite (unbounded) space. We extend the RBP to establish a data-dependent RBP (data-RBP) to generate bounding boxes only over existing data points in a sequential manner, which can effectively reduce model complexity and enable online learning. To achieve this, we design an alternative way to generate bounding boxes and prove the distributional equivalence between the data-RBP and the RBP when empty boxes are removed. We demonstrate application of the RBP and the data-RBP in three scenarios: regression trees, relational modelling, and random feature construction for online learning. Extensive experimental results validate the performance of the RBP and the data-RBP for both accuracy and efficiency.
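
The idea of keeping only boxes that enclose data can be pictured with a small sketch. It is an illustration only, not the RBP or data-RBP generative process: the uniform sampling of box endpoints and the helper names `sample_boxes`, `contains`, and `drop_empty_boxes` are assumptions made for this example and do not come from the paper.

```python
import random


def sample_boxes(num_boxes, dim, rng):
    """Draw axis-aligned boxes inside the unit hypercube [0, 1]^dim.

    Each box is a list of (low, high) intervals, one per dimension.
    Uniform endpoints are an illustrative assumption, not the RBP's law.
    """
    boxes = []
    for _ in range(num_boxes):
        box = []
        for _ in range(dim):
            a, b = rng.random(), rng.random()
            box.append((min(a, b), max(a, b)))
        boxes.append(box)
    return boxes


def contains(box, point):
    """Return True if the point lies inside the box in every dimension."""
    return all(low <= x <= high for (low, high), x in zip(box, point))


def drop_empty_boxes(boxes, points):
    """Keep only the boxes that enclose at least one data point."""
    return [box for box in boxes if any(contains(box, p) for p in points)]


rng = random.Random(0)
points = [[rng.random() for _ in range(2)] for _ in range(20)]
boxes = sample_boxes(10, 2, rng)
kept = drop_empty_boxes(boxes, points)
print(f"{len(kept)} of {len(boxes)} boxes enclose at least one point")
```

Filtering after the fact, as done here, generates every box first and then discards the empty ones. The abstract describes the data-RBP as instead generating boxes only over existing data points, sequentially, and proves that the two routes agree in distribution.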

Original language: English
Number of pages: 18
Journal: IEEE Transactions on Pattern Analysis and Machine Intelligence
Early online date: 6 Oct 2025
Publication status: E-pub ahead of print - 6 Oct 2025
