Outlier detection in complex categorical data by modelling the feature value couplings

Guansong Pang, Longbing Cao, Ling Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

Abstract

This paper introduces a novel unsupervised outlier detection method, namely Coupled Biased Random Walks (CBRW), for identifying outliers in categorical data with diversified frequency distributions and many noisy features. Existing pattern-based outlier detection methods are ineffective in handling such complex scenarios, as they misfit such data. CBRW estimates outlier scores of feature values by modelling feature value level couplings, which carry intrinsic data characteristics, via biased random walks to handle this complex data. The outlier scores of feature values can either measure the outlierness of an object or facilitate the existing methods as a feature weighting and selection indicator. Substantial experiments show that CBRW can not only detect outliers in complex data significantly better than the state-of-the-art methods, but also greatly improve the performance of existing methods on data sets with many noisy features.
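
The idea of scoring feature values with a biased random walk over their couplings can be illustrated with a rough sketch. This is not the authors' implementation: the function names `value_scores` and `object_scores`, the bias formula, the damping factor of 0.95, and the plain summation used to score an object are all illustrative assumptions. The sketch treats each (feature, value) pair as a graph node, weights an edge by the conditional co-occurrence frequency multiplied by a rarity bias on the target value, and takes the stationary distribution of the damped walk as the value scores.

```python
from collections import Counter
from itertools import permutations


def value_scores(rows, damping=0.95, iters=200):
    """Score (feature, value) nodes by a damped, biased random walk (sketch)."""
    n_features = len(rows[0])
    # Frequency of each (feature index, value) node.
    freq = Counter((j, row[j]) for row in rows for j in range(n_features))
    # Co-occurrence counts between values of different features.
    co = Counter()
    for row in rows:
        for a, b in permutations(range(n_features), 2):
            co[(a, row[a]), (b, row[b])] += 1
    nodes = list(freq)

    # Assumed bias: values rare relative to their feature's mode score higher.
    # The range is [0.5, 1), so no node ever gets a zero bias.
    mode = Counter()
    for (j, _), count in freq.items():
        mode[j] = max(mode[j], count)
    bias = {v: 1.0 - freq[v] / (2.0 * mode[v[0]]) for v in nodes}

    # Transition u -> v is proportional to bias(v) * P(u | v).
    trans = {}
    for u in nodes:
        weights = {v: bias[v] * co[u, v] / freq[v] for v in nodes if co[u, v]}
        total = sum(weights.values())
        if total:
            trans[u] = {v: w / total for v, w in weights.items()}
        else:
            # Node without co-occurrences (single-feature data): jump uniformly.
            trans[u] = {v: 1.0 / len(nodes) for v in nodes}

    # Power iteration with uniform teleportation.
    pi = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(iters):
        nxt = {v: (1.0 - damping) / len(nodes) for v in nodes}
        for u, out in trans.items():
            for v, p in out.items():
                nxt[v] += damping * pi[u] * p
        pi = nxt
    return pi


def object_scores(rows, scores):
    """Assumed aggregation: an object's score is the sum of its value scores."""
    return [sum(scores[j, value] for j, value in enumerate(row)) for row in rows]
```

With this sketch, a higher score marks a more outlying value or object. The value scores could equally be used to rank or weight features before handing the data to another detector, which is the second use the abstract mentions.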
Original language: English
Title of host publication: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence
Editors: Subbarao Kambhampati
Place of Publication: California
Publisher: International Joint Conferences on Artificial Intelligence
Pages: 1902–1908
Number of pages: 7
ISBN (Electronic): 9781577357704, 9781577357711
Publication status: Published - 2016
Externally published: Yes
Event: 25th International Joint Conference on Artificial Intelligence, IJCAI 2016 - New York, United States
Duration: 9 Jul 2016 – 15 Jul 2016

Conference

Conference: 25th International Joint Conference on Artificial Intelligence, IJCAI 2016
Country/Territory: United States
City: New York
Period: 9/07/16 – 15/07/16
