Mining top-k minimal redundancy frequent patterns over uncertain databases

Haishuai Wang, Peng Zhang, Jia Wu*, Shirui Pan

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding; Conference proceeding contribution; peer-review

Abstract

Frequent pattern mining from uncertain data has received close attention because most real-life databases contain data with uncertainty. Several approaches have been proposed for mining high-significance frequent itemsets over uncertain data; however, previous algorithms yield many redundant frequent itemsets and require an appropriate user-specified threshold, which is difficult for users to set. In this paper, we formally define the problem of top-k minimal redundancy probabilistic frequent pattern mining, which aims to identify the top-k patterns that have both high significance and low redundancy in uncertain data. We first design an uncertain pattern correlation measure based on the Pearson correlation coefficient, which takes pattern uncertainty into account. We then present a new algorithm, UTFP, to mine top-k minimal redundancy frequent patterns of length no less than a minimum length min_l without setting a threshold. We further propose a set of strategies to prune and reduce the search space. Experimental results demonstrate that the proposed algorithm performs well at finding top-k frequent patterns with low redundancy on probabilistic data. Our method represents the first research endeavor for top-k correlated pattern mining on probabilistic data.
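The ideas named in the abstract can be illustrated with a small sketch. This is not the UTFP algorithm, whose details and pruning strategies are not given here. It assumes that items occur independently within a transaction, that significance is the expected support, that redundancy is the positive part of the Pearson correlation between two patterns' per-transaction probabilities, and that a greedy score of support times (1 - redundancy) is a reasonable stand-in. All function names and the toy database are hypothetical, and candidates are enumerated by brute force.

```python
from itertools import combinations
from math import prod, sqrt


def pattern_probs(db, pattern):
    """Per-transaction probability that all items of `pattern` occur
    (assumption: items are independent within a transaction)."""
    return [prod(t.get(item, 0.0) for item in pattern) for t in db]


def expected_support(db, pattern):
    """Expected number of transactions containing `pattern`."""
    return sum(pattern_probs(db, pattern))


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if one is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def top_k_low_redundancy(db, k, min_l):
    """Greedy illustration: repeatedly pick the pattern of length >= min_l with
    the best expected support, discounted by its correlation with earlier picks.
    The scoring rule is an assumption, not the paper's definition."""
    items = sorted({item for t in db for item in t})
    candidates = [frozenset(c)
                  for n in range(min_l, len(items) + 1)
                  for c in combinations(items, n)]
    probs = {p: pattern_probs(db, p) for p in candidates}
    support = {p: sum(v) for p, v in probs.items()}
    selected = []
    while candidates and len(selected) < k:
        def score(p):
            # Only positive correlation counts as redundancy in this sketch.
            redundancy = max((max(0.0, pearson(probs[p], probs[q]))
                              for q in selected), default=0.0)
            return support[p] * (1.0 - redundancy)
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected


# Toy uncertain database: each transaction maps an item to its existence probability.
db = [
    {"a": 0.9, "b": 0.8, "c": 0.1},
    {"a": 0.8, "b": 0.9, "c": 0.2},
    {"a": 0.1, "c": 0.9, "d": 0.8},
    {"a": 0.2, "c": 0.8, "d": 0.9},
]
result = top_k_low_redundancy(db, k=2, min_l=2)
```

On this toy database the two selected patterns are {a, b} and {c, d}: both have the highest expected support, and they occur in different transactions, so neither is redundant with respect to the other. Note that no support threshold is supplied; only k and min_l are.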

Original language: English
Title of host publication: Neural Information Processing
Subtitle of host publication: 22nd International Conference, ICONIP 2015, Proceedings, Part IV
Editors: Sabri Arik, Tingwen Huang, Weng Kin Lai, Qingshan Liu
Place of Publication: Cham
Publisher: Springer-VDI-Verlag GmbH & Co. KG
Pages: 111-119
Number of pages: 9
ISBN (Electronic): 9783319265612
ISBN (Print): 9783319265605
Publication status: Published - 1 Jan 2015
Externally published: Yes
Event: 22nd International Conference on Neural Information Processing, ICONIP 2015 - Istanbul, Turkey
Duration: 9 Nov 2015 to 12 Nov 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9492
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 22nd International Conference on Neural Information Processing, ICONIP 2015
Country/Territory: Turkey
City: Istanbul
Period: 9/11/15 to 12/11/15

Keywords

  • Frequent patterns
  • Redundancy
  • Top-k
  • Uncertain
