An ensemble approach for better truth discovery

Xiu Susie Fang*, Quan Z. Sheng, Xianzhi Wang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

3 Citations (Scopus)

Abstract

Truth discovery is a hot research topic in the Big Data era, with the goal of identifying true values from the conflicting data provided by multiple sources on the same data items. Previously, many methods have been proposed to tackle this issue. However, none of the existing methods is a clear winner that consistently outperforms the others due to the varied characteristics of different methods. In addition, in some cases, an improved method may not even beat its original version as a result of the bias introduced by limited ground truths or different features of the applied datasets. To realize an approach that achieves better and robust overall performance, we propose to fully leverage the advantages of existing methods by extracting truth from the prediction results of these existing truth discovery methods. In particular, we first distinguish between the single-truth and multi-truth discovery problems and formally define the ensemble truth discovery problem. Then, we analyze the feasibility of the ensemble approach, and derive two models, i.e., serial model and parallel model, to implement the approach, and to further tackle the above two types of truth discovery problems. Extensive experiments over three large real-world datasets and various synthetic datasets demonstrate the effectiveness of our approach.
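As a rough illustration of the idea of extracting truth from the outputs of several existing methods, the sketch below combines per-method predictions by voting. It is only a minimal sketch under stated assumptions: the abstract does not describe how the serial or parallel model works internally, so plain majority voting is a stand-in technique, and the function names, the `threshold` parameter, and the input layout (method name mapped to item mapped to predicted value or values) are all hypothetical.

```python
from collections import Counter


def ensemble_single_truth(predictions):
    """Pick one value per item by majority vote over base-method outputs.

    predictions: dict mapping method name -> dict mapping item -> value.
    NOTE: majority voting is an assumed stand-in, not the paper's model.
    """
    votes = {}
    for method_output in predictions.values():
        for item, value in method_output.items():
            votes.setdefault(item, Counter())[value] += 1
    # For each item, keep the value predicted by the most methods.
    return {item: counter.most_common(1)[0][0] for item, counter in votes.items()}


def ensemble_multi_truth(predictions, threshold=0.5):
    """Keep every value that more than `threshold` of the methods predict.

    predictions: dict mapping method name -> dict mapping item -> iterable
    of values. `threshold` is a hypothetical tuning parameter.
    """
    n_methods = len(predictions)
    votes = {}
    for method_output in predictions.values():
        for item, values in method_output.items():
            # set() ensures each method votes at most once per value.
            votes.setdefault(item, Counter()).update(set(values))
    return {
        item: {v for v, c in counter.items() if c / n_methods > threshold}
        for item, counter in votes.items()
    }


if __name__ == "__main__":
    single = {
        "m1": {"a": "x", "b": "y"},
        "m2": {"a": "x", "b": "z"},
        "m3": {"a": "w", "b": "z"},
    }
    print(ensemble_single_truth(single))

    multi = {
        "m1": {"book": ["A", "B"]},
        "m2": {"book": ["A"]},
        "m3": {"book": ["A", "B", "C"]},
    }
    print(ensemble_multi_truth(multi))
```

The two functions mirror the single-truth and multi-truth distinction drawn in the abstract: the first returns exactly one value per data item, while the second may return several values for the same item.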

Original language: English
Title of host publication: Advanced data mining and applications
Subtitle of host publication: 12th International Conference, ADMA 2016, Gold Coast, QLD, Australia, December 12–15, 2016, proceedings
Editors: Jinyan Li, Xue Li, Shuliang Wang, Jianxin Li, Quan Z. Sheng
Place of Publication: Cham
Publisher: Springer, Springer Nature
Pages: 298-311
Number of pages: 14
ISBN (Electronic): 9783319495866
ISBN (Print): 9783319495859
DOIs
Publication status: Published - 2016
Externally published: Yes
Event: 12th International Conference on Advanced Data Mining and Applications, ADMA 2016 - Gold Coast, Australia
Duration: 12 Dec 2016 to 15 Dec 2016

Publication series

Name: Lecture Notes in Artificial Intelligence
Publisher: Springer
Volume: 10086
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 12th International Conference on Advanced Data Mining and Applications, ADMA 2016
Country/Territory: Australia
City: Gold Coast
Period: 12/12/16 to 15/12/16

Keywords

  • Big data
  • Ensemble approach
  • Multi-truths
  • Truth discovery
