Generalizing truth discovery by incorporating multi-truth features

Xiu Susie Fang*, Xianzhi Wang, Quan Z. Sheng, Lina Yao

*Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

Truth discovery is a fundamental technique for resolving conflicts among the information provided by different data sources by detecting the true values. Traditional methods assume that each data item has only one true value and therefore cannot handle cases where a data item has multiple true values (i.e., multi-value truth). In this work, we target this new challenge and propose a generalized Bayesian framework that comprehensively incorporates the features of multi-value truth for accurate and efficient multi-source data integration. We identify three key features of multi-value truth, called source-value mapping, differentiated mutual exclusion, and complicated source dependency, and exploit them to better solve the problem. Specifically, sources and values are aggregated based on their mapping to reduce the problem scale, the exclusive relations between values are quantified to reflect the effect of multi-truth, and a fine-grained copy detection method is proposed to deal with complicated source dependency. The data preference of the model is also incorporated for fast parameter configuration. Experimental results on real-world and large-scale synthetic datasets demonstrate the effectiveness of our approach, with less execution time and an F1 score that is on average 5% higher than that of the latest method.
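To make the multi-truth setting concrete, the following is a minimal illustrative sketch, not the framework proposed in the paper. It assumes a toy input in which each source claims a set of values for a single data item, groups sources that claim exactly the same set (one simple way to shrink the problem, loosely in the spirit of aggregating by source-value mapping), and computes F1 for a predicted value set. The names `group_sources_by_claim`, `f1_score`, and the example sources are hypothetical.

```python
from collections import defaultdict


def group_sources_by_claim(claims):
    """Group sources that claim exactly the same value set for one data item.

    `claims` maps a source name to the set of values it provides.
    Assumption: identical claim sets can be treated as one group.
    """
    groups = defaultdict(list)
    for source, values in claims.items():
        groups[frozenset(values)].append(source)
    return dict(groups)


def f1_score(predicted, truth):
    """F1 of a predicted value set against the true value set."""
    predicted, truth = set(predicted), set(truth)
    true_positives = len(predicted & truth)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(truth)
    return 2 * precision * recall / (precision + recall)


# Toy data: four hypothetical sources listing the authors of one book.
claims = {
    "s1": {"Alice", "Bob"},
    "s2": {"Alice", "Bob"},
    "s3": {"Alice"},
    "s4": {"Carol"},
}
groups = group_sources_by_claim(claims)
```

Here the four sources collapse into three groups, and a single-truth method that returns only `{"Alice"}` when the truth is `{"Alice", "Bob"}` has perfect precision but only half the recall, which is the gap a multi-truth method aims to close.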

Original language: English
Pages (from-to): 1557-1583
Number of pages: 27
Journal: Computing
Volume: 106
Issue number: 5
Publication status: Published - May 2024

Keywords

  • Truth discovery
  • Multi-truth features
  • Bayesian model
  • Source dependence
