SmartVote: a full-fledged graph-based model for multi-valued truth discovery

Xiu Susie Fang, Quan Z. Sheng, Xianzhi Wang, Dianhui Chu, Anne H. H. Ngu

Research output: Contribution to journal › Article › Research › peer-review

Abstract

In the era of Big Data, truth discovery has emerged as a fundamental research topic, which estimates data veracity by determining the reliability of multiple, often conflicting data sources. Although considerable research efforts have been conducted on this topic, most current approaches assume only one true value for each object. In reality, objects with multiple true values widely exist and the existing approaches that cope with multi-valued objects still lack accuracy. In this paper, we propose a full-fledged graph-based model, SmartVote, which models two types of source relations with additional quantification to precisely estimate source reliability for effective multi-valued truth discovery. Two graphs are constructed and further used to derive different aspects of source reliability (i.e., positive precision and negative precision) via random walk computations. Our model incorporates four important implications, including two types of source relations, object popularity, loose mutual exclusion, and long-tail phenomenon on source coverage, to pursue better accuracy in truth discovery. Empirical studies on two large real-world datasets demonstrate the effectiveness of our approach.
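The abstract states that source reliability is derived via random walk computations on graphs of source relations, but it gives no formulas. As a rough illustration of that general idea only, the following minimal sketch runs a generic random walk with uniform restart (PageRank-style) over a small weighted graph of sources. It is not the SmartVote algorithm: the function name `random_walk_scores`, the `damping` parameter, and the agreement weights are all hypothetical and chosen purely for the example.

```python
def random_walk_scores(weights, damping=0.85, tol=1e-12, max_iter=1000):
    """Stationary scores of a random walk with uniform restart.

    weights[i][j] is a non-negative edge weight from node i to node j.
    This is a generic PageRank-style iteration, used here as an assumed
    stand-in for the paper's (unspecified) random walk computation.
    """
    n = len(weights)
    # Row-normalize the weights; a node with no outgoing weight jumps uniformly.
    transition = []
    for row in weights:
        total = sum(row)
        if total > 0:
            transition.append([w / total for w in row])
        else:
            transition.append([1.0 / n] * n)

    scores = [1.0 / n] * n
    for _ in range(max_iter):
        updated = [
            damping * sum(scores[i] * transition[i][j] for i in range(n))
            + (1.0 - damping) / n
            for j in range(n)
        ]
        converged = sum(abs(u - s) for u, s in zip(updated, scores)) < tol
        scores = updated
        if converged:
            break
    return scores


# Hypothetical data: entry [i][j] counts how often source i agrees with source j.
agreement = [
    [0, 3, 1],
    [3, 0, 1],
    [1, 1, 0],
]
scores = random_walk_scores(agreement)
```

In this toy graph, sources 0 and 1 agree with each other more often than either agrees with source 2, so the walk assigns them higher scores. How SmartVote actually builds its two graphs and turns walk scores into positive and negative precision is described in the paper itself, not here.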

Language: English
Pages: 1855-1885
Number of pages: 31
Journal: World Wide Web
Volume: 22
Issue number: 4
Early online date: 22 Aug 2018
DOI: 10.1007/s11280-018-0629-3
Publication status: Published - Jul 2019

Fingerprint

Big data

Keywords

  • Graph-based model
  • Long-tail phenomenon
  • Multi-valued objects
  • Object popularity
  • Source relations
  • Truth discovery

Cite this

Fang, Xiu Susie; Sheng, Quan Z.; Wang, Xianzhi; Chu, Dianhui; Ngu, Anne H. H. / SmartVote: a full-fledged graph-based model for multi-valued truth discovery. In: World Wide Web. 2019; Vol. 22, No. 4. pp. 1855-1885.
@article{b5993874e6c1465083aaa5e5d8181506,
title = "SmartVote: a full-fledged graph-based model for multi-valued truth discovery",
abstract = "In the era of Big Data, truth discovery has emerged as a fundamental research topic, which estimates data veracity by determining the reliability of multiple, often conflicting data sources. Although considerable research efforts have been conducted on this topic, most current approaches assume only one true value for each object. In reality, objects with multiple true values widely exist and the existing approaches that cope with multi-valued objects still lack accuracy. In this paper, we propose a full-fledged graph-based model, SmartVote, which models two types of source relations with additional quantification to precisely estimate source reliability for effective multi-valued truth discovery. Two graphs are constructed and further used to derive different aspects of source reliability (i.e., positive precision and negative precision) via random walk computations. Our model incorporates four important implications, including two types of source relations, object popularity, loose mutual exclusion, and long-tail phenomenon on source coverage, to pursue better accuracy in truth discovery. Empirical studies on two large real-world datasets demonstrate the effectiveness of our approach.",
keywords = "Graph-based model, Long-tail phenomenon, Multi-valued objects, Object popularity, Source relations, Truth discovery",
author = "Fang, {Xiu Susie} and Sheng, {Quan Z.} and Xianzhi Wang and Dianhui Chu and Ngu, {Anne H. H.}",
year = "2019",
month = "7",
doi = "10.1007/s11280-018-0629-3",
language = "English",
volume = "22",
pages = "1855--1885",
journal = "World Wide Web",
issn = "1386-145X",
publisher = "Springer",
number = "4",

}

SmartVote: a full-fledged graph-based model for multi-valued truth discovery. / Fang, Xiu Susie; Sheng, Quan Z.; Wang, Xianzhi; Chu, Dianhui; Ngu, Anne H. H.

In: World Wide Web, Vol. 22, No. 4, 07.2019, p. 1855-1885.

TY - JOUR

T1 - SmartVote: a full-fledged graph-based model for multi-valued truth discovery

T2 - World Wide Web

AU - Fang, Xiu Susie

AU - Sheng, Quan Z.

AU - Wang, Xianzhi

AU - Chu, Dianhui

AU - Ngu, Anne H. H.

PY - 2019/7

Y1 - 2019/7

AB - In the era of Big Data, truth discovery has emerged as a fundamental research topic, which estimates data veracity by determining the reliability of multiple, often conflicting data sources. Although considerable research efforts have been conducted on this topic, most current approaches assume only one true value for each object. In reality, objects with multiple true values widely exist and the existing approaches that cope with multi-valued objects still lack accuracy. In this paper, we propose a full-fledged graph-based model, SmartVote, which models two types of source relations with additional quantification to precisely estimate source reliability for effective multi-valued truth discovery. Two graphs are constructed and further used to derive different aspects of source reliability (i.e., positive precision and negative precision) via random walk computations. Our model incorporates four important implications, including two types of source relations, object popularity, loose mutual exclusion, and long-tail phenomenon on source coverage, to pursue better accuracy in truth discovery. Empirical studies on two large real-world datasets demonstrate the effectiveness of our approach.

KW - Graph-based model

KW - Long-tail phenomenon

KW - Multi-valued objects

KW - Object popularity

KW - Source relations

KW - Truth discovery

UR - http://www.scopus.com/inward/record.url?scp=85052651655&partnerID=8YFLogxK

UR - http://purl.org/au-research/grants/arc/FT140101247

UR - http://purl.org/au-research/grants/arc/DP180102378

U2 - 10.1007/s11280-018-0629-3

DO - 10.1007/s11280-018-0629-3

M3 - Article

VL - 22

SP - 1855

EP - 1885

JO - World Wide Web

JF - World Wide Web

SN - 1386-145X

IS - 4

ER -