
Multi-truth discovery while being aware of unbalanced data distribution

Xiu Susie Fang, Quan Z. Sheng, Guohao Sun*, Shan Chang, Hongya Wang, Jian Yang

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution › peer-review

Abstract

Due to information explosion, conflicting data on the same object among multiple sources is ubiquitous on the Web. To resolve these conflicts while estimating source reliability, truth discovery has become a hot topic. However, when multi-value objects are considered, existing approaches overlook the inevitable unbalanced data distribution. In particular, only a few sources make lots of claims while most sources provide only a few claims, which renders the source reliability estimated for “small” sources totally random; likewise, some objects are covered by plenty of sources while others are claimed by only a few sources, which makes the value correctness calculated for “cold” objects unreasonable. To tackle unbalanced data where multi-value objects exist, we propose a confidence-interval-based approach (CIMTD). We estimate source reliability from two aspects, i.e., the ability to claim the correct number of value(s) and the ability to claim the correct specific value(s) on an object. To reflect the real reliability of both “big” and “small” sources, confidence intervals of the enriched estimation are considered. While estimating source reliability, uncertainty degrees are introduced to model object differences. Confidence intervals are also considered to reflect the real uncertainty of both “hot” and “cold” objects. Experimental results on two real-world datasets demonstrate the effectiveness of our approach.
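The effect of source size on reliability estimates can be illustrated with a small sketch. This is not the CIMTD method, whose formulas are not given in the abstract; it substitutes the standard Wilson score interval for a binomial proportion as a stand-in. The function name `wilson_interval` and the claim counts are hypothetical and chosen only for illustration.

```python
import math


def wilson_interval(correct: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for the proportion correct/total.

    Stand-in for a reliability confidence interval; an assumption for
    illustration, not the estimator used by CIMTD.
    """
    if total == 0:
        # No claims at all: the reliability is completely undetermined.
        return (0.0, 1.0)
    p = correct / total
    denom = 1 + z * z / total
    centre = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))


# Hypothetical counts: a "small" source with 3 claims, all correct,
# and a "big" source with 300 claims, 270 of them correct.
small = wilson_interval(3, 3)
big = wilson_interval(270, 300)

print("small source:", small)
print("big source:  ", big)
```

Although the small source has the higher raw accuracy (3/3 versus 270/300), its interval is much wider and its lower bound falls below that of the big source. Ranking sources by an interval bound rather than by the point estimate is one way to avoid over-trusting sources with few claims.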
Original language: English
Title of host publication: 2023 International Joint Conference on Neural Networks (IJCNN)
Place of Publication: Piscataway, NJ
Publisher: Institute of Electrical and Electronics Engineers (IEEE)
Number of pages: 10
ISBN (Electronic): 9781665488679
ISBN (Print): 9781665488686
Publication status: Published - 2023
Event: 2023 International Joint Conference on Neural Networks, IJCNN 2023 - Gold Coast, Australia
Duration: 18 Jun 2023 to 23 Jun 2023

Publication series

ISSN (Print): 2161-4393
ISSN (Electronic): 2161-4407

Conference

Conference: 2023 International Joint Conference on Neural Networks, IJCNN 2023
Country/Territory: Australia
City: Gold Coast
Period: 18/06/23 to 23/06/23
