NokeaRM

employing non-key attributes in record matching

Qiang Yang, Zhixu Li*, Jun Jiang, Pengpeng Zhao, Guanfeng Liu, An Liu, Jia Zhu

*Corresponding author for this work

Research output: Chapter in Book/Report/Conference proceeding › Conference proceeding contribution

4 Citations (Scopus)

Abstract

Record Matching (RM) aims to find pairs of instances that refer to the same entity across relational tables. Existing RM methods mainly work on key attribute values and neglect the potential usefulness of non-key attribute values. As a result, when two instances referring to the same entity do not have similar key attribute values, they are unlikely to be linked as an instance pair. However, the two instances may share important non-key attribute values that can also help identify the relationship between them. With this intuition, we propose to employ non-key attributes in RM. Specifically, we propose a rule-based algorithm built on a tree-like structure, which not only handles noisy and missing values but also greatly improves efficiency by finding matched instances or filtering out unmatched instances as early as possible. Experimental results on several data sets demonstrate that our method outperforms existing RM methods, reaching higher precision and recall. In addition, the proposed techniques greatly improve the efficiency of a baseline.

Original language: English
Title of host publication: Web-Age Information Management
Subtitle of host publication: 16th International Conference, WAIM 2015, Proceedings
Editors: Xin Luna Dong, Xiaohui Yu, Jian Li, Yizhou Sun
Place of Publication: Cham, Switzerland
Publisher: Springer, Springer Nature
Pages: 438-442
Number of pages: 5
ISBN (Electronic): 9783319210421
ISBN (Print): 9783319210414
DOIs: https://doi.org/10.1007/978-3-319-21042-1_36
Publication status: Published - 2015
Externally published: Yes
Event: 16th International Conference on Web-Age Information Management, WAIM 2015 - Qingdao, China
Duration: 8 Jun 2015 → 10 Jun 2015

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9098
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 16th International Conference on Web-Age Information Management, WAIM 2015
Country: China
City: Qingdao
Period: 8/06/15 → 10/06/15

Keywords

  • Algorithm
  • Non-key attribute
  • Record matching

Cite this

Yang, Q., Li, Z., Jiang, J., Zhao, P., Liu, G., Liu, A., & Zhu, J. (2015). NokeaRM: employing non-key attributes in record matching. In X. L. Dong, X. Yu, J. Li, & Y. Sun (Eds.), Web-Age Information Management: 16th International Conference, WAIM 2015, Proceedings (pp. 438-442). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 9098). Cham, Switzerland: Springer, Springer Nature. https://doi.org/10.1007/978-3-319-21042-1_36