Access Restriction
Subscribed

Author Moustakides, George V ♦ Verykios, Vassilios S
Source ACM Digital Library
Content type Text
Publisher Association for Computing Machinery (ACM)
File Format PDF
Copyright Year ©2009
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Subject Keyword Record-linkage ♦ Duplicate detection ♦ Optimal stopping
Abstract Record-linkage is the process of identifying whether two separate records refer to the same real-world entity when some elements of the record’s identifying information (attributes) agree and others disagree. Existing record-linkage decision methodologies use the outcomes from the comparisons of the whole set of attributes. Here, we propose an alternative scheme that assesses the attributes sequentially, allowing for a decision to be made at any attribute’s comparison stage, and thus before exhausting all available attributes. The scheme we develop is optimum in that it minimizes a well-defined average cost criterion, while the corresponding optimum solution can be easily mapped into a decision tree to facilitate the record-linkage decision process. Experimental results on real datasets indicate the superiority of our methodology compared to existing approaches.
ISSN 19361955
Age Range 18 to 22 years ♦ above 22 years
Educational Use Research
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2009-09-01
Publisher Place New York
e-ISSN 19361963
Journal Journal of Data and Information Quality (JDIQ)
Volume Number 1
Issue Number 2
Page Count 34
Starting Page 1
Ending Page 34

