Access Restriction
Open

Author Kotidis, Yannis ♦ Marian, Amélie
Source CiteSeerX
Content type Text
File Format PDF
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Subject Keyword Quality Score ♦ Real Data ♦ Mjp Framework ♦ Multiple Database ♦ Account Data Quality ♦ Individual Data Path ♦ Integrity Constraint ♦ Virtual Integration Prototype Testbed ♦ Field Value ♦ Multiple Data Path ♦ Corroborating Evidence ♦ Novel Technique ♦ Candidate Answer ♦ High Quality Information ♦ Poor Quality Data ♦ Functional Dependency ♦ Multiple Join Path
Description We propose the Multiple Join Path (MJP) framework for obtaining high-quality information by linking fields across multiple databases when the underlying databases contain poor-quality data, characterized by violations of integrity constraints such as keys and functional dependencies within and across databases. MJP associates quality scores with candidate answers by first scoring individual data paths between a pair of field values, taking into account data quality with respect to specified integrity constraints, and then agglomerating scores across multiple data paths that serve as corroborating evidence for a candidate answer. We address the problem of finding the top-few (highest-quality) answers in the MJP framework using novel techniques, and demonstrate the utility of our techniques using real data and our Virtual Integration Prototype testbed.
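The two-stage scoring described above (score each data path, then combine the scores of all paths that reach the same candidate answer) can be illustrated with a minimal sketch. It is not the paper's method: the record does not give the scoring functions, so the product rule for a single path, the noisy-or rule for combining paths, and all names and sample values below are assumptions chosen only for illustration.

```python
import heapq
from collections import defaultdict
from math import prod


def path_score(edge_qualities):
    # Assumption: a path is only as trustworthy as the product of the
    # quality of each join step, with every quality value in [0, 1].
    return prod(edge_qualities)


def combine(scores):
    # Assumption: noisy-or aggregation, so each additional corroborating
    # path can only raise the score of a candidate answer.
    return 1.0 - prod(1.0 - s for s in scores)


def top_answers(paths, k):
    # Group path scores by the candidate answer each path leads to.
    by_answer = defaultdict(list)
    for answer, edge_qualities in paths:
        by_answer[answer].append(path_score(edge_qualities))
    # Aggregate per answer, then keep the k highest-scoring answers.
    scored = {answer: combine(s) for answer, s in by_answer.items()}
    return heapq.nlargest(k, scored.items(), key=lambda item: item[1])


# Hypothetical data: (candidate answer, quality of each join step on the path).
paths = [
    ("alice@example.com", [0.9, 0.8]),
    ("alice@example.com", [0.7, 0.8]),
    ("bob@example.com", [0.9, 0.9]),
]

best = top_answers(paths, 1)
```

In this toy data the strongest single path leads to `bob@example.com`, yet `alice@example.com` ranks first because two independent paths corroborate it, which is the effect the description attributes to combining evidence across multiple data paths.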
Educational Role Student ♦ Teacher
Age Range above 22 years
Educational Use Research
Education Level UG and PG ♦ Career/Technical Study
Learning Resource Type Article
Publisher Date 2006-01-01
Publisher Institution In Proc. of the First International VLDB Workshop on Clean Databases (CleanDB’06)