Access Restriction Subscribed

Author Wensheng Wu ♦ AnHai Doan ♦ Clement Yu
Source IEEE Xplore Digital Library
Content type Text
Publisher Institute of Electrical and Electronics Engineers, Inc. (IEEE)
File Format PDF
Copyright Year ©2005
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Computer programming, programs & data
Subject Keyword Merging ♦ Databases ♦ Constraint optimization ♦ Approximation algorithms ♦ Clustering algorithms ♦ Vocabulary
Abstract We consider the problem of integrating a large number of interface schemas over the deep Web. The scale of the problem and the diversity of the sources present serious challenges to the conventional manual or rule-based approaches to schema integration. To address these challenges, we propose a novel formulation of schema integration as an optimization problem, with the objective of maximally satisfying the constraints given by individual schemas. Since the optimization problem can be shown to be NP-complete, we develop a novel approximation algorithm, LMax, which builds the unified schema via recursive applications of clustering aggregation. We further extend LMax to handle the irregularities frequently occurring among the interface schemas. Extensive evaluation on real-world data sets shows the effectiveness of our approach.
Description Author affiliation: Illinois Univ., Urbana, IL, USA (Wensheng Wu; AnHai Doan)
ISBN 0769522785
ISSN 15504786
Educational Role Student ♦ Teacher
Age Range above 22 years
Educational Use Research ♦ Reading
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2005-11-27
Publisher Place USA
Rights Holder Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Size 128.21 kB

