Author Wu, Fei ♦ Weld, Daniel S.
Source CiteSeerX
Content type Text
File Format PDF
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Subject Keyword Full Power ♦ Rich Ontology ♦ Training Data ♦ Machine Learning Problem ♦ Wikipedia Infoboxes ♦ Natural Language Text ♦ Human Volunteer ♦ Wikipedia Infobox-class Schema ♦ Combined Effort ♦ Improved Query Processing ♦ Ontology Refinement ♦ Wikipedia Infobox Ontology ♦ Wikipedia Infobox-class Ontology ♦ Joint-inference Approach ♦ Semantic Knowledge ♦ Machine-harvestable Object-attribute-value Triple ♦ Cleanly-structured Ontology ♦ Subject Descriptor ♦ Markov Logic Network ♦ Present Experiment ♦ Powerful Joint-inference Approach ♦ Autonomous System ♦ Numerous Fact
Description The combined efforts of human volunteers have recently extracted numerous facts from Wikipedia, storing them as machine-harvestable object-attribute-value triples in Wikipedia infoboxes. Machine learning systems, such as Kylin, use these infoboxes as training data, accurately extracting even more semantic knowledge from natural language text. But in order to realize the full power of this information, it must be situated in a cleanly-structured ontology. This paper introduces KOG, an autonomous system for refining Wikipedia’s infobox-class ontology towards this end. We cast the problem of ontology refinement as a machine learning problem and solve it using both SVMs and a more powerful joint-inference approach expressed in Markov Logic Networks. We present experiments demonstrating the superiority of the joint-inference approach and evaluating other aspects of our system. Using these techniques, we build a rich ontology, integrating Wikipedia’s infobox-class schemata with WordNet. We demonstrate how the resulting ontology may be used to enhance Wikipedia with improved query processing and other features.
Educational Role Student ♦ Teacher
Age Range above 22 years
Educational Use Research
Education Level UG and PG ♦ Career/Technical Study
Learning Resource Type Article
Publisher Institution In: Proceedings of the 17th International Conference on World Wide Web