Access Restriction Open

Author Xin, Xin ♦ Li, Juanzi ♦ Tang, Jie ♦ Luo, Qiong
Source CiteSeerX
Content type Text
File Format PDF
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Subject Keyword Unified Approach ♦ Multiple Web Page ♦ Logical Constraint ♦ Information Extraction Task ♦ Complex Structural Dependency ♦ Semantic Information ♦ Constrained Inference Process ♦ Constrained Viterbi Algorithm ♦ Traditional Information Extraction Task ♦ Function Block ♦ Logical Error ♦ Constrained Hierarchical Conditional Random Field ♦ Use-oriented Semantic Academic Conference Calendar ♦ Real World Conference Data ♦ Semantic Conference Data ♦ Prototype System ♦ Semantic Web ♦ Hierarchical Conditional Random Field ♦ Academic Conference Homepage ♦ Inference Process ♦ Experimental Result ♦ Conference Function Page
Description We address the problem of academic conference homepage understanding for the Semantic Web. This problem consists of three labeling tasks: labeling conference function pages, function blocks, and attributes. Unlike traditional information extraction tasks, the data in academic conference homepages has complex structural dependencies across multiple Web pages. In addition, there are logical constraints in the data. In this paper, we propose a unified approach, Constrained Hierarchical Conditional Random Fields, to accomplish the three labeling tasks simultaneously. In this approach, complex structural dependencies can be well described. Also, the constrained Viterbi algorithm in the inference process can avoid logical errors. Experimental results on real-world conference data have demonstrated that this approach performs better than cascaded labeling methods by 3.6% in F1-measure and that the constrained inference process can improve the accuracy by 14.3%. Based on the proposed approach, we develop a prototype system of a use-oriented semantic academic conference calendar. The user simply needs to specify which conferences he/she is interested in. Subsequently, the system finds, extracts, and updates the semantic information from the Web, and then builds a calendar automatically for the user. The semantic conference data can be used in other applications, such as finding sponsors and finding experts. The proposed approach can be used in other information extraction tasks as well.
Educational Role Student ♦ Teacher
Age Range above 22 years
Educational Use Research
Education Level UG and PG ♦ Career/Technical Study
Learning Resource Type Article
Publisher Date 2008-01-01
Publisher Institution In Proc. of the 17th ACM Conference on Information and Knowledge Management