Author Schenker, A. ♦ Last, M. ♦ Kandel, A.
Sponsorship Int. Fuzzy Syst. Assoc.
Source IEEE Xplore Digital Library
Content type Text
Publisher Institute of Electrical and Electronics Engineers, Inc. (IEEE)
File Format PDF
Copyright Year ©2001
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Special computer methods
Subject Keyword Clustering algorithms ♦ Web search ♦ Data mining ♦ Natural language processing ♦ Systems engineering and theory ♦ Search engines ♦ Content based retrieval ♦ Knowledge representation ♦ Information retrieval ♦ Web mining
Abstract In this paper we introduce a novel class hierarchy construction algorithm (CHCA) for creating hierarchical clusterings of Web documents. Unlike most clustering methods, CHCA operates on nominal data (the words occurring in each document), and it differs from other hierarchical clustering techniques in that it uses the object-oriented concept of inheritance to create the parent/child relationship between clusters. A prototype system has been developed using CHCA to create cluster hierarchies from Web search results returned by conventional search engines. CHCA, without any guidance, creates term-based clusters from the contents of the retrieved pages and assigns each page to a cluster; the clusters correspond to topics and sub-topics in the investigated domain. The performance of our system is compared with that of a similar Web search clustering system (Vivisimo).
Description Author affiliation: Dept. of Comput. Sci. & Eng., Univ. of South Florida, Tampa, FL, USA (Schenker, A.)
ISBN 0780370783
Educational Role Student ♦ Teacher
Age Range above 22 years
Educational Use Research ♦ Reading
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2001-07-25
Publisher Place Canada
Rights Holder Institute of Electrical and Electronics Engineers, Inc. (IEEE)
Size 596.77 kB
Page Count 6
Starting Page 3076
Ending Page 3081
