Thumbnail
Access Restriction
Subscribed

Author Qi, Xiaoguang ♦ Davison, Brian D.
Source ACM Digital Library
Content type Text
Publisher Association for Computing Machinery (ACM)
File Format PDF
Copyright Year ©2009
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Subject Keyword Categorization ♦ Web mining
Abstract Classification of Web page content is essential to many tasks in Web information retrieval such as maintaining Web directories and focused crawling. The uncontrolled nature of Web content presents additional challenges to Web page classification as compared to traditional text classification, but the interconnected nature of hypertext also provides features that can assist the process. As we review work in Web page classification, we note the importance of these Web-specific features and algorithms, describe state-of-the-art practices, and track the underlying assumptions behind the use of information from neighboring pages.
ISSN 03600300
Age Range 18 to 22 years ♦ above 22 year
Educational Use Research
Education Level UG and PG
Learning Resource Type Article
Publisher Date 2009-02-01
Publisher Place New York
e-ISSN 15577341
Journal ACM Computing Surveys (CSUR)
Volume Number 41
Issue Number 2
Page Count 31
Starting Page 1
Ending Page 31


Open content in new tab

   Open content in new tab
Source: ACM Digital Library