Author Zhang, Yizhong ♦ Zhao, Mingsheng ♦ Wu, Youshou
Source CiteSeerX
Content type Text
File Format PDF
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Subject Keyword Web Page ♦ Automatic Classification ♦ Neural Network ♦ Keyword Extraction ♦ New Method ♦ Self-organizing Feature Map ♦ HTML Tag Information ♦ Field Information ♦ HTML Tag ♦ Web Page Classification ♦ Feature Corresponds ♦ Compound Word Processing ♦ Text Keywords ♦ Important Factor ♦ Input Vector ♦ Classification Goal ♦ Self-organized Neural Network Classifier ♦ Final Set ♦ Neural Network Algorithm ♦ Statistical Reduction Procedure ♦ Common Knowledge ♦ Different Class ♦ Tour Field ♦ Proper Neural Network ♦ New Set
Abstract Classifying web pages is an important task. A technique for extracting field information as common knowledge may also be needed, and compound word processing is one of the important factors in keyword extraction from web pages. In this method, the tour fields are first defined systematically, and the information related to each field is extracted. A new feature extraction method is considered that properly incorporates three kinds of information: text, HTML tags, and hyperlinks. Accordingly, this paper presents a neural network algorithm, the self-organizing feature map, for the automatic classification of web pages. The proposed approach is based on a new set of features combined with a self-organized neural network classifier. The set of features corresponds to the page contents, is selected using a statistical reduction procedure, and provides text keyword, hyperlink, and HTML tag information. The final set of features is then used as the input vector to a suitable neural network to achieve the classification goal, and web pages are assigned to different classes. A series of experiments was conducted to evaluate the performance of the approach. The results show that it is quite promising.
Educational Role Student ♦ Teacher
Age Range Above 22 years
Educational Use Research
Education Level UG and PG ♦ Career/Technical Study
Learning Resource Type Article
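
Illustrative sketch (not part of the original record). The abstract describes a pipeline in which text keywords, HTML tags, and hyperlinks are combined into one input vector that is fed to a self-organizing feature map. The Python sketch below shows one way such a pipeline could look, using only the standard library. It is not the authors' implementation: the keyword list, the tag list, the one-dimensional map, the Gaussian neighbourhood, and all parameter values are assumptions chosen for illustration, and the paper's statistical reduction procedure and compound word processing are not reproduced (a fixed, hand-picked feature list stands in for them).

```python
import math
import random
from html.parser import HTMLParser


class PageFeatures(HTMLParser):
    """Collects text words, tag names, and hyperlink targets from one page."""

    def __init__(self):
        super().__init__()
        self.words, self.tags, self.links = [], [], []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)
        if tag == "a":
            self.links += [v for k, v in attrs if k == "href" and v]

    def handle_data(self, data):
        self.words += data.lower().split()


def to_vector(html, keywords, tags):
    """Build a unit-length vector: keyword counts, tag counts, link count.

    `keywords` and `tags` are hypothetical, hand-picked feature lists.
    """
    p = PageFeatures()
    p.feed(html)
    p.close()
    vec = [float(p.words.count(k)) for k in keywords]
    vec += [float(p.tags.count(t)) for t in tags]
    vec.append(float(len(p.links)))
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]


def best_unit(weights, vec):
    """Index of the map unit whose weight vector is closest to vec."""
    dists = [sum((w - x) ** 2 for w, x in zip(unit, vec)) for unit in weights]
    return dists.index(min(dists))


def train_som(vectors, n_units=4, epochs=50, lr0=0.5, radius0=1.0, seed=0):
    """Train a 1-D self-organizing map with a Gaussian neighbourhood."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_units)]
    for epoch in range(epochs):
        frac = 1.0 - epoch / epochs          # decays linearly from 1 towards 0
        lr = lr0 * frac                      # learning rate shrinks over time
        radius = max(radius0 * frac, 0.01)   # neighbourhood shrinks over time
        for vec in vectors:
            bmu = best_unit(weights, vec)    # best-matching unit for this page
            for i, unit in enumerate(weights):
                h = math.exp(-((i - bmu) ** 2) / (2 * radius ** 2))
                for d in range(dim):
                    unit[d] += lr * h * (vec[d] - unit[d])
    return weights
```

In use, each page would be converted with to_vector, the map trained on the resulting vectors with train_som, and a new page assigned to the map unit returned by best_unit. Mapping units to named classes would require labelled examples, which this sketch leaves out.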