Access Restriction Open

Author Ishikawa, Kai ♦ Ando, Shinichi ♦ Okumura, Akitoshi
Source CiteSeerX
Content type Text
File Format PDF
Language English
Subject Domain (in DDC) Computer science, information & general works ♦ Data processing & computer science
Subject Keyword Hybrid Text Summarization Method ♦ Open Test Evaluation ♦ Sentence Extraction Sub-task ♦ Newspaper Article ♦ Rectangular Distribution Function ♦ Lead Sentence Extraction Method ♦ Lead Method ♦ Tf Method ♦ Formal Run Test Data ♦ Distribution Function ♦ Ntcir-2 Workshop Tsc Task ♦ Average Value ♦ Tf-based Sentence Extraction Method ♦ Document Summarization ♦ Tf-based Score
Description This paper describes a hybrid text summarization method that combines a TF-based sentence extraction method with a LEAD sentence extraction method. The LEAD method is known to be more effective than other methods for summarizing newspaper articles at lower summarization (output-to-input) ratios. To combine the LEAD method with the TF method, we used a rectangular distribution function that determines the importance of sentences according to their position in a document. With our method, the importance of a sentence is the product of its TF-based score and the value of the distribution function. We conducted an open test evaluation using the formal run test data of the sentence extraction sub-task in the NTCIR-2 Workshop TSC task (30 newspaper articles). The methods were compared by the average F-measure over 10%, 30%, and 50% summaries: 34.1% for the TF method, 39.1% for the LEAD method, and 42.4% for the proposed method. Keywords: TF, LEAD, headline, hybrid, position, distribution.
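Illustrative sketch (not part of the original record): the scoring rule in the description, a TF-based sentence score multiplied by a position-dependent rectangular function, could look roughly like the following. The record does not give the TF formula, the tokenizer, or the width and heights of the rectangle, so `tokenize`, `lead_fraction`, `high`, and `low` are hypothetical choices made only for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Assumption: simple lowercase word tokens; the paper's tokenizer is not stated.
    return re.findall(r"[a-z0-9]+", text.lower())


def rectangular_weight(index, n_sentences, lead_fraction=0.3, high=1.0, low=0.5):
    # Assumption: a two-level step function that favors the leading sentences.
    # The actual width and heights used in the paper are not given in this record.
    cutoff = max(1, math.ceil(n_sentences * lead_fraction))
    return high if index < cutoff else low


def hybrid_scores(sentences, lead_fraction=0.3, high=1.0, low=0.5):
    tokens = [tokenize(s) for s in sentences]
    # Term frequency counted over the whole document.
    tf = Counter(t for toks in tokens for t in toks)
    scores = []
    for i, toks in enumerate(tokens):
        # Assumption: TF score = mean document frequency of the sentence's terms.
        tf_score = sum(tf[t] for t in toks) / len(toks) if toks else 0.0
        weight = rectangular_weight(i, len(sentences), lead_fraction, high, low)
        scores.append(tf_score * weight)
    return scores


def summarize(sentences, ratio=0.3, **kwargs):
    # Keep the top-scoring sentences up to the requested output-to-input ratio,
    # then restore their original document order.
    k = max(1, math.ceil(len(sentences) * ratio))
    scores = hybrid_scores(sentences, **kwargs)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

With `low` set equal to `high` the sketch reduces to plain TF ranking, and with `low=0` it keeps only leading sentences, which is one way to see how a rectangular function can interpolate between the TF and LEAD methods.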
Educational Role Student ♦ Teacher
Age Range Above 22 years
Educational Use Research
Education Level UG and PG ♦ Career/Technical Study
Learning Resource Type Article
Publisher Date 2001-01-01
Publisher Institution In Proceedings of the 2nd National Institute of Informatics Test Collection Information Retrieval (NTCIR) Workshop