Extracting sentence segments for text summarization: a machine learning approach
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 23rd annual international ACM SIGIR conference on Research and development in information retrieval
Athens, Greece
Pages: 152-159
Year of Publication: 2000
ISBN: 1-58113-226-3
Authors
Wesley T. Chuang  Computer Science Department, UCLA, Los Angeles, CA and HRL Laboratories, LLC, 3011 Malibu Canyon Road, Malibu, CA
Jihoon Yang  HRL Laboratories, LLC, 3011 Malibu Canyon Road, Malibu, CA
Sponsors
Athens University of Economics and Business
Greek Computer Society
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/345508.345566

ABSTRACT

With the proliferation of the Internet and the huge amount of data it transfers, text summarization is becoming more important. We present an approach to the design of an automatic text summarizer that generates a summary by extracting sentence segments. First, sentences are broken into segments by special cue markers. Each segment is represented by a set of predefined features (e.g. location of the segment, average term frequencies of the words occurring in the segment, number of title words in the segment, and the like). Then a supervised learning algorithm is used to train the summarizer to extract important sentence segments, based on the feature vector. Results of experiments on U.S. patents indicate that the performance of the proposed approach compares very favorably with other approaches (including Microsoft Word summarizer) in terms of precision, recall, and classification accuracy.
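The segmentation and feature-extraction steps described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the cue-marker list, the function names, and the exact feature definitions are assumptions, since the abstract names the features only informally and does not list the markers.

```python
import re
from collections import Counter

# Hypothetical cue markers; the abstract does not list the ones actually used.
CUE_MARKERS = ["because", "although", "but", "however", "while"]

# Split on whitespace that is immediately followed by a cue marker,
# so each marker stays attached to the segment it opens.
_SPLIT_RE = re.compile(
    r"\s+(?=(?:" + "|".join(CUE_MARKERS) + r")\b)", re.IGNORECASE
)


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_segments(sentence):
    """Break one sentence into segments at the assumed cue markers."""
    return [part.strip() for part in _SPLIT_RE.split(sentence) if part.strip()]


def extract_features(title, sentences):
    """Return one feature dict per segment (assumed feature definitions)."""
    segments = [seg for s in sentences for seg in split_segments(s)]
    doc_tf = Counter(tok for seg in segments for tok in tokenize(seg))
    title_words = set(tokenize(title))
    rows = []
    for i, seg in enumerate(segments):
        toks = tokenize(seg)
        rows.append({
            "segment": seg,
            # Relative location of the segment in the document, in [0, 1].
            "location": i / max(len(segments) - 1, 1),
            # Mean document-level frequency of the segment's words.
            "avg_tf": sum(doc_tf[t] for t in toks) / len(toks) if toks else 0.0,
            # Number of segment words that also occur in the title.
            "title_words": sum(1 for t in toks if t in title_words),
        })
    return rows


if __name__ == "__main__":
    demo = extract_features(
        "Leak-proof seal device",
        ["The device is cheap, but it fails often because the seal leaks."],
    )
    for row in demo:
        print(row)
```

Each resulting feature vector, paired with a label saying whether a human judged the segment summary-worthy, would then serve as one training example for whichever supervised learner is chosen.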

