Generic summarization and keyphrase extraction using mutual reinforcement principle and sentence clustering
Source: Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Tampere, Finland
SESSION: Summarization
Pages: 113 - 120  
Year of Publication: 2002
ISBN:1-58113-561-0
Author
Hongyuan Zha  Pennsylvania State University, PA
Sponsor
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/564376.564398

ABSTRACT

A novel method for simultaneous keyphrase extraction and generic text summarization is proposed by modeling text documents as weighted undirected and weighted bipartite graphs. Spectral graph clustering algorithms are used to partition the sentences of a document into topical groups, with sentence link priors exploited to enhance clustering quality. Within each topical group, saliency scores for keyphrases and sentences are generated based on a mutual reinforcement principle. The keyphrases and sentences are then ranked according to their saliency scores and selected for inclusion in the top keyphrase list and the summaries of the document. The idea of building a hierarchy of summaries that captures different levels of granularity is also briefly discussed. Our method is illustrated using several examples from news articles, news broadcast transcripts and web documents.
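
The mutual reinforcement principle can be read as follows: a keyphrase is salient if it occurs in salient sentences, and a sentence is salient if it contains salient keyphrases. The sketch below is one minimal way to compute such scores. It assumes the bipartite graph of a single topical group is given as a nonnegative term-by-sentence weight matrix W, and that scores are updated by alternating u ∝ Wv and v ∝ Wᵀu until they stabilize. The function name, the toy matrix, and the stopping rule are illustrative assumptions, not details taken from the paper.

```python
import math


def mutual_reinforcement(W, iterations=100, tol=1e-10):
    """Alternate u <- W v and v <- W^T u, normalizing after each step.

    W[i][j] >= 0 is the (assumed) weight of the edge between term i and
    sentence j. Returns (u, v): saliency scores for terms and sentences.
    """
    n_terms, n_sents = len(W), len(W[0])

    def normalize(x):
        # Scale to unit Euclidean length; leave an all-zero vector unchanged.
        norm = math.sqrt(sum(a * a for a in x))
        return [a / norm for a in x] if norm > 0 else x

    u = normalize([1.0] * n_terms)
    v = normalize([1.0] * n_sents)
    for _ in range(iterations):
        # A term is salient if it is linked to salient sentences.
        new_u = normalize(
            [sum(W[i][j] * v[j] for j in range(n_sents)) for i in range(n_terms)]
        )
        # A sentence is salient if it is linked to salient terms.
        new_v = normalize(
            [sum(W[i][j] * new_u[i] for i in range(n_terms)) for j in range(n_sents)]
        )
        delta = max(abs(a - b) for a, b in zip(new_u + new_v, u + v))
        u, v = new_u, new_v
        if delta < tol:
            break
    return u, v


# Hypothetical toy graph: 3 terms (rows) by 3 sentences (columns).
W = [
    [2.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
term_scores, sentence_scores = mutual_reinforcement(W)

# Rank sentences by saliency, highest first.
ranking = sorted(range(len(sentence_scores)), key=lambda j: -sentence_scores[j])
```

With this update rule the scores approach the leading left and right singular vectors of W, so ranking by score amounts to ranking by the entries of those vectors. In the toy graph, the isolated third term and sentence end up with scores near zero because they are not reinforced by the heavier-weighted group.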

