ABSTRACT
A novel method for simultaneous keyphrase extraction and generic text summarization is proposed by modeling text documents as weighted undirected and weighted bipartite graphs. Spectral graph clustering algorithms are used to partition the sentences of a document into topical groups, with sentence link priors exploited to enhance clustering quality. Within each topical group, saliency scores for keyphrases and sentences are generated based on a mutual reinforcement principle. The keyphrases and sentences are then ranked according to their saliency scores and selected for inclusion in the top keyphrase list and the summaries of the document. The idea of building a hierarchy of summaries that captures different levels of granularity is also briefly discussed. Our method is illustrated using several examples from news articles, news broadcast transcripts and web documents.
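The mutual reinforcement step can be illustrated with a small sketch. The abstract does not give the update rule, so the code below assumes one common reading: a keyphrase is salient if it is linked to salient sentences, a sentence is salient if it is linked to salient keyphrases, and the scores are found by alternating normalized updates over the weighted bipartite graph. The function name `mutual_reinforcement`, the helper `_normalize`, the parameters, and the toy weight matrix are all hypothetical and are not taken from the paper.

```python
import math


def _normalize(vec):
    """Scale a vector to unit Euclidean length (left unchanged if all zeros)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def mutual_reinforcement(weights, iterations=100, tol=1e-10):
    """Alternate score updates on a keyphrase-sentence bipartite graph.

    weights[i][j] is the (assumed non-negative) edge weight between
    keyphrase i and sentence j. Returns (keyphrase_scores, sentence_scores).
    This is an illustrative sketch, not the paper's exact procedure.
    """
    n_terms = len(weights)
    n_sents = len(weights[0])
    u = [1.0] * n_terms  # keyphrase saliency scores
    v = [1.0] * n_sents  # sentence saliency scores
    for _ in range(iterations):
        # A sentence inherits saliency from the keyphrases it contains.
        new_v = _normalize(
            [sum(weights[i][j] * u[i] for i in range(n_terms))
             for j in range(n_sents)]
        )
        # A keyphrase inherits saliency from the sentences it appears in.
        new_u = _normalize(
            [sum(weights[i][j] * new_v[j] for j in range(n_sents))
             for i in range(n_terms)]
        )
        delta = max(abs(a - b) for a, b in zip(new_u, u))
        u, v = new_u, new_v
        if delta < tol:
            break
    return u, v


# Toy graph: 3 keyphrases (rows) by 3 sentences (columns), made-up weights.
toy_weights = [
    [2.0, 1.0, 0.0],
    [1.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
]
term_scores, sentence_scores = mutual_reinforcement(toy_weights)
term_ranking = sorted(range(len(term_scores)), key=lambda i: -term_scores[i])
sentence_ranking = sorted(
    range(len(sentence_scores)), key=lambda j: -sentence_scores[j]
)
```

On this toy graph the first keyphrase and the first sentence rank highest, because they carry the heaviest mutual links. Under the stated assumption, the alternating updates amount to a power iteration toward the leading singular vectors of the weight matrix, which is why the isolated third keyphrase and sentence end up with scores near zero.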