Multi-document summarization via sentence-level semantic analysis and symmetric matrix factorization
Source
Annual ACM Conference on Research and Development in Information Retrieval archive
Proceedings of the 31st annual international ACM SIGIR conference on Research and development in information retrieval table of contents
Singapore, Singapore
SESSION: Summarization table of contents
Pages: 307-314  
Year of Publication: 2008
ISBN:978-1-60558-164-4
Authors
Dingding Wang  Florida International University, Miami, FL, USA
Tao Li  Florida International University, Miami, FL, USA
Shenghuo Zhu  NEC Labs. America, Inc, Cupertino, CA, USA
Chris Ding  University of Texas at Arlington, Arlington, TX, USA
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1390334.1390387


ABSTRACT

Multi-document summarization aims to create a compressed summary while retaining the main characteristics of the original set of documents. Many approaches use statistics and machine learning techniques to extract sentences from documents. In this paper, we propose a new multi-document summarization framework based on sentence-level semantic analysis and symmetric non-negative matrix factorization. We first calculate sentence-sentence similarities using semantic analysis and construct the similarity matrix. Then symmetric matrix factorization, which has been shown to be equivalent to normalized spectral clustering, is used to group sentences into clusters. Finally, the most informative sentences are selected from each group to form the summary. Experimental results on DUC2005 and DUC2006 data sets demonstrate the improvement of our proposed framework over the implemented existing summarization systems. A further study on the factors that benefit the high performance is also conducted.
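
The three steps named in the abstract (similarity matrix, symmetric non-negative factorization, per-cluster sentence selection) can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: bag-of-words cosine similarity stands in for the paper's sentence-level semantic analysis, the factorization W ≈ HHᵀ uses a commonly cited damped multiplicative update with beta = 0.5, and the "most informative" sentence of a cluster is taken to be the one with the highest total similarity to the other members. All function names and parameters below are hypothetical.

```python
import numpy as np


def cosine_similarity_matrix(sentences):
    """Sentence-sentence similarity from word counts.

    Assumption: a simple stand-in for the paper's semantic analysis.
    """
    vocab = sorted({w for s in sentences for w in s.lower().split()})
    index = {w: i for i, w in enumerate(vocab)}
    X = np.zeros((len(sentences), len(vocab)))
    for row, s in enumerate(sentences):
        for w in s.lower().split():
            X[row, index[w]] += 1.0
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    norms[norms == 0] = 1.0  # avoid dividing by zero for empty sentences
    X = X / norms
    return X @ X.T


def symmetric_nmf(W, k, iters=200, beta=0.5, seed=0, eps=1e-9):
    """Approximate W by H @ H.T with H >= 0.

    Assumption: damped multiplicative update
    H <- H * (1 - beta + beta * (W H) / (H H^T H)).
    """
    rng = np.random.default_rng(seed)
    H = rng.random((W.shape[0], k)) + eps
    for _ in range(iters):
        numer = W @ H
        denom = H @ (H.T @ H) + eps
        H *= (1.0 - beta) + beta * numer / denom
    return H


def summarize(sentences, k):
    """Cluster sentences into at most k groups and pick one per group."""
    W = cosine_similarity_matrix(sentences)
    H = symmetric_nmf(W, k)
    labels = H.argmax(axis=1)  # each sentence joins its strongest cluster
    chosen = []
    for c in range(k):
        members = np.flatnonzero(labels == c)
        if members.size == 0:
            continue  # a cluster may end up empty
        # Assumption: score = total similarity to the cluster's members.
        scores = W[np.ix_(members, members)].sum(axis=1)
        chosen.append(int(members[scores.argmax()]))
    return [sentences[i] for i in sorted(chosen)]
```

Calling `summarize(list_of_sentences, k)` returns at most `k` sentences in their original order. Which sentences are chosen depends on the random initialization of `H`, so results on small inputs should be read as illustrative only.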



