Enhancing diversity, coverage and balance for summarization through structure learning
Source
International World Wide Web Conference archive
Proceedings of the 18th International Conference on World Wide Web
Madrid, Spain
SESSION: Data mining/session: text mining
Pages 71-80  
Year of Publication: 2009
ISBN:978-1-60558-487-4
Authors
Liangda Li  Shanghai Jiao-Tong University, Shanghai, China
Ke Zhou  Shanghai Jiao-Tong University, Shanghai, China
Gui-Rong Xue  Shanghai Jiao-Tong University, Shanghai, China
Hongyuan Zha  Georgia Institute of Technology, Atlanta, GA, USA
Yong Yu  Shanghai Jiao-Tong University, Shanghai, China
Sponsor
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1526709.1526720

ABSTRACT

Document summarization plays an increasingly important role with the exponential growth of documents on the Web. Many supervised and unsupervised approaches have been proposed to generate summaries from documents. However, these approaches seldom consider summary diversity, coverage, and balance simultaneously, although these issues determine the quality of summaries to a large extent. In this paper, we consider extract-based summarization with emphasis on the following three requirements: 1) diversity, which seeks to reduce redundancy among sentences in the summary; 2) sufficient coverage, which focuses on avoiding the loss of the document's main information when generating the summary; and 3) balance, which requires that different aspects of the document have about the same relative importance in the summary. We formulate the extract-based summarization problem as learning a mapping from the set of sentences of a given document to a subset of those sentences that satisfies the above three requirements. The mapping is learned by incorporating several constraints into a structure learning framework; we explore the graph structure of the output variables and employ a structural SVM to solve the resulting optimization problem. Experiments on the DUC2001 data sets demonstrate significant performance improvements in terms of F1 and ROUGE metrics.
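To make the three requirements concrete, the following is a minimal Python sketch of how a candidate extract might be scored for diversity, coverage, and balance. It is an illustration only, not the paper's structural SVM formulation: the whitespace tokenizer, the use of Jaccard similarity, the per-sentence aspect labels in `aspect_of`, and the function names `diversity`, `coverage`, and `balance` are all assumptions introduced here.

```python
from itertools import combinations


def tokens(sentence):
    """Lowercased word set; a deliberately crude, assumed tokenizer."""
    return set(sentence.lower().split())


def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def diversity(summary):
    """1 minus the mean pairwise similarity among summary sentences."""
    pairs = list(combinations(summary, 2))
    if not pairs:
        return 1.0
    mean_sim = sum(jaccard(tokens(s), tokens(t)) for s, t in pairs) / len(pairs)
    return 1.0 - mean_sim


def coverage(document, summary):
    """Fraction of the document's distinct words that the summary retains."""
    doc_words = set().union(*(tokens(s) for s in document))
    sum_words = set().union(*(tokens(s) for s in summary))
    return len(doc_words & sum_words) / len(doc_words) if doc_words else 0.0


def balance(summary, aspect_of, aspects):
    """Ratio of least- to most-represented aspect among summary sentences."""
    counts = {a: 0 for a in aspects}
    for s in summary:
        counts[aspect_of[s]] += 1
    most = max(counts.values())
    return min(counts.values()) / most if most else 0.0


# Toy document with two hypothetical aspects, "cats" and "dogs".
document = [
    "cats chase mice",
    "cats chase birds",
    "dogs guard houses",
    "dogs guard farms",
]
aspect_of = {
    document[0]: "cats",
    document[1]: "cats",
    document[2]: "dogs",
    document[3]: "dogs",
}
aspects = ["cats", "dogs"]

redundant = [document[0], document[1]]   # both sentences from one aspect
spread = [document[0], document[2]]      # one sentence per aspect

for name, summary in [("redundant", redundant), ("spread", spread)]:
    print(name,
          diversity(summary),
          coverage(document, summary),
          balance(summary, aspect_of, aspects))
```

On this toy input the extract that takes one sentence per aspect scores higher on all three measures than the extract that takes two near-duplicate sentences from the same aspect, which is the intuition behind treating the requirements jointly. A learned model such as the one described in the abstract would replace these hand-written scores with constraints learned from training data.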


