Update summarization based on novel topic distribution
Source
Document Engineering archive
Proceedings of the 9th ACM Symposium on Document Engineering
Munich, Germany
Session: Document and linguistics (II)
Pages: 205-213
Year of Publication: 2009
ISBN: 978-1-60558-575-8
Authors
Josef Steinberger, University of West Bohemia, Pilsen, Czech Republic
Karel Ježek, University of West Bohemia, Pilsen, Czech Republic
Sponsors
SIGDOC: ACM Special Interest Group for Design of Communications
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1600193.1600239

ABSTRACT

This paper presents our recent research in text summarization. The field has moved from multi-document summarization to update summarization. When producing an update summary of a set of topic-related documents, the summarizer assumes that the reader already has prior knowledge, determined by a set of older documents on the same topic. The update summarizer must therefore solve a novelty vs. redundancy problem. We describe the development of our summarizer, which is based on Iterative Residual Rescaling (IRR), a method that creates the latent semantic space of the set of documents under consideration. IRR generalizes Singular Value Decomposition (SVD) and makes it possible to control the influence of major and minor topics in the latent space. Our sentence-extractive summarization method computes the redundancy, novelty, and significance of each topic. These values are then used in the sentence selection process, which also prevents redundancy within the summary. The results of our participation in the TAC evaluation appear promising.
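
The statement that IRR generalizes SVD can be illustrated with a minimal sketch. It follows the general description of IRR (rescale the residual vectors by a power of their length, take the dominant direction, subtract it, and repeat) and is not the authors' implementation. The function name `irr_basis`, the parameter names `n_dims` and `q`, and the toy matrix are assumptions made for illustration, as is treating each matrix column as one text unit (for example a sentence).

```python
import numpy as np

def irr_basis(D, n_dims, q=0.0):
    """Sketch of Iterative Residual Rescaling (names are illustrative).

    D      : term-by-unit matrix, one column per text unit (assumption)
    n_dims : number of latent dimensions to extract
    q      : scaling exponent; q = 0 applies no rescaling
    Returns an (n_terms, n_dims) matrix whose columns are basis vectors.
    """
    R = np.asarray(D, dtype=float).copy()   # residual matrix
    basis = []
    for _ in range(n_dims):
        # Rescale each residual column by its own length raised to q.
        lengths = np.linalg.norm(R, axis=0)
        Rs = R * lengths ** q
        # The first left singular vector of Rs is the dominant
        # eigenvector of Rs @ Rs.T.
        U, _, _ = np.linalg.svd(Rs, full_matrices=False)
        b = U[:, 0]
        basis.append(b)
        # Remove the component along b from every residual column.
        R = R - np.outer(b, b @ R)
    return np.column_stack(basis)

# Toy term-by-unit matrix (made-up counts, for illustration only).
D = np.array([[2, 0, 1, 0],
              [1, 1, 0, 0],
              [0, 3, 1, 1],
              [0, 0, 1, 2],
              [1, 0, 0, 1]], dtype=float)

B_svd_like = irr_basis(D, n_dims=3, q=0.0)  # no rescaling
B_rescaled = irr_basis(D, n_dims=3, q=2.0)  # residual lengths weighted more
```

With `q = 0` the rescaling step leaves the residuals unchanged, so the extracted directions coincide, up to sign, with the leading left singular vectors of `D`. Larger values of `q` give more weight to columns that are still poorly represented, which is one way to read the abstract's remark about controlling the influence of major and minor topics.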


