ABSTRACT
This paper presents our recent research in text summarization. The field has moved from multi-document summarization to update summarization. When producing an update summary of a set of topic-related documents, the summarizer assumes that the reader already has prior knowledge, determined by a set of older documents on the same topic. The update summarizer must therefore solve a novelty vs. redundancy problem. We describe the development of our summarizer, which is based on Iterative Residual Rescaling (IRR), a method that creates the latent semantic space of the set of documents under consideration. IRR generalizes Singular Value Decomposition (SVD) and makes it possible to control the influence of major and minor topics in the latent space. Our sentence-extractive summarization method computes the redundancy, novelty, and significance of each topic. These values are then used in the sentence selection process, and the sentence selection component also prevents redundancy within the summary. The results of our participation in the TAC evaluation appear promising.
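To make the novelty vs. redundancy idea concrete, the following is a minimal sketch rather than the summarizer described above. It uses plain SVD as a stand-in for IRR, and the scoring formulas are assumptions chosen for illustration: redundancy is the largest absolute cosine between a new topic and any old topic, novelty is one minus redundancy, and significance is novelty weighted by the topic's singular value. The function names `latent_topics`, `topic_redundancy`, and `topic_significance` are hypothetical.

```python
import numpy as np


def latent_topics(term_sentence_matrix, k):
    """Top-k latent topics of a term-by-sentence matrix via plain SVD.

    Returns the first k left singular vectors (unit-length topics in
    term space) and their singular values. IRR is NOT implemented here.
    """
    U, s, _ = np.linalg.svd(term_sentence_matrix, full_matrices=False)
    return U[:, :k], s[:k]


def topic_redundancy(new_topics, old_topics):
    """Assumed redundancy score for each new topic, in [0, 1].

    Topic columns are unit vectors, so a dot product is a cosine.
    A new topic is scored by its closest match among the old topics.
    """
    cosines = np.abs(old_topics.T @ new_topics)
    return np.clip(cosines.max(axis=0), 0.0, 1.0)


def topic_significance(singular_values, redundancy):
    """Assumed significance: novelty (1 - redundancy) times topic strength."""
    novelty = 1.0 - redundancy
    return novelty * singular_values


# Toy data: 4 terms; older documents only talk about term 0.
old_docs = np.array([[2.0, 2.0],
                     [0.0, 0.0],
                     [0.0, 0.0],
                     [0.0, 0.0]])
# Newer documents repeat term 0 and introduce term 2.
new_docs = np.array([[3.0, 0.0],
                     [0.0, 0.0],
                     [0.0, 1.0],
                     [0.0, 0.0]])

old_U, _ = latent_topics(old_docs, k=1)
new_U, new_s = latent_topics(new_docs, k=2)
redundancy = topic_redundancy(new_U, old_U)
significance = topic_significance(new_s, redundancy)
```

In this toy case the stronger new topic merely repeats what the older documents cover, so it receives a significance near zero, while the weaker topic built on the previously unseen term keeps its full weight. A sentence selector could then favor sentences that load on high-significance topics.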