Cross-document summarization by concept classification
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 25th annual international ACM SIGIR conference on Research and development in information retrieval
Tampere, Finland
SESSION: Summarization
Pages: 121 - 128
Year of Publication: 2002
ISBN: 1-58113-561-0
Authors
Hilda Hardy, NLIP Laboratory, University at Albany, Albany, NY
Nobuyuki Shimizu, NLIP Laboratory, University at Albany, Albany, NY
Tomek Strzalkowski, NLIP Laboratory, University at Albany, Albany, NY
Liu Ting, NLIP Laboratory, University at Albany, Albany, NY
Xinyang Zhang, NLIP Laboratory, University at Albany, Albany, NY
G. Bowden Wise, GE Global Research Center, Niskayuna, NY
ABSTRACT
In this paper we describe a cross-document summarizer, XDoX, designed specifically to summarize large document sets (50-500 documents and more). Such sets of documents are typically obtained from routing or filtering systems run against a continuous stream of data, such as a newswire. XDoX works by identifying the most salient themes within the set (at a level of granularity regulated by the user) and composing an extraction summary that reflects these main themes. In the current version, XDoX is not optimized to produce a summary based on a few unrelated documents; indeed, such summaries are best obtained simply by concatenating summaries of individual documents. We show examples of summaries obtained in our tests as well as from our participation in the first Document Understanding Conference (DUC).
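The abstract gives only the outline of the approach: find the salient themes in a document set, at a user-controlled granularity, then extract text that reflects them. The sketch below illustrates that general idea and is not the XDoX algorithm, whose details are not given here. It assumes a "theme" is simply a term that occurs in many documents, and that `granularity` is the number of themes to keep; the function names, the stopword list, and the scoring rule are all hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not taken from the paper).
STOPWORDS = frozenset({"the", "a", "an", "of", "in", "on", "and", "to",
                       "is", "was", "for", "by", "at", "after"})


def tokenize(text):
    """Lowercase the text and return its alphabetic tokens minus stopwords."""
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def salient_themes(documents, granularity):
    """Return the `granularity` terms that occur in the most documents."""
    doc_freq = Counter()
    for doc in documents:
        terms = set()
        for sentence in doc:
            terms.update(tokenize(sentence))
        doc_freq.update(terms)  # each document counts a term at most once
    # Highest document frequency first; ties broken alphabetically.
    ranked = sorted(doc_freq.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:granularity]]


def extract_summary(documents, granularity=3):
    """Pick, for each theme, the unused sentence covering the most themes."""
    themes = salient_themes(documents, granularity)
    theme_set = set(themes)
    summary = []
    for theme in themes:
        best, best_score = None, 0
        for doc in documents:
            for sentence in doc:
                tokens = set(tokenize(sentence))
                if theme not in tokens or sentence in summary:
                    continue
                score = len(tokens & theme_set)
                if score > best_score:  # strict: the earliest sentence wins ties
                    best, best_score = sentence, score
        if best is not None:
            summary.append(best)
    return themes, summary
```

Each document is passed as a list of sentences, so a call looks like `extract_summary([["First sentence.", "Second sentence."], ["Another document."]], granularity=2)`. Raising `granularity` yields more themes and therefore a longer extract.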
CITED BY 9
Sharon Small, Tomek Strzalkowski, Ting Liu, Sean Ryan, Robert Salkin, Nobuyuki Shimizu, Paul Kantor, Diane Kelly, Robert Rittman, Nina Wacholder, HITIQA: towards analytical question answering, Proceedings of the 20th international conference on Computational Linguistics, p.1291-es, August 23-27, 2004, Geneva, Switzerland
Liangda Li, Ke Zhou, Gui-Rong Xue, Hongyuan Zha, Yong Yu, Enhancing diversity, coverage and balance for summarization through structure learning, Proceedings of the 18th international conference on World wide web, April 20-24, 2009, Madrid, Spain