Automatic evaluation of summaries using N-gram co-occurrence statistics
Source: North American Chapter Of The Association For Computational Linguistics
Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology - Volume 1
Edmonton, Canada
Pages: 71 - 78  
Year of Publication: 2003
Authors
Chin-Yew Lin  University of Southern California, Marina del Rey, CA
Eduard Hovy  University of Southern California, Marina del Rey, CA
Publisher
Association for Computational Linguistics  Morristown, NJ, USA
DOI Bookmark: 10.3115/1073445.1073465

ABSTRACT

Following the recent adoption by the machine translation community of automatic evaluation using the BLEU/NIST scoring process, we conduct an in-depth study of a similar idea for evaluating summaries. The results show that automatic evaluation using unigram co-occurrences between summary pairs correlates surprisingly well with human evaluations, based on various statistical metrics, while direct application of the BLEU evaluation procedure does not always give good results.
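
The abstract does not state the scoring formula, so the following is only an illustrative sketch of what an n-gram co-occurrence score between a summary pair might look like. It assumes whitespace tokenization, lowercasing, clipped counts, and normalization by the reference length (a recall-style score); the function names `ngrams` and `ngram_cooccurrence_recall` are hypothetical and not taken from the paper.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token sequence."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_cooccurrence_recall(candidate, reference, n=1):
    """Fraction of reference n-grams that also occur in the candidate.

    Assumption: counts are clipped, so an n-gram repeated in the
    candidate is credited at most as often as it appears in the
    reference.
    """
    cand_counts = Counter(ngrams(candidate.lower().split(), n))
    ref_counts = Counter(ngrams(reference.lower().split(), n))
    total = sum(ref_counts.values())
    if total == 0:
        return 0.0
    overlap = sum(min(count, cand_counts[gram])
                  for gram, count in ref_counts.items())
    return overlap / total


if __name__ == "__main__":
    candidate = "the cat sat on the mat"
    reference = "the cat lay on the mat"
    print(ngram_cooccurrence_recall(candidate, reference, n=1))
    print(ngram_cooccurrence_recall(candidate, reference, n=2))
```

With `n=1` this corresponds to the unigram co-occurrence setting the abstract highlights; larger `n` rewards longer matching word sequences, in the spirit of BLEU-style n-gram matching.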


