ABSTRACT
Hierarchies have long been used for organization, summarization, and access to information. In this paper we define summarization in terms of a probabilistic language model and use that definition to explore a new technique for automatically generating topic hierarchies. The technique applies a graph-theoretic algorithm that approximates a solution to the Dominating Set Problem and efficiently chooses terms according to a language model. We compare the new technique to previous methods proposed for constructing topic hierarchies, including subsumption and lexical hierarchies, as well as to the top TF.IDF terms. Our results show that the new technique consistently performs as well as or better than these other techniques. They also show the usefulness of hierarchies compared with a list of terms.
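The abstract does not spell out the algorithm, so the following is only a minimal sketch of the standard greedy heuristic for approximating a dominating set, not necessarily the authors' method. The graph `term_graph`, its terms, and the function name `greedy_dominating_set` are hypothetical, chosen for illustration: vertices stand for terms, and an edge links two terms assumed to be related.

```python
def greedy_dominating_set(graph):
    """Greedy approximation of a dominating set.

    graph: dict mapping each vertex to the set of its neighbours
    (assumed undirected, so edges appear in both directions).
    """
    uncovered = set(graph)
    chosen = []
    while uncovered:
        # Pick the vertex that covers the most still-uncovered vertices,
        # counting itself and its neighbours. Iterating in sorted order
        # makes tie-breaking deterministic.
        best = max(sorted(graph),
                   key=lambda v: len(({v} | graph[v]) & uncovered))
        chosen.append(best)
        uncovered -= {best} | graph[best]
    return chosen


# Hypothetical term graph (illustrative data, not from the paper).
term_graph = {
    "health": {"medicine", "disease", "hospital"},
    "medicine": {"health", "drug"},
    "disease": {"health"},
    "hospital": {"health"},
    "drug": {"medicine"},
}

topic_terms = greedy_dominating_set(term_graph)
print(topic_terms)
```

In this toy graph the first term picked is the one connected to the most other terms, which matches the intuition of a topic term that "covers" much of the vocabulary. A language-model-based variant would presumably weight vertices or edges by probabilities rather than counting neighbours, but that detail is an assumption here.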