TSCAN: a novel method for topic summarization and content anatomy
Source
Annual ACM Conference on Research and Development in Information Retrieval
Proceedings of the 31st Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Singapore, Singapore
SESSION: Content analysis
Pages 579-586  
Year of Publication: 2008
ISBN:978-1-60558-164-4
Authors
Chien Chin Chen  National Taiwan University, Taipei, Taiwan ROC
Meng Chang Chen  Academia Sinica, Taipei, Taiwan ROC
Sponsors
ACM: Association for Computing Machinery
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1390334.1390433

ABSTRACT

A topic is defined as a seminal event or activity along with all directly related events and activities. It is represented as a chronological sequence of documents published on the Internet by different authors. In this paper, we define a task called topic anatomy, which summarizes the core parts of a topic and associates them graphically so that readers can easily understand the content. The proposed topic anatomy model, called TSCAN, derives the major themes of a topic from the eigenvectors of a temporal block association matrix. The significant events of each theme and their summaries are then extracted by examining the composition of the eigenvectors. Finally, the extracted events are associated through their temporal closeness and context similarity to form the evolution graph of the topic. Experiments on the official TDT4 corpus demonstrate that the generated evolution graphs comprehensibly describe the storylines of topics. Moreover, when evaluated against human-composed reference summaries, the produced summaries are superior to those of other summarization methods in content coverage and consistency.
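
The three steps named in the abstract (themes from eigenvectors, events from the eigenvector entries, links from temporal closeness and context similarity) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how the association matrix is built or how events are delimited, so the inner-product association matrix, the fixed magnitude threshold, the cosine similarity measure, and all function and parameter names below are assumptions chosen for illustration only.

```python
import numpy as np

def themes_from_blocks(B, k):
    """B is a terms x blocks matrix whose columns are text blocks in
    chronological order. Returns the k largest eigenvalues and the matching
    eigenvectors of the block association matrix (assumed here to be B^T B)."""
    A = B.T @ B                          # blocks x blocks, symmetric
    vals, vecs = np.linalg.eigh(A)       # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:k]   # indices of the k largest
    return vals[order], vecs[:, order]

def events_in_theme(v, threshold):
    """Assumed event rule: an event is a maximal run of consecutive blocks
    whose eigenvector entries have magnitude >= threshold.
    Returns inclusive (start, end) block indices."""
    events, start = [], None
    for i, x in enumerate(np.abs(v)):
        if x >= threshold and start is None:
            start = i
        elif x < threshold and start is not None:
            events.append((start, i - 1))
            start = None
    if start is not None:
        events.append((start, len(v) - 1))
    return events

def link_events(events, B, max_gap, min_sim):
    """Assumed linking rule: add a directed edge (i, j) when event j starts
    after event i ends, within max_gap blocks, and the cosine similarity of
    their summed term vectors is at least min_sim."""
    ctx = [B[:, s:e + 1].sum(axis=1) for s, e in events]
    edges = []
    for i, (_, end_i) in enumerate(events):
        for j, (start_j, _) in enumerate(events):
            if start_j <= end_i:
                continue                 # j must come strictly after i
            denom = np.linalg.norm(ctx[i]) * np.linalg.norm(ctx[j])
            if denom == 0:
                continue                 # empty context, similarity undefined
            cos = float(ctx[i] @ ctx[j]) / denom
            if start_j - end_i <= max_gap and cos >= min_sim:
                edges.append((i, j))
    return edges
```

Because eigenvectors are defined only up to sign, the sketch compares magnitudes rather than signed entries when locating events. The resulting edge list is one possible encoding of an evolution graph; the thresholds `max_gap` and `min_sim` are hypothetical tuning parameters.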


