Correlating summarization of multi-source news with k-way graph bi-clustering
Source: ACM SIGKDD Explorations Newsletter
Volume 6, Issue 2 (December 2004)
Pages: 34-42
Year of Publication: 2004
ISSN:1931-0145
Authors
Ya Zhang  The Pennsylvania State University, PA
Chao-Hsien Chu  The Pennsylvania State University, PA
Xiang Ji  NEC Laboratories America, Cupertino, CA
Hongyuan Zha  The Pennsylvania State University, PA
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1046456.1046461

ABSTRACT

With the emergence of an enormous amount of online news, it is desirable to construct text mining methods that can extract, compare and highlight the similarities among news articles. In this paper, we explore the research issue and methodology of correlated summarization for a pair of news articles. The algorithm aligns the (sub)topics of the two news articles and summarizes their correlation by sentence extraction. A pair of news articles is modelled as a weighted bipartite graph. A mutual reinforcement principle is applied to identify a dense subgraph of the weighted bipartite graph. Sentences corresponding to the subgraph are well correlated in textual content and convey the dominant shared topic of the pair of news articles. As a further enhancement for lengthy articles, a k-way bi-clustering algorithm can first be used to partition the bipartite graph into several clusters, each containing sentences from both news reports. These clusters correspond to shared subtopics, and the mutual reinforcement principle can then be applied to extract topic sentences within each subtopic group.
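
The bipartite-graph model and the mutual reinforcement step can be illustrated with a minimal sketch. The abstract does not specify how edge weights are computed or how the reinforcement is iterated, so everything here is an assumption made for illustration: edge weights are taken to be cosine similarities between bag-of-words sentence vectors, and mutual reinforcement is taken to be an alternating (power-iteration style) update in which a sentence scores highly when it is strongly linked to high-scoring sentences in the other article. The function names build_weights and mutual_reinforcement are hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    # Lowercase alphabetic tokens only; no stemming or stop-word removal (assumption).
    return re.findall(r"[a-z]+", sentence.lower())


def cosine(a, b):
    # Cosine similarity between two term-frequency Counters.
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def build_weights(sents_a, sents_b):
    # W[i][j] is the weight of the edge between sentence i of article A
    # and sentence j of article B (assumed here to be cosine similarity).
    va = [Counter(tokenize(s)) for s in sents_a]
    vb = [Counter(tokenize(s)) for s in sents_b]
    return [[cosine(x, y) for y in vb] for x in va]


def normalize(x):
    # Scale a vector to unit Euclidean length; leave an all-zero vector unchanged.
    norm = math.sqrt(sum(c * c for c in x))
    return [c / norm for c in x] if norm else x


def mutual_reinforcement(W, iters=100, tol=1e-9):
    # Alternate u <- W v and v <- W^T u with normalization until the scores
    # stop changing. High scores in u and v mark sentences that belong to a
    # densely connected part of the bipartite graph.
    m, n = len(W), len(W[0])
    u, v = [1.0] * m, [1.0] * n
    for _ in range(iters):
        new_u = normalize([sum(W[i][j] * v[j] for j in range(n)) for i in range(m)])
        new_v = normalize([sum(W[i][j] * new_u[i] for i in range(m)) for j in range(n)])
        delta = max(
            max(abs(a - b) for a, b in zip(u, new_u)),
            max(abs(a - b) for a, b in zip(v, new_v)),
        )
        u, v = new_u, new_v
        if delta < tol:
            break
    return u, v


if __name__ == "__main__":
    article_a = [
        "The central bank raised interest rates on Tuesday.",
        "Local football team wins the cup.",
    ]
    article_b = [
        "Interest rates were raised by the central bank.",
        "Weather will be sunny tomorrow.",
    ]
    W = build_weights(article_a, article_b)
    u, v = mutual_reinforcement(W)
    # Print the highest-scoring sentence from each article.
    print(article_a[u.index(max(u))])
    print(article_b[v.index(max(v))])
```

In this sketch the top-scoring sentence from each article would serve as the extracted correlated summary. For lengthy articles, the same scoring could be run separately on each cluster produced by a bi-clustering step, which is not shown here.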


