ACM Home Page
Please provide us with feedback. Feedback
On effective presentation of graph patterns: a structural representative approach
Full text PdfPdf (289 KB)
Source
Conference on Information and Knowledge Management archive
Proceeding of the 17th ACM conference on Information and knowledge management table of contents
Napa Valley, California, USA
SESSION: KM: link and graph mining table of contents
Pages 299-308  
Year of Publication: 2008
ISBN:978-1-59593-991-3
Authors
Chen Chen  University of Illinois, Urbana, IL, USA
Cindy Xide Lin  University of Illinois, Urbana, IL, USA
Xifeng Yan  IBM T. J. Watson Research Center, Hawthorne, NY, USA
Jiawei Han  University of Illinois, Urbana, IL, USA
Sponsors
ACM: Association for Computing Machinery
SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web
SIGIR: ACM Special Interest Group on Information Retrieval
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 14,   Downloads (12 Months): 165,   Citation Count: 0
Additional Information:

abstract   references   index terms   collaborative colleagues  

Tools and Actions: Request Permissions Request Permissions    Review this Article  
DOI Bookmark: Use this link to bookmark this Article: http://doi.acm.org/10.1145/1458082.1458124
What is a DOI?

ABSTRACT

In the past, quite a few fast algorithms have been developed to mine frequent patterns over graph data, with the large spectrum covering many variants of the problem. However, the real bottleneck for knowledge discovery on graphs is neither efficiency nor scalability, but the usability of patterns that are mined out. Currently, what the state-of-art techniques give is a lengthy list of exact patterns, which are undesirable in the following two aspects: (1) on the micro side, due to various inherent noises or data diversity, exact patterns are usually not too useful in many real applications; and (2) on the macro side, the rigid structural requirement being posed often generates an excessive amount of patterns that are only slightly different from each other, which easily overwhelm the users.

In this paper, we study the presentation problem of graph patterns, where structural representatives are deemed as the key mechanism to make the whole strategy effective. As a solution to fill the usability gap, we adopt a two-step smoothing-clustering framework, with the first step adding error tolerance to individual patterns (the micro side), and the second step reducing output cardinality by collapsing multiple structurally similar patterns into one representative (the macro side). This novel, integrative approach is never tried in previous studies, which essentially rolls-up our attention to a more appropriate level that no longer looks into every minute detail. The above framework is general, which may apply under various settings and incorporate a lot of extensions. Empirical studies indicate that a compact group of informative delegates can be achieved on real datasets and the proposed algorithms are both efficient and scalable.


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

 
1
2
 
3
 
4
 
5
 
6
 
7
 
8
9
 
10
L. Kaufman and P. J. Rousseeuw, editors. Finding Groups in Data: an Introduction to Cluster Analysis. John Wiley and Sons, 1990.
 
11
 
12
M. Kuramochi and G. Karypis. Finding frequent patterns in a large sparse graph. In SDM, 2004.
 
13
Y. Liu, J. Li, and H. Gao. Summarizing graph patterns. In ICDE, pages 903--912, 2008.
 
14
T. Mielikäinen and H. Mannila. The pattern ordering problem. In PKDD, pages 327--338, 2003.
 
15
16
 
17
 
18
R. Sharan, S. Suthram, R. M. Kelley, T. Kuhn, S. McCuine, P. Uetz, T. Sittler, R. M. Karp, and T. Ideker. Conserved patterns of protein interaction in multiple species. PNAS, 102(6):1974--1979, 2005.
19
 
20
21
22
23

Collaborative Colleagues:
Chen Chen: colleagues
Cindy Xide Lin: colleagues
Xifeng Yan: colleagues
Jiawei Han: colleagues