|
ABSTRACT
Recent research on pattern discovery has progressed form mining frequent itemsets and sequences to mining structured patterns including trees, lattices, and graphs. As a general data structure, graph can model complicated relations among data with wide applications in bioinformatics, Web exploration, and etc. However, mining large graph patterns in challenging due to the presence of an exponential number of frequent subgraphs. Instead of mining all the subgraphs, we propose to mine closed frequent graph patterns. A graph g is closed in a database if there exists no proper supergraph of g that has the same support as g. A closed graph pattern mining algorithm, CloseGraph, is developed by exploring several interesting pruning methods. Our performance study shows that CloseGraph not only dramatically reduces unnecessary subgraphs to be generated but also substantially increases the efficiency of mining, especially in the presence of large graph patterns.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
T. Asai, K. Abe, S. Kawasoe, H. Arimura, H. Satamoto, and S. Arikawa. Efficient substructure discovery from large semi-structured data. In SDM'02.
|
| |
3
|
|
| |
4
|
|
| |
5
|
|
| |
6
|
L. Dehaspe, H. Toivonen, and R. King. Finding frequent substructures in chemical compounds. In KDD'98.
|
| |
7
|
I. Eidhammer, I. Jonassen, and W. R. Taylor Structure comparison and structure patterns. In J. of Comp. Bio. 7, 2000.
|
 |
8
|
Jiawei Han , Jian Pei , Yiwen Yin, Mining frequent patterns without candidate generation, Proceedings of the 2000 ACM SIGMOD international conference on Management of data, p.1-12, May 15-18, 2000, Dallas, Texas, United States
|
| |
9
|
L. B. Holder, D. J. Cook, and S. Djoko. Substructure discovery in the subdue system. In KDD'94.
|
| |
10
|
|
| |
11
|
|
| |
12
|
|
| |
13
|
|
| |
14
|
J. Pei, J. Han, and R. Mao. CLOSET: An efficient algorithm for mining frequent closed itemsets. In DMKD'00.
|
| |
15
|
|
| |
16
|
X. Yan and J. Han. gSpan: Graph-based substructure pattern mining. UIUC-CS Tech. Report: R-2002-2296, (a 4-page short version in ICDM'02).
|
| |
17
|
X. Yan, J. Han, and R. Afshar. CloSpan: Mining closed sequential patterns in large datasets. In SDM'03.
|
| |
18
|
M. J. Zaki and C. J. Hsiao. CHARM: An efficient algorithm for closed itemset mining. In SDM'02.
|
 |
19
|
|
CITED BY 62
|
|
|
|
|
Jun Huan , Wei Wang , Jan Prins , Jiong Yang, SPIN: mining maximal frequent subgraphs from graph databases, Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, August 22-25, 2004, Seattle, WA, USA
|
|
|
Chen Wang , Wei Wang , Jian Pei , Yongtai Zhu , Baile Shi, Scalable mining of large disk-based graph databases, Proceedings of the tenth ACM SIGKDD international conference on Knowledge discovery and data mining, August 22-25, 2004, Seattle, WA, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jian Pei , Jiawei Han , Behzad Mortazavi-Asl , Jianyong Wang , Helen Pinto , Qiming Chen , Umeshwar Dayal , Mei-Chun Hsu, Mining Sequential Patterns by Pattern-Growth: The PrefixSpan Approach, IEEE Transactions on Knowledge and Data Engineering, v.16 n.11, p.1424-1440, November 2004
|
|
|
|
|
|
|
|
|
Wei Wang , Chen Wang , Yongtai Zhu , Baile Shi , Jian Pei , Xifeng Yan , Jiawei Han, GraphMiner: a structural pattern-mining system for large disk-based graph databases and its applications, Proceedings of the 2005 ACM SIGMOD international conference on Management of data, June 14-16, 2005, Baltimore, Maryland
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Chao Liu , Chen Chen , Jiawei Han , Philip S. Yu, GPLAG: detection of software plagiarism by program dependence graph analysis, Proceedings of the 12th ACM SIGKDD international conference on Knowledge discovery and data mining, August 20-23, 2006, Philadelphia, PA, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Nancy P. Lin , Wei-Hua Hao , Hung-Jen Chen , Hao-En Chueh , Chung-I Chang, Fast mining maximal sequential patterns, Proceedings of the 7th WSEAS International Conference on Simulation, Modelling and Optimization, p.405-408, September 15-17, 2007, Beijing, China
|
|
|
A. Dreweke , M. Worlein , I. Fischer , D. Schell , Th. Meinl , M. Philippsen, Graph-Based Procedural Abstraction, Proceedings of the International Symposium on Code Generation and Optimization, p.259-270, March 11-14, 2007
|
|
|
|
|
|
Chen Chen , Cindy Xide Lin , Xifeng Yan , Jiawei Han, On effective presentation of graph patterns: a structural representative approach, Proceeding of the 17th ACM conference on Information and knowledge management, October 26-30, 2008, Napa Valley, California, USA
|
|
|
|
|
|
|
|
|
Bingjun Sun , Prasenjit Mitra , C. Lee Giles, Mining, indexing, and searching for textual chemical molecule information on the web, Proceeding of the 17th international conference on World Wide Web, April 21-25, 2008, Beijing, China
|
|
|
Chen Chen , Xifeng Yan , Philip S. Yu , Jiawei Han , Dong-Qing Zhang , Xiaohui Gu, Towards graph containment search and indexing, Proceedings of the 33rd international conference on Very large data bases, September 23-27, 2007, Vienna, Austria
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Jian Pei , Haixun Wang , Jian Liu , Ke Wang , Jianyong Wang , Philip S. Yu, Discovering Frequent Closed Partial Orders from Strings, IEEE Transactions on Knowledge and Data Engineering, v.18 n.11, p.1467-1481, November 2006
|
|
|
Jian Pei , Haixun Wang , Jian Liu , Ke Wang , Jianyong Wang , Philip S. Yu, Discovering Frequent Closed Partial Orders from Strings, IEEE Transactions on Knowledge and Data Engineering, v.18 n.11, p.1467-1481, November 2006
|
|
|
Wei Fan , Kun Zhang , Hong Cheng , Jing Gao , Xifeng Yan , Jiawei Han , Philip Yu , Olivier Verscheure, Direct mining of discriminative and essential frequent patterns via model-based search tree, Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|