|
ABSTRACT
With applications in biology, the world-wide web, and several other areas, mining of graph-structured objects has received significant interest recently. One of the major research directions in this field is concerned with predictive data mining in graph databases where each instance is represented by a graph. Some of the proposed approaches for this task rely on the excellent classification performance of support vector machines. To control the computational cost of these approaches, the underlying kernel functions are based on frequent patterns. In contrast to these approaches, we propose a kernel function based on a natural set of cyclic and tree patterns independent of their frequency, and discuss its computational aspects. To practically demonstrate the effectiveness of our approach, we use the popular NCI-HIV molecule dataset. Our experimental results show that cyclic pattern kernels can be computed quickly and offer predictive performance superior to recent graph kernels based on frequent patterns.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
Rakesh Agrawal , Hiekki Mannila , Ramakrishnan Srikant , Hannu Toivonen , A. Inkeri Verkamo, Fast discovery of association rules, Advances in knowledge discovery and data mining, American Association for Artificial Intelligence, Menlo Park, CA, 1996
|
| |
2
|
|
| |
3
|
Asai, Arimura, Uno, and Nakano. Discovering frequent substructures in large unordered trees. In Proc. of the 6th International Conference on Discovery Science, volume 2843 of LNAI, pages 47--61. Springer Verlag, 2003.
|
| |
4
|
|
 |
5
|
Bernhard E. Boser , Isabelle M. Guyon , Vladimir N. Vapnik, A training algorithm for optimal margin classifiers, Proceedings of the fifth annual workshop on Computational learning theory, p.144-152, July 27-29, 1992, Pittsburgh, Pennsylvania, United States
[doi> 10.1145/130385.130401]
|
| |
6
|
M. Collins and N. Duffy. Convolution kernels for natural language. In T. G. Dietterich, S. Becker, and Z. Ghahramani, editors, Advances in Neural Information Processing Systems, volume 14. MIT Press, 2002.
|
| |
7
|
C. Cortes, P. Haffner, and M. Mohri. Positive definite rational kernels. In Learning Theory and Kernel Machines, 16th Annual Conference on Learning Theory and 7th Kernel Workshop, Proceedings. volume 2843 of LNAI, pages 41--56. Springer Verlag, 2003.
|
| |
8
|
M. Deshpande, M. Kuramochi, and G. Karypis. Automated approaches for classifying structures. In Proc. of the 2nd ACM SIGKDD Workshop on Data Mining in Bioinformatics, 2002.
|
| |
9
|
|
| |
10
|
R. Diestel. Graph theory. 2nd edition, Springer Verlag, 2000.
|
| |
11
|
H.-D. Ebbinghaus and J. Flum. Finite Model Theory. Perspectives in Mathematical Logic. 2nd edition, Springer Verlag, 1999.
|
| |
12
|
|
| |
13
|
T. Gärtner. Exponential and geometric kernels for graphs. In NIPS Workshop on Unreal Data: Principles of Modeling Nonvectorial Data, 2002.
|
| |
14
|
T. Gärtner, K. Driessens, and J. Ramon. Graph kernels and gaussian processes for relational reinforcement learning. In Proc. of the 13th International Conference on Inductive Logic Programming, volume 2835 of LNAI, pages 146--163. Springer Verlag, 2003.
|
| |
15
|
T. Gärtner, P. A. Flach, and S. Wrobel. On graph kernels: Hardness results and efficient alternatives. In Learning Theory and Kernel Machines, 16th Annual Conference on Learning Theory and 7th Kernel Workshop, Proceedings. volume 2843 of LNAI, pages 129--143. Springer Verlag, 2003.
|
| |
16
|
T. Graepel. PAC-Bayesian Pattern Classification with Kernels. PhD thesis, TU Berlin, 2002.
|
| |
17
|
D. Haussler. Convolution kernels on discrete structures. Technical report, Department of Computer Science, University of California at Santa Cruz, 1999.
|
| |
18
|
|
| |
19
|
H. Kashima, K. Tsuda, and A. Inokuchi. Marginalized kernels between labeled graphs. In Proc. of the 20th International Conference on Machine Learning, pages 321--328. AAAI Press, 2003.
|
 |
20
|
|
 |
21
|
Ravi Kumar , Prabhakar Raghavan , Sridhar Rajagopalan , D. Sivakumar , Andrew Tompkins , Eli Upfal, The Web as a graph, Proceedings of the nineteenth ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, p.1-10, May 15-18, 2000, Dallas, Texas, United States
[doi> 10.1145/335168.335170]
|
| |
22
|
|
| |
23
|
C. Leslie and R. Kuang. Fast kernels for inexact string matching. In Learning Theory and Kernel Machines, 16th Annual Conference on Learning Theory and 7th Kernel Workshop, Proceedings. volume 2843 of LNAI, pages 114--128. Springer Verlag, 2003.
|
| |
24
|
H. Lodhi, J. Shawe-Taylor, N. Christianini, and C. Watkins. Text classification using string kernels. In T. Leen, T. Dietterich, and V. Tresp, editors, Advances in Neural Information Processing Systems, volume 13. MIT Press, 2001.
|
| |
25
|
|
| |
26
|
|
| |
27
|
R. C. Read and R. E. Tarjan. Bounds on backtrack algorithms for listing cycles, paths, and spanning trees. Networks, 5(3):237--252, 1975.
|
| |
28
|
B. Schölkopf and A. J. Smola. Learning with Kernels. MIT Press, 2002.
|
| |
29
|
R. Tarjan. Depth-first search and linear graph algorithms. SIAM Journal on Computing, 1(2):146--160, 1972.
|
| |
30
|
L. G. Valiant. The complexity of enumeration and reliability problems. SIAM Journal on Computing, 8(3):410--421, 1979.
|
| |
31
|
|
| |
32
|
P. Vismara. Union of all the minimum cycle bases of a graph. The Electronic Journal of Combinatorics, 4(1):73--87, 1997.
|
| |
33
|
C. Watkins. Kernels from matching operations. Technical report, Department of Computer Science, Royal Holloway, University of London, 1999.
|
 |
34
|
|
CITED BY 12
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
Wei Fan , Kun Zhang , Hong Cheng , Jing Gao , Xifeng Yan , Jiawei Han , Philip Yu , Olivier Verscheure, Direct mining of discriminative and essential frequent patterns via model-based search tree, Proceeding of the 14th ACM SIGKDD international conference on Knowledge discovery and data mining, August 24-27, 2008, Las Vegas, Nevada, USA
|
|
|
|
|
|
|
|
|
Luh Yen , Francois Fouss , Christine Decaestecker , Pascal Francq , Marco Saerens, Graph nodes clustering with the sigmoid commute-time kernel: A comparative study, Data & Knowledge Engineering, v.68 n.3, p.338-361, March, 2009
|
|