ABSTRACT
We introduce a family of kernels on discrete data structures within the general class of decomposition kernels. A weighted decomposition kernel (WDK) is computed by dividing objects into substructures indexed by a selector. Two substructures match if their selectors satisfy an equality predicate, and each match is weighted by a probability kernel on local distributions fitted on the substructures. Under reasonable assumptions, a WDK can be computed efficiently and can avoid combinatorial explosion of the feature space. We report experimental evidence that the proposed kernel is highly competitive with more complex state-of-the-art methods on a set of problems in bioinformatics.
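To make the matching-and-weighting idea concrete, here is a minimal sketch for plain sequences. It is an illustration under stated assumptions, not the implementation evaluated in the paper: the selector is taken to be the single symbol at a position, the local distribution is the normalized symbol histogram in a window around that position, and histogram intersection stands in for the probability kernel. The names `local_histogram`, `histogram_intersection`, `wdk`, and the `radius` parameter are all hypothetical.

```python
from collections import Counter


def local_histogram(seq, i, radius):
    """Normalized symbol frequencies in a window of +/- radius around position i."""
    window = seq[max(0, i - radius): i + radius + 1]
    counts = Counter(window)
    total = len(window)
    return {sym: c / total for sym, c in counts.items()}


def histogram_intersection(p, q):
    """Assumed probability kernel: sum of bin-wise minima of two histograms."""
    return sum(min(p[s], q[s]) for s in p.keys() & q.keys())


def wdk(x, y, radius=2):
    """Sketch of a weighted decomposition kernel between sequences x and y.

    Selector: the symbol at a position (exact-match predicate).
    Weight: histogram intersection of the local contexts (an assumption).
    """
    hx = [local_histogram(x, i, radius) for i in range(len(x))]
    hy = [local_histogram(y, j, radius) for j in range(len(y))]
    total = 0.0
    for i, a in enumerate(x):
        for j, b in enumerate(y):
            if a == b:  # selectors satisfy the equality predicate
                total += histogram_intersection(hx[i], hy[j])
    return total
```

Because only pairs with equal selectors contribute, the double loop never enumerates an explicit feature space; the cost is quadratic in the sequence lengths times the cost of comparing two small histograms.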