ABSTRACT
This paper introduces a novel machine learning model called multiple instance ranking (MIRank) that enables ranking to be performed in a multiple instance learning setting. The motivation for MIRank stems from the hydrogen abstraction problem in computational chemistry: predicting the group of hydrogen atoms from which a hydrogen is abstracted (removed) during metabolism. The model predicts the preferred hydrogen group within a molecule by ranking the groups, even though it is not known which hydrogen atom within the preferred group is actually abstracted. This paper formulates MIRank in its general context and proposes an algorithm for solving MIRank problems using successive linear programming. The method outperforms multiple instance classification models on several real and synthetic datasets.