Molecular feature mining in HIV data
Source: International Conference on Knowledge Discovery and Data Mining
Proceedings of the seventh ACM SIGKDD international conference on Knowledge discovery and data mining
San Francisco, California
Pages: 136 - 143  
Year of Publication: 2001
ISBN:1-58113-391-X
Authors
Stefan Kramer  Institute for Computer Science, Machine Learning Lab, Albert-Ludwigs-University Freiburg, Georges-Köhler-Allee, Geb. 79, D-79110 Freiburg/Br., Germany
Luc De Raedt  Institute for Computer Science, Machine Learning Lab, Albert-Ludwigs-University Freiburg, Georges-Köhler-Allee, Geb. 79, D-79110 Freiburg/Br., Germany
Christoph Helma  Institute for Computer Science, Machine Learning Lab, Albert-Ludwigs-University Freiburg, Georges-Köhler-Allee, Geb. 79, D-79110 Freiburg/Br., Germany
Sponsors
SIGMOD: ACM Special Interest Group on Management of Data
AAAI : American Association for Artificial Intelligence
Publisher
ACM  New York, NY, USA
Bibliometrics
Downloads (6 Weeks): 12,   Downloads (12 Months): 64,   Citation Count: 31
DOI: http://doi.acm.org/10.1145/502512.502533

ABSTRACT

We present the application of feature mining techniques to the Developmental Therapeutics Program's AIDS antiviral screen database. The database consists of 43576 compounds, which were measured for their capability to protect human cells from HIV-1 infection. According to these measurements, the compounds were classified as either active, moderately active or inactive. The distribution of classes is extremely skewed: only 1.3 % of the molecules are known to be active, and 2.7 % are known to be moderately active.

Given this database, we were interested in molecular substructures (i.e., features) that are frequent in the active molecules and infrequent in the inactives. In data mining terms, we focused on features with a minimum support in active compounds and a maximum support in inactive compounds. We analyzed the database using the levelwise version space algorithm that forms the basis of the inductive query and database system MOLFEA (Molecular Feature Miner). Within this framework, it is possible to declaratively specify the features of interest through constraints on their frequency in (possibly different) datasets as well as on their generality and syntax. Assuming that the detected substructures are causally related to biochemical mechanisms, it should be possible to facilitate the development of new pharmaceuticals with improved activities.
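The pair of frequency constraints described in the abstract (a minimum support in the actives, a maximum support in the inactives) can be illustrated with a small sketch. This is not MOLFEA and not the levelwise version space algorithm: it is a brute-force filter over contiguous substrings of toy SMILES-like strings, and the function names, the thresholds, and the data are all assumptions made for illustration.

```python
from typing import Iterable, List, Set


def fragments(s: str, max_len: int) -> Set[str]:
    """All contiguous substrings of s with length 1..max_len."""
    return {s[i:i + k]
            for k in range(1, max_len + 1)
            for i in range(len(s) - k + 1)}


def support(fragment: str, molecules: Iterable[str]) -> int:
    """Number of molecule strings that contain the fragment."""
    return sum(fragment in m for m in molecules)


def mine(actives: List[str], inactives: List[str],
         min_active: int, max_inactive: int, max_len: int = 4) -> List[str]:
    """Fragments with support >= min_active in the actives and
    support <= max_inactive in the inactives (brute force)."""
    candidates: Set[str] = set()
    for m in actives:
        candidates |= fragments(m, max_len)
    return sorted(f for f in candidates
                  if support(f, actives) >= min_active
                  and support(f, inactives) <= max_inactive)


# Hypothetical toy data: plain substring matching, not real substructure search.
actives = ["CCN=N", "N=NCO", "CN=NC"]
inactives = ["CCO", "CCCN", "OCC"]

result = mine(actives, inactives, min_active=3, max_inactive=0)
print(result)
```

On this toy input only fragments built around the `N=N` pattern survive both constraints, since single atoms such as `C` and `N` also occur in the inactive strings. A real system would match fragments against molecular structure rather than raw strings, and would prune the search space instead of enumerating every candidate.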


REFERENCES

Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.

3
L. Dehaspe, H. Toivonen, R.D. King. Finding frequent substructures in chemical compounds. In: Proceedings of the Fourth International Conference on Knowledge Discovery and Data Mining (KDD-98), 30-36, AAAI Press, 1998.

6
L. De Raedt, S. Kramer. The levelwise version space algorithm and its application to molecular fragment finding. In: Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence (IJCAI-01), 2001.

11
C.A. James, D. Weininger, J. Delany. Daylight theory manual - Daylight 4.71, Daylight Chemical Information Systems, 2000. http://www.daylight.com/

17
T.M. Mitchell. Generalization as search. Artificial Intelligence, 18(2), 1982.

21
D. Weininger. SMILES 1. Introduction and encoding rules. Journal of Chemical Information and Computer Sciences, 28, 31, 1988.

22
D. Weininger, A. Weininger, J.L. Weininger. SMILES II, algorithm for generation of unique SMILES notation. Journal of Chemical Information and Computer Sciences, 29, 97, 1989.

23
O.S. Weislow, R. Kiser, D.L. Fine, J.P. Bader, R.H. Shoemaker, M.K. Boyd. New soluble formazan assay for HIV-1 cytopathic effects: application to high flux screening of synthetic and natural products for AIDS antiviral activity. Journal of the National Cancer Institute, 81:577-586, 1989.
