Chemoinformatics—an introduction for computer scientists
Source
ACM Computing Surveys (CSUR)
Volume 41, Issue 2 (February 2009)
Article No. 8
Year of Publication: 2009
ISSN: 0360-0300
Author
Nathan Brown  Novartis Institutes for BioMedical Research, Surrey, U.K.
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1459352.1459353

ABSTRACT

Chemoinformatics is an interface science aimed primarily at discovering novel chemical entities that will ultimately result in the development of novel treatments for unmet medical needs, although these same methods are also applied in other fields that ultimately design new molecules. The field combines expertise from, among others, chemistry, biology, physics, biochemistry, statistics, mathematics, and computer science. In this general review of chemoinformatics the emphasis is placed on describing the general methods that are routinely applied in molecular discovery and in a context that provides for an easily accessible article for computer scientists as well as scientists from other numerate disciplines.

