|
ABSTRACT
Automated reaction mapping is a fundamental first step in the analysis of chemical reactions and opens the door to the development of sophisticated chemical kinetic tools. This article formulates the reaction mapping problem as an optimization problem. The problem is shown to be NP-Complete for general graphs. Five algorithms based on canonical graph naming and enumerative combinatoric techniques are developed to solve the problem. Unlike previous formulations based on limited configurations or classifications, our algorithms are uniquely capable of mapping any reaction that can be represented as a set of chemical graphs optimally. This is due to the direct use of Graph Isomorphism as the basis for these algorithms as opposed to the more commonly used Maximum Common Subgraph. Experimental results on chemical and biological reaction databases demonstrate the efficiency of our algorithms.
REFERENCES
Note: OCR errors may be found in this Reference List extracted from the full text article. ACM has opted to expose the complete List rather than only correct and linked references.
| |
1
|
|
| |
2
|
Akutsu, T. 2004. Efficient extraction of mapping rules of atoms from enzymatic reaction data. Journal of Computational Biology 11, 449--462.
|
| |
3
|
Apostolakis, J., Sacher, O., Korner, R., and Gasteiger, J. 2008. Automatic determination of reaction mappings and reaction center information. 2. validation on a biochemical reaction database. Journal of Chemical Information and Modeling 48, 6, 1190--1198.
|
 |
4
|
|
| |
5
|
|
| |
6
|
Blower, P. E. and Dana, R. C. 1986. Creation of a chemical reaction database from the primary literature. In Modern Approaches to Chemical Reaction Searching, P. Willett, Ed. Gower, 146--164.
|
| |
7
|
Chen, W., Huang, J., and Gilson, M. K. 2004. Identification of symmetries in molecules and complexes. Journal of Chemical Information and Modeling 44, 4, 1301--1313.
|
| |
8
|
Cieplak, T. and Wisniewski, J. 2001. A new effective algorithm for the unambiguous identification of the stereochemical characteristics of compounds during their registration in databases. Molecules 6, 915--926.
|
| |
9
|
Conte, D., Foggia, P., Sansone, C., and Vento, M. 2004. Thirty years of graph matching in pattern recognition. International Journal of Pattern Recognition and Artificial Intelligence, 18, 3, 265--298.
|
 |
10
|
|
| |
11
|
Curran, H. J., Gaffuri, P., Pitz, W. J., and Westbrook, C. K. 1998. A comprehensive modeling study of n-heptane oxidation. Combustion and Flame 114, 1-2 (July), 149--177.
|
| |
12
|
Faulon, J.-L. 1998. Isomorphism, automorphism partitioning, and canonical labeling can be solved in polynomial-time for molecular graphs. Journal of Chemical Information and Modeling 38, 3, 432--444.
|
| |
13
|
Faulon, J.-L. 2007a. http://www.cs.sandia.gov/~jfaulon/QSAR/downloads.html.
|
| |
14
|
Faulon, J.-L. 2007b. Private email conversation with regarding time complexity of author's algorithms.
|
| |
15
|
Faulon, J.-L., Collins, M. J., and Carr, R. D. 2004. The signature molecular descriptor. 4. canonizing molecules using extended valence sequences. Journal of Chemical Information and Modeling 44, 2, 427--436.
|
| |
16
|
Felix, L. and Valiente, G. 2005. Efficient validation of metabolic pathway databases. In Proc. 6th Int. Symp. Computational Biology and Genome Informatics. 1209--1212.
|
| |
17
|
|
| |
18
|
Gasteiger, J. and Engel, T., Eds. 2003. Chemoinformatics: A Textbook. Wiley.
|
| |
19
|
Goto, S., Nishioka, T., and Kanehisa, M. 1998. Ligand: chemical database for enzyme reactions. Bioinformatics 14, 7, 591--599.
|
| |
20
|
Hoffmann, C. M. 1982. Group-Theoretic Algorithms and Graph Isomorphism. Springer-Verlag Berlin and Heidelberg GmbH and Co. KG.
|
 |
21
|
|
| |
22
|
Institute, G. R. 2006. Gri-mech 3.0. http://www.me.berkeley.edu/gri-mech/.
|
| |
23
|
Jochum, C., Gasteiger, J., and Ugi, I. 1980. The principle of minimal chemical distance (pmcd). Angew. Chem., Int. Ed. Engl. 19, 495--505.
|
| |
24
|
Jochum, C., Gasteiger, J., Ugi, I., and Dugundji, J. 1982. The principle of minimal chemical distance and the principle of minimum structure change. Zeitschrift fuer Naturforschung 37B, 9, 1205--1215.
|
| |
25
|
|
| |
26
|
Korner, R. and Apostolakis, J. 2008. Automatic determination of reaction mappings and reaction center information. 1. the imaginary transition state energy approach. Journal of Chemical Information and Modeling 48, 6, 1181--1189.
|
| |
27
|
Kvasnicka, V. 1984. Mathematical model of organic history. iv. classification scheme of chemical reactions. Collection of Czechoslovak Chemical Communications 49, 5, 1090--1097.
|
| |
28
|
Kvasnicka, V., Kratochvil, M., and Koca, J. 1983. Mathematical model of organic history. iii. reactions graphs. Collection of Czechoslovak Chemical Communications 48, 8, 2284--2304.
|
| |
29
|
Kvasnicka, V. and Pospichal, J. 1991. Chemical and reaction metrics for graph-theoretical model of organic chemistry. THEOCHEM 73, 17--42.
|
| |
30
|
Luks, E. 1982. Isomorphism of graphs of bounded valence can be tested in polynomial time. Journal of Computer and System Sciences 25, 1 (August).
|
| |
31
|
Lynch, M. F. and Willett, P. 1978. The automatic detection of chemical reaction sites. J. Chem. Inf. Compu. Sci. 18, 3, 154--159.
|
| |
32
|
McGregor, J. J. 1982. Backtrack search algorithms and the maximal common subgraph problem. Software Practice and Experience 12, 12, 23--34.
|
| |
33
|
McGregor, J. J. and Willett, P. 1981. Use of a maximal common subgraph algorithm in the automatic identification of the ostensible bond changes occurring in chemical reactions. J. Chem. Inf. Compu. Sci. 21, 3, 137--140.
|
| |
34
|
McKay, B. 1981. Practical graph isomorphism. Congressus Numerantium 30, 45--87.
|
| |
35
|
McKay, B. 2004. No automorphisms, yes? http://cs.anu.edu.au/~bdm/nauty/.
|
| |
36
|
Miyazaki, T. 1997. The complexity of McKay's canonical labeling algorithm. Groups and Computation II, DIMACS Ser. Discrete Math. Theoret. Comput. Sci. 28, 239--256.
|
| |
37
|
Moock, T. E., Nourse, J. G., Grier, D., and Hounshell, W. D. 1988. The implementation of atom-atom mapping and related features in the reaction access system (reaccs). In Chemical structures, W. A. Warr, Ed. Springer Verlag, 303--313.
|
| |
38
|
Morgan, H. L. 1965. The generation of a unique machine description for chemical structures-a technique developed at chemical abstracts service. J. Chem. Doc. 5, 2, 107--113.
|
| |
39
|
|
| |
40
|
Raymond, J. W., Gardiner, E. J., and Willett, P. 2002. Rascal: Calculation of graph similarity using maximum common edge subgraphs. The Computer Journal 45, 6, 631--644.
|
| |
41
|
Raymond, J. W. and Willett, P. 2002. Maximum common subgraph isomorphism algorithms for the matching of chemical structures. Journal of Computer-Aided Molecular Design 16, 7 (July), 521--533.
|
| |
42
|
Ugi, I., Bauer, J., Blomberger, C., Brandt, J., Dietz, A., Fontain, E., Gruber, B., Scholley-Pfab, A., Senff, A., and Stein, N. 1994. Models, concepts, theories, and formal languages in chemistry and their use as a basis for computer assistance in chemistry. J. Chem. Inf. Compu. Sci. 34, 1, 3--16.
|
| |
43
|
Vleduts, G. E. 1963. Concerning one system of classification and codification of organic reactions. Information Storage and Retrieval, 117--146.
|
|