ABSTRACT
Biological databases contain a wide variety of data types, often with rich relational structure. Consequently, multi-relational data mining techniques are frequently applied to biological data. This paper presents several applications of multi-relational data mining to biological data, taking care to cover a broad range of multi-relational data mining techniques.