The Applicability of Recurrent Neural Networks for Biological Sequence Analysis
Source: IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Volume 2, Issue 3 (July 2005)
Pages: 243-253
Year of Publication: 2005
ISSN:1545-5963
Authors
John Hawkins, Mikael Boden
Publisher
IEEE Computer Society Press, Los Alamitos, CA, USA
DOI Bookmark: 10.1109/TCBB.2005.44

ABSTRACT

Selecting a machine learning technique requires sensitivity to the requirements of the problem. In particular, a problem can be made more tractable by deliberately using algorithms that are biased toward solutions of the requisite kind. In this paper, we argue that recurrent neural networks have a natural bias toward a problem domain of which biological sequence analysis tasks are a subset. We use experiments with synthetic data to illustrate this bias. We then demonstrate that this bias can be exploited, using a data set of protein sequences containing several classes of subcellular localization targeting peptides. The results show that, compared with feed-forward networks, recurrent neural networks will generally perform better on sequence analysis tasks. Furthermore, as the patterns within the sequence become more ambiguous, the choice of specific recurrent architecture becomes more critical.
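
The contrast the abstract draws between feed-forward and recurrent networks can be sketched in a few lines of Python. This is an illustration only, not the authors' architecture, data, or experiments: the weights are random and untrained, and the names `elman_final_state`, `window_hidden`, `HIDDEN`, `WINDOW`, and the two toy peptide strings are assumptions made for this example. It shows only the structural difference: a fixed-window feed-forward layer cannot see a residue outside its window, while a simple Elman-style recurrent state is updated by every residue in the sequence.

```python
import math
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def one_hot(residue):
    """Encode one residue as a 20-dimensional indicator vector."""
    return [1.0 if residue == aa else 0.0 for aa in AMINO_ACIDS]


def random_matrix(rows, cols, rng, scale=0.5):
    """Untrained weights drawn uniformly from [-scale, scale] (an assumption)."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def elman_final_state(sequence, w_in, w_rec):
    """Simple recurrent pass: the state is fed back at every residue."""
    state = [0.0] * len(w_rec)
    for residue in sequence:
        drive = matvec(w_in, one_hot(residue))
        feedback = matvec(w_rec, state)
        state = [math.tanh(d + f) for d, f in zip(drive, feedback)]
    return state


def window_hidden(sequence, w_win, window):
    """Feed-forward hidden layer that sees only the last `window` residues."""
    flat = [x for residue in sequence[-window:] for x in one_hot(residue)]
    return [math.tanh(a) for a in matvec(w_win, flat)]


rng = random.Random(0)
HIDDEN, WINDOW = 4, 3
w_in = random_matrix(HIDDEN, len(AMINO_ACIDS), rng)
w_rec = random_matrix(HIDDEN, HIDDEN, rng)
w_win = random_matrix(HIDDEN, len(AMINO_ACIDS) * WINDOW, rng)

# Two toy sequences that differ at one position just outside the window.
seq_a = "MKRLLSA"
seq_b = "MKRALSA"

# The windowed layer receives identical inputs ("LSA") for both sequences.
ff_same = window_hidden(seq_a, w_win, WINDOW) == window_hidden(seq_b, w_win, WINDOW)

# The recurrent state has been updated by the differing residue as well.
rnn_same = elman_final_state(seq_a, w_in, w_rec) == elman_final_state(seq_b, w_in, w_rec)

print("feed-forward outputs identical:", ff_same)
print("recurrent states identical:", rnn_same)
```

Widening the window would let the feed-forward layer see the changed residue, but only by fixing the context length in advance; the recurrent pass has no such parameter. Whether that difference helps on a given task is an empirical question of the kind the paper investigates.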



Collaborative Colleagues:
John Hawkins: colleagues
Mikael Boden: colleagues