Discovering biological motifs with genetic programming
Source: Genetic and Evolutionary Computation Conference
Proceedings of the 2005 Conference on Genetic and Evolutionary Computation
Washington DC, USA
Session: Biological applications
Pages: 401-408
Year of Publication: 2005
ISBN: 1-59593-010-8
Authors
Rolv Seehuus, Norwegian University of Science and Technology, Trondheim, Norway
Amund Tveit, Norwegian University of Science and Technology, Trondheim, Norway
Ole Edsberg, Norwegian University of Science and Technology, Trondheim, Norway
Sponsors
SIGEVO: ACM Special Interest Group on Genetic and Evolutionary Computation
ACM: Association for Computing Machinery
Publisher
ACM, New York, NY, USA
DOI: http://doi.acm.org/10.1145/1068009.1068074

ABSTRACT

Choosing the right representation for a problem is important. In this article we introduce a linear genetic programming approach to motif discovery in protein families and present a thorough comparison between our approach and Koza-style genetic programming using automatically defined functions (ADFs). In a study of 45 protein families, we demonstrate that our algorithm, given equal processing resources and no prior knowledge used in shaping the datasets, consistently generates motifs of significantly better quality than those we found using trees as the representation. For several of the studied protein families we evolve motifs comparable to those found in Prosite, a manually curated database of protein motifs. Our linear genome gave better results than Koza-style genetic programming for 37 of 45 families. The difference is statistically significant for 24 of the families at the 99% confidence level.
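
As an illustration of what a Prosite-style motif is and how a candidate motif might be checked against a protein family, the following is a minimal Python sketch. It is not the authors' method: the abstract does not specify their motif language or fitness function. The function names `prosite_to_regex` and `score_motif`, the toy sequences, and the precision/recall score are all assumptions introduced for this example, and only the basic pattern syntax is handled (`x`, `[..]`, `{..}`, `(n)`, `(n,m)`, and the `<` and `>` anchors).

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a basic PROSITE-style pattern into a Python regex string.

    Simplified sketch: supports x, [ABC], {ABC}, (n), (n,m), '<' and '>'.
    """
    pattern = pattern.strip().rstrip(".")
    parts = []
    for element in pattern.split("-"):
        prefix = suffix = ""
        if element.startswith("<"):  # anchor at the N-terminus
            prefix, element = "^", element[1:]
        if element.endswith(">"):  # anchor at the C-terminus
            suffix, element = "$", element[:-1]
        # Split off an optional repetition count: (n) or (n,m).
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.group(1), m.group(2), m.group(3)
        if core == "x":  # any residue
            core = "."
        elif core.startswith("{") and core.endswith("}"):  # excluded residues
            core = "[^" + core[1:-1] + "]"
        # [ABC] and single residues are already valid regex fragments.
        if lo is not None:
            core += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(prefix + core + suffix)
    return "".join(parts)


def score_motif(pattern, family, background):
    """Return (precision, recall) of a motif on family vs. background sequences.

    Hypothetical quality measure for illustration only.
    """
    rx = re.compile(prosite_to_regex(pattern))
    tp = sum(1 for seq in family if rx.search(seq))
    fp = sum(1 for seq in background if rx.search(seq))
    recall = tp / len(family) if family else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return precision, recall


# Toy data, invented for this example.
family = ["AACGGLAA", "CQQQIT"]
background = ["AAAA", "CGGLP"]
motif = "C-x(2,4)-[LIVM]-{P}"
print(prosite_to_regex(motif))
print(score_motif(motif, family, background))
```

In an evolutionary setting, a score of this kind could serve as a fitness value for candidate motifs, whatever genome representation (linear or tree-based) produces them.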


