Creating regular expressions as mRNA motifs with GP to predict human exon splitting
Source
Genetic and Evolutionary Computation Conference
Proceedings of the 11th Annual Conference on Genetic and Evolutionary Computation
Montreal, Québec, Canada
Poster session: Track 3: Bioinformatics and computational biology
Pages: 1789-1790
Year of Publication: 2009
ISBN: 978-1-60558-325-9
Authors
William B. Langdon  King's College London, London, UK
J. Rowsell  University of Essex, Colchester, CO4 3SQ, UK
A. P. Harrison  University of Essex, Colchester, CO4 3SQ, UK
Sponsors
SIGEVO: ACM Special Interest Group on Genetic and Evolutionary Computation
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/1569901.1570162

ABSTRACT

RNAnet [3] (http://bioinformatics.essex.ac.uk/users/wlangdon/rnanet/) allows the user to calculate correlations of gene expression, both between genes and between components within genes. We search all of Ensembl (http://www.ensembl.org) and find all the Homo sapiens exons for which there are sufficient robust Affymetrix HG-U133 Plus 2 GeneChip probes. Calculating the correlation between mRNA probe measurements for the same exon shows many exons whose components are consistently up-regulated and down-regulated. However, we identify other Ensembl exons in which sub-regions are self-consistent but these transcript blocks are not well correlated with other blocks in the same exon. We suggest that many current Ensembl exon definitions are incomplete. Secondly, having identified exons with substructure, we use machine learning to try to identify patterns in the DNA sequence lying between blocks of high correlation which might yield biological or technological explanations. A Backus-Naur form (BNF) context-free grammar constrains strongly typed genetic programming (STGP) to evolve biological motifs in the form of regular expressions (RE) (e.g. TCTTT) which distinguish gene exons with potential alternative mRNA expression from those without. We show that biological patterns can be data mined from NCBI's GEO (http://www.ncbi.nlm.nih.gov/geo/) database by a GP written in gawk and using egrep. The automatically produced DNA motifs suggest that alternative polyadenylation is not responsible. (Full version in TR-09-02 [7].) Blocky exons can be found in http://bioinformatics.essex.ac.uk/users/wlangdon/tr-09-02.tar.gz
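
The grammar-constrained motif search described in the abstract can be illustrated with a minimal sketch. It is not the authors' system: the paper reports a GP written in gawk that calls egrep, whereas this sketch uses Python's re module and plain random sampling in place of evolution. The grammar, the function names (derive, accuracy, random_search) and the toy sequences are all assumptions made for illustration; only the example motif TCTTT comes from the abstract.

```python
import random
import re

# Hypothetical BNF-style grammar for DNA motifs (an assumption; the grammar
# used in the paper is not given in the abstract). Each non-terminal maps to
# a list of alternative productions.
GRAMMAR = {
    "<re>": [["<elem>"], ["<elem>", "<re>"]],
    "<elem>": [["<base>"], ["<base>", "<rep>"], ["[", "<base>", "<base>", "]"]],
    "<rep>": [["+"], ["?"]],
    "<base>": [["A"], ["C"], ["G"], ["T"]],
}


def derive(symbol, rng, depth=0, max_depth=8):
    """Expand a grammar symbol into a regular expression string."""
    if symbol not in GRAMMAR:
        return symbol  # terminal character, emitted unchanged
    productions = GRAMMAR[symbol]
    if symbol == "<re>" and depth >= max_depth:
        # Only <re> is recursive, so forcing its first (non-recursive)
        # production guarantees that the expansion terminates.
        productions = productions[:1]
    chosen = rng.choice(productions)
    return "".join(derive(s, rng, depth + 1, max_depth) for s in chosen)


def accuracy(pattern, positives, negatives):
    """Fraction of sequences classified correctly: a sequence is predicted
    positive when the pattern matches anywhere inside it."""
    rx = re.compile(pattern)
    hits = sum(1 for seq in positives if rx.search(seq))
    rejects = sum(1 for seq in negatives if not rx.search(seq))
    return (hits + rejects) / (len(positives) + len(negatives))


def random_search(positives, negatives, trials=200, seed=0):
    """Sample motifs from the grammar and keep the most accurate one.
    This is random search, a stand-in for the evolutionary loop of a GP."""
    rng = random.Random(seed)
    best_pattern, best_score = None, -1.0
    for _ in range(trials):
        pattern = derive("<re>", rng)
        score = accuracy(pattern, positives, negatives)
        if score > best_score:
            best_pattern, best_score = pattern, score
    return best_pattern, best_score


# Made-up toy sequences, not data from the paper.
POSITIVES = ["AATCTTTGG", "TCTTTA", "GGTCTTTC"]
NEGATIVES = ["GGGGCC", "ACGTACGT", "CCAGGA"]
```

Calling random_search(POSITIVES, NEGATIVES) returns the best sampled motif together with its accuracy on the toy data. Because every candidate is derived from the grammar, each one is a syntactically valid regular expression, which is the property the BNF constraint is meant to provide.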


REFERENCES

 
[1]

[2] Langdon, W. B. Evolving GeneChip correlation predictors on parallel graphics hardware. In 2008 IEEE World Congress on Computational Intelligence (Hong Kong, 1-6 June 2008), J. Wang, Ed., IEEE Computational Intelligence Society, IEEE Press, pp. 4152-4157.

[3] Langdon, W. B. A map of human gene expression. Tech. Rep. CES-486, Departments of Mathematical, Biological Sciences and Computing and Electronic Systems, University of Essex, Colchester, CO4 3SQ, UK, July 2008.

[4] Langdon, W. B., and Harrison, A. P. Evolving DNA motifs to predict GeneChip probe performance. Algorithms in Molecular Biology. In press.

[5] Langdon, W. B., McKay, R. I., and Spector, L. Genetic programming. In Handbook of Metaheuristics, J.-Y. Potvin and M. Gendreau, Eds., second ed. Springer, ch. 7.

[6]

[7] Creating regular expressions as mRNA motifs with GP to predict human exon splitting. Tech. Rep. TR-09-02, Department of Computer Science, Crest Centre, King's College, London, Strand, London, WC2R 2LS, UK, 19 Mar. 2009.

[8] Langdon, W. B., Upton, G. J. G., da Silva Camargo, R., and Harrison, A. P. A survey of spatial defects in Homo Sapiens Affymetrix GeneChips. IEEE/ACM Transactions on Computational Biology and Bioinformatics (2009). In press.

[9] Poli, R., Langdon, W. B., and McPhee, N. F. A field guide to genetic programming. Published via http://lulu.com and freely available at http://www.gp-field-guide.org.uk, 2008. (With contributions by J. R. Koza).

[10] Retelska, D., et al. Similarities and differences of polyadenylation signals in human and fly. BMC Genomics 7, 1 (2006), 176.

[11] Sanchez-Graillet, O., Rowsell, J., Langdon, W. B., Stalteri, M. A., Arteaga Salas, J. M., Upton, G. J., and Harrison, A. P. Widespread existence of uncorrelated probe intensities from within the same probeset on Affymetrix GeneChips. Journal of Integrative Bioinformatics 5, 2 (2008), 98.
