Searching Genomes for Noncoding RNA Using FastR
Source: IEEE/ACM Transactions on Computational Biology and Bioinformatics (TCBB)
Volume 2, Issue 4 (October 2005)
Pages: 366-379
Year of Publication: 2005
ISSN: 1545-5963
Authors
Shaojie Zhang, Brian Haas, Eleazar Eskin, Vineet Bafna
Publisher
IEEE Computer Society Press, Los Alamitos, CA, USA
DOI Bookmark: 10.1109/TCBB.2005.57

ABSTRACT

The discovery of novel noncoding RNAs has been among the most exciting recent developments in biology. It has been hypothesized that there is, in fact, an abundance of functional noncoding RNAs (ncRNAs) with various catalytic and regulatory functions. However, the inherent signal for ncRNA is weaker than the signal for protein coding genes, making these harder to identify. We consider the following problem: Given an RNA sequence with a known secondary structure, efficiently detect all structural homologs in a genomic database by computing the sequence and structure similarity to the query. Our approach, based on structural filters that eliminate a large portion of the database while retaining the true homologs, allows us to search a typical bacterial genome in minutes on a standard PC. The results are two orders of magnitude better than the currently available software for the problem. We applied FastR to the discovery of novel riboswitches, which are a class of RNA domains found in the untranslated regions. They are of interest because they regulate metabolite synthesis by directly binding metabolites. We searched all available eubacterial and archaeal genomes for riboswitches from purine, lysine, thiamin, and riboflavin subfamilies. Our results point to a number of novel candidates for each of these subfamilies and include genomes that were not known to contain riboswitches.
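The filtering idea in the abstract can be illustrated with a small sketch. It is not FastR's actual filter, whose details are not given here. It assumes one simple structural filter: keep only regions where a k-mer is followed, within a bounded gap, by its reverse complement, so that a stem could form. The function names (`reverse_complement`, `find_stacks`), the parameters (`k`, `min_gap`, `max_gap`), and the restriction to Watson-Crick pairs on a DNA alphabet are all assumptions made for illustration.

```python
from collections import defaultdict

# Watson-Crick complements on a DNA alphabet (assumption: G-U wobble
# pairs, which real RNA stems allow, are ignored in this sketch).
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(kmer):
    """Return the reverse complement of a k-mer over A, C, G, T."""
    return "".join(COMPLEMENT[base] for base in reversed(kmer))


def find_stacks(seq, k, min_gap, max_gap):
    """Find candidate stems in seq.

    Returns a list of (i, j) pairs such that seq[i:i+k] can pair with
    seq[j:j+k] and the loop between them, j - (i + k), lies in
    [min_gap, max_gap]. Regions with no such pair could be discarded
    before running a more expensive sequence-structure alignment.
    """
    # Index every k-mer by its start positions.
    positions = defaultdict(list)
    for i in range(len(seq) - k + 1):
        positions[seq[i:i + k]].append(i)

    hits = []
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        if any(base not in COMPLEMENT for base in kmer):
            continue  # skip k-mers with ambiguous symbols such as N
        partner = reverse_complement(kmer)
        for j in positions.get(partner, ()):
            gap = j - (i + k)
            if min_gap <= gap <= max_gap:
                hits.append((i, j))
    return hits


# GGGG at position 2 pairs with CCCC at position 10 (gap of 4 bases).
print(find_stacks("AAGGGGAAAACCCCAA", k=4, min_gap=3, max_gap=10))
# prints [(2, 10)]
```

Indexing the k-mers once keeps the scan close to linear in the sequence length for typical inputs, which is the property a database filter needs; the surviving regions would then be passed to the full similarity computation.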


REFERENCES

 
[1] S.F. Altschul, W. Gish, W. Miller, E.W. Myers, and D.J. Lipman, "Basic Local Alignment Search Tool," J. Molecular Biology, vol. 215, pp. 403-410, 1990.
[2] L. Argaman et al., "Novel Small RNA-Encoding Genes in the Intergenic Regions of Escherichia coli," Current Biology, vol. 11, pp. 941-950, 2001.
[3] V. Bafna, S. Muthukrishnan, and R. Ravi, "Computing Similarity between RNA Strings," Combinatorial Pattern Matching Conf., vol. 937, pp. 1-14, 1995.
[4] J.-H. Chen, S.-Y. Lee, and B. Shapiro, "A Computational Procedure for Assessing the Significance of RNA Secondary Structure," Computer Applications in the Biosciences, vol. 6, pp. 7-18, 1990.
[5] A. Coventry, D.J. Kleitman, and B. Berger, "MSARI: Multiple Sequence Alignments for Statistical Detection of RNA Secondary Structure," Proc. Nat'l Academy of Sciences, vol. 101, no. 33, pp. 12102-12107, 2004.
[6] D. di Bernardo, T. Down, and T. Hubbard, "ddbRNA: Detection of Conserved Secondary Structures in Multiple Alignments," Bioinformatics, vol. 19, no. 13, pp. 1606-1611, 2003.
[7] M. Dsouza, N. Larsen, and R. Overbeek, "Searching for Patterns in Genomic Data," Trends in Genetics, vol. 13, no. 12, pp. 497-498, 1997.
[8] R. Durbin, S. Eddy, A. Krogh, and G. Mitchison, "Covariance Models: SCFG-Based RNA Profiles," Biological Sequence Analysis, chapter 10.3, Cambridge Univ. Press, 1998.
[9] S.R. Eddy, "Non-Coding RNA Genes and the Modern RNA World," Nature Rev. in Genetics, vol. 2, pp. 919-929, 2001.
[10] S.R. Eddy and R. Durbin, "RNA Sequence Analysis Using Covariance Models," Nucleic Acids Research, vol. 22, pp. 2079-2088, 1994.
[11] D. Gautheret and A. Lambert, "Direct RNA Motif Definition and Identification from Multiple Sequence Alignments Using Secondary Structure Profiles," J. Molecular Biology, vol. 313, no. 5, pp. 1003-1011, 2001.
[12] S. Griffiths-Jones, A. Bateman, M. Marshall, A. Khanna, and S.R. Eddy, "Rfam: An RNA Family Database," Nucleic Acids Research, vol. 31, no. 1, pp. 439-441, 2003.
[16] F. Jacob and J. Monod, "Genetic Regulatory Mechanisms in the Synthesis of Proteins," J. Molecular Biology, vol. 3, pp. 318-356, 1961.
[17] J. Jaeger, D.H. Turner, and M. Zuker, "Improved Prediction of Secondary Structures for RNA," Proc. Nat'l Academy of Sciences, vol. 86, pp. 7706-7710, 1989.
[18] T. Jiang, G. Lin, B. Ma, and K. Zhang, "A General Edit Distance between RNA Structures," J. Computational Biology, vol. 9, pp. 371-388, 2002.
[19] R.J. Klein and S.R. Eddy, "Rsearch: Finding Homologs of Single Structured RNA Sequences," BMC Bioinformatics, vol. 4, no. 1, p. 44, 2003.
[20] A. Lambert et al., "The ERPIN Server: An Interface to Profile-Based RNA Motif Identification," Nucleic Acids Research, vol. 32, no. s2, pp. W160-165, 2004.
[21] E. Lander et al., "Initial Sequencing and Analysis of the Human Genome," Nature, vol. 409, pp. 860-921, 2001.
[22] S.Y. Le, J.H. Chen, and J. Maizel, Structure and Methods: Human Genome Initiative and DNA Recombination, vol. 1, pp. 127-136. Adenine Press, 1990.
[23] R.C. Lee and V. Ambros, "An Extensive Class of Small RNAs in Caenorhabditis elegans," Science, vol. 294, pp. 862-864, 2001.
[24] H.P. Lenhof, K. Reinert, and M. Vingron, "A Polyhedral Approach to RNA Sequence Structure Alignment," J. Computational Biology, vol. 5, no. 3, pp. 517-530, 1998.
[25] L.P. Lim, N.C. Lau, E.G. Weinstein, A. Abdelhakim, S. Yekta, M.W. Rhoades, C.B. Burge, and D.P. Bartel, "The MicroRNAs of Caenorhabditis elegans," Genes and Development, vol. 17, pp. 991-1008, 2003.
[26] T.R. Lowe and S.R. Eddy, "tRNAscan-SE: A Program for Improved Detection of Transfer RNA Genes in Genomic Sequence," Nucleic Acids Research, vol. 25, pp. 955-964, 1997.
[27] D.H. Mathews and D.H. Turner, "Dynalign: An Algorithm for Finding the Secondary Structure Common to Two RNA Sequences," J. Molecular Biology, vol. 317, no. 2, pp. 191-203, 2002.
[28] J.P. McCutcheon and S.R. Eddy, "Computational Identification of Non-Coding RNAs in Saccharomyces cerevisiae by Comparative Genomics," Nucleic Acids Research, vol. 31, no. 14, pp. 4119-4128, 2003.
[29] A. Nahvi, N. Sudarshan, M.S. Ebert, X. Zou, K.L. Brown, and R.R. Breaker, "Genetic Control by a Metabolite Binding mRNA," Chemical Biology, vol. 9, pp. 1043-1049, 2003.
[30] C.D. Novina and P.A. Sharp, "The RNAi Revolution," Nature, vol. 430, no. 6996, pp. 161-164, 2004.
[31] E. Rivas and S.R. Eddy, "Secondary Structure Alone Is Generally Not Statistically Significant for the Detection of Noncoding RNAs," Bioinformatics, vol. 16, no. 7, pp. 583-605, 2000.
[32] E. Rivas and S.R. Eddy, "Noncoding RNA Gene Detection Using Comparative Sequence Analysis," BMC Bioinformatics, vol. 2, pp. 8-26, 2001.
[33] E. Rivas, R.J. Klein, T.A. Jones, and S.R. Eddy, "Computational Identification of Noncoding RNAs in E. coli by Comparative Genomics," Current Biology, vol. 11, pp. 1369-1373, 2001.
[34] D.A. Rodionov, A.G. Vitreschak, A.A. Mironov, and M.S. Gelfand, "Regulation of Lysine Biosynthesis and Transport Genes in Bacteria: Yet Another RNA Riboswitch?" Nucleic Acids Research, vol. 31, no. 23, pp. 6748-6757, 2003.
[36] D. Sankoff, "Simultaneous Solution of the RNA Folding, Alignment and Protosequence Problems," SIAM J. Applied Math., vol. 45, no. 5, pp. 810-825, 1985.
[37] M. Szymanski, M.Z. Barciszewska, V.A. Erdmann, and J. Barciszewski, "5S Ribosomal RNA Database," Nucleic Acids Research, vol. 28, no. 1, pp. 166-167, 2002.
[38] J.C. Venter et al., "The Sequence of the Human Genome," Science, vol. 291, no. 5507, pp. 1304-1351, 2001.
[39] A.G. Vitreschak et al., "Riboswitches: The Oldest Mechanism for the Regulation of Gene Expression?" Trends in Genetics, vol. 20, no. 1, pp. 44-50, 2003.
[40] S. Washietl and I.L. Hofacker, "Consensus Folding of Aligned Sequences as a New Measure for the Detection of Functional RNAs by Comparative Genomics," J. Molecular Biology, vol. 342, no. 1, pp. 19-30, 2004.
[41] R.H. Waterston et al., "Initial Sequencing and Comparative Analysis of the Mouse Genome," Nature, vol. 420, no. 6915, pp. 520-562, 2002.
[43] W.C. Winkler and R.R. Breaker, "Genetic Control by Metabolite-Binding Riboswitches," Chembiochem, vol. 4, no. 10, pp. 1024-1032, 2003.
[44] C. Workman and A. Krogh, "No Evidence that mRNA have Lower Folding Free Energy than Random Sequences with the same Dinucleotide Distribution," Nucleic Acids Research, vol. 27, no. 24, pp. 4816-4822, 1999.
[46] M. Zuker and D. Sankoff, "RNA Secondary Structures and their Prediction," Bull. Math. Biology, vol. 46, pp. 591-621, 1984.

Collaborative Colleagues:
Shaojie Zhang
Brian Haas
Eleazar Eskin
Vineet Bafna