Guiding motif discovery by iterative pattern refinement
Source: Symposium on Applied Computing
Proceedings of the 2004 ACM Symposium on Applied Computing
Nicosia, Cyprus
SESSION: Bioinformatics (BIO)
Pages: 162 - 166  
Year of Publication: 2004
ISBN:1-58113-812-1
Authors
Zhiping Wang  Indiana University, Bloomington, IN
Mehmet Dalkilic  Indiana University, Bloomington, IN
Sun Kim  Indiana University, Bloomington, IN
Sponsor
SIGAPP: ACM Special Interest Group on Applied Computing
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/967900.967934

ABSTRACT

In this paper, we demonstrate that the performance of a motif discovery algorithm can be significantly improved by embedding it in a novel framework that guides the motif discovery process. The framework is general enough to accommodate any statistical motif discovery algorithm. The motivation for this research is that the statistical significance of a pattern depends on the background probability, which is largely determined by the input sequences. Our framework guides motif discovery by supplying subsequences, rather than entire sequences, as input to an existing motif discovery algorithm. The subsequences are determined by motifs found using existing motif discovery and search algorithms. This procedure is applied iteratively until convergence. The starting set of patterns is produced by a simple but effective pattern set generation algorithm. We implemented the framework using MEME and MAST and tested it with 108 PROSITE patterns. The results demonstrate that our framework significantly improves the performance of MEME.
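The iterative loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: `toy_discover` and `toy_search` are hypothetical stand-ins for a statistical motif finder and a motif search tool (they are not MEME or MAST), the starting pattern is taken from one discovery pass over the full sequences rather than from the paper's pattern set generation algorithm, and "convergence" is assumed to mean that the discovered motif stops changing.

```python
from collections import Counter
from typing import Callable, List


def toy_discover(seqs: List[str], k: int = 4) -> str:
    """Hypothetical stand-in for a motif finder (NOT MEME).

    Returns the k-mer that occurs in the most sequences; ties are broken
    lexicographically. Assumes at least one sequence has length >= k.
    """
    counts: Counter = Counter()
    for s in seqs:
        # Count each k-mer at most once per sequence.
        counts.update({s[i:i + k] for i in range(len(s) - k + 1)})
    return min(counts, key=lambda m: (-counts[m], m))


def toy_search(motif: str, seqs: List[str], flank: int = 3) -> List[str]:
    """Hypothetical stand-in for a motif search tool (NOT MAST).

    Returns, for each sequence containing an exact hit, the window around
    the first hit extended by `flank` residues on each side.
    """
    windows = []
    for s in seqs:
        pos = s.find(motif)
        if pos >= 0:
            windows.append(s[max(0, pos - flank):pos + len(motif) + flank])
    return windows


def refine(seqs: List[str],
           discover: Callable[[List[str]], str],
           search: Callable[[str, List[str]], List[str]],
           max_iter: int = 10) -> str:
    """Re-run discovery on subsequences until the motif stops changing."""
    motif = discover(seqs)  # assumed starting pattern
    for _ in range(max_iter):
        windows = search(motif, seqs)   # subsequences around current hits
        if not windows:
            break
        refined = discover(windows)     # discovery on subsequences only
        if refined == motif:            # assumed convergence criterion
            break
        motif = refined
    return motif


if __name__ == "__main__":
    seqs = ["TTAGACGTGATCC", "GGCACGTGTTAAC", "CCTTCGTGAGGAT"]
    print(refine(seqs, toy_discover, toy_search))  # prints CGTG
```

Passing the discovery and search steps as functions mirrors the abstract's claim that the framework can wrap any statistical motif discovery algorithm; in a real setting each function would call an external tool and parse its output.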


