Discriminative motifs
Source: Annual Conference on Research in Computational Molecular Biology
Proceedings of the sixth annual international conference on Computational biology
Washington, DC, USA
Pages: 291 - 298  
Year of Publication: 2002
ISBN:1-58113-498-3
Author
Saurabh Sinha  University of Washington, Seattle, WA
Sponsors
SIGACT: ACM Special Interest Group on Algorithms and Computation Theory
ACM: Association for Computing Machinery
Publisher
ACM  New York, NY, USA
DOI: http://doi.acm.org/10.1145/565196.565234
ABSTRACT

This paper takes a new view of motif discovery, addressing a common problem in existing motif finders. A motif is treated as a feature of the input promoter regions that leads to a good classifier between these promoters and a set of background promoters. This perspective allows us to adapt existing methods of feature selection, a well studied topic in machine learning, to motif discovery. We develop a general algorithmic framework that can be specialized to work with a wide variety of motif models, including consensus models with degenerate symbols or mismatches, and composite motifs. A key feature of our algorithm is that it measures over-representation while maintaining information about the distribution of motif instances in individual promoters. The assessment of a motif's discriminative power is normalized against chance behaviour by a probabilistic analysis. We apply our framework to two popular motif models, and are able to detect several known binding sites in sets of co-regulated genes in yeast.
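
The idea of scoring a motif by how well it separates the input promoters from background promoters can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes exact k-mer motifs over the DNA alphabet, a simple count-threshold classifier, and raw accuracy as the score, and it omits the probabilistic normalization against chance that the abstract mentions. The function names (`count_occurrences`, `discriminative_score`, `best_motif`) are hypothetical and chosen only for this illustration.

```python
from itertools import product


def count_occurrences(seq, motif):
    """Count (possibly overlapping) exact occurrences of motif in seq."""
    k = len(motif)
    return sum(1 for i in range(len(seq) - k + 1) if seq[i:i + k] == motif)


def discriminative_score(motif, positives, background):
    """Best accuracy of the rule 'count >= t means positive', over t >= 1.

    Per-promoter counts are kept, so a motif concentrated in one promoter
    scores worse than one spread across all input promoters.
    Returns 0.0 if the motif occurs in no sequence at all.
    """
    pos_counts = [count_occurrences(s, motif) for s in positives]
    bg_counts = [count_occurrences(s, motif) for s in background]
    total = len(pos_counts) + len(bg_counts)
    best = 0.0
    for t in range(1, max(pos_counts + bg_counts, default=0) + 1):
        true_pos = sum(c >= t for c in pos_counts)  # positives called positive
        true_neg = sum(c < t for c in bg_counts)    # background called negative
        best = max(best, (true_pos + true_neg) / total)
    return best


def best_motif(positives, background, k):
    """Exhaustively score every DNA k-mer and return the best one."""
    candidates = ("".join(p) for p in product("ACGT", repeat=k))
    return max(candidates,
               key=lambda m: discriminative_score(m, positives, background))


# Toy data (made up for illustration, not from the paper).
positives = ["TTACGTTT", "GGACGTGG", "ACGTACGT"]
background = ["TTTTTTTT", "GGGGCCCC", "ATATATAT"]
top = best_motif(positives, background, 4)
```

Exhaustive enumeration is only practical for small k; richer motif models such as degenerate symbols, mismatches, or composite motifs would need a different candidate generator and matching function.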

