Nucleic Acids Research, 2002, Vol. 30, No. 24 5549-5560
© 2002 Oxford University Press
Discovery of novel transcription factor binding sites by statistical overrepresentation
Department of Computer Science and Engineering, Box 352350, University of Washington, Seattle, WA 98195-2350, USA
*To whom correspondence should be addressed. Tel: +1 206 543 9263; Fax: +1 206 543 8331; Email: tompa{at}cs.washington.edu
Understanding the complex and varied mechanisms that regulate gene expression is an important and challenging problem. A fundamental sub-problem is to identify DNA binding sites for unknown regulatory factors, given a collection of genes believed to be co-regulated. We discuss a computational method that identifies good candidates for such binding sites. Unlike local search techniques such as expectation maximization and Gibbs samplers, which may not reach a global optimum, the method discussed enumerates all motifs in the search space and is guaranteed to produce the motifs with the greatest z-scores. We discuss the results of validation experiments in which this algorithm was used to identify candidate binding sites in several well-studied regulons of Saccharomyces cerevisiae, where the most prominent transcription factor binding sites are largely known. We then discuss the results on gene families in the functional and mutant phenotype catalogs of S. cerevisiae, where the algorithm suggests many promising novel transcription factor binding sites. The program is available at http://bio.cs.washington.edu/software.html.
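The enumerate-and-score idea described in the abstract can be illustrated with a minimal Python sketch. It is not the published program: it assumes motifs are exact words of a fixed length k over A, C, G and T, an independent-bases background estimated from a separate set of sequences, and a binomial variance that ignores overlaps between occurrences. The function names (`background_frequencies`, `count_occurrences`, `z_scores`, `top_motifs`) are hypothetical and chosen only for this illustration.

```python
from itertools import product
from math import sqrt

ALPHABET = "ACGT"


def background_frequencies(sequences):
    """Estimate single-base frequencies from background sequences (assumed i.i.d. model)."""
    total = sum(len(s) for s in sequences)
    return {b: sum(s.count(b) for s in sequences) / total for b in ALPHABET}


def count_occurrences(motif, sequences):
    """Count occurrences of motif in all sequences, overlapping ones included."""
    k = len(motif)
    return sum(
        1
        for seq in sequences
        for i in range(len(seq) - k + 1)
        if seq[i:i + k] == motif
    )


def z_scores(sequences, background, k):
    """Score every word of length k by its z-score under the simplified background.

    Assumption: occurrences are treated as independent trials, so the variance
    is binomial. This is a simplification made for the sketch only.
    """
    freqs = background_frequencies(background)
    positions = sum(max(len(s) - k + 1, 0) for s in sequences)
    scores = {}
    for letters in product(ALPHABET, repeat=k):  # exhaustive enumeration of the search space
        motif = "".join(letters)
        p = 1.0
        for b in motif:
            p *= freqs[b]
        expected = positions * p
        variance = positions * p * (1.0 - p)
        if variance == 0:
            continue  # word cannot occur (or always occurs) under the background
        observed = count_occurrences(motif, sequences)
        scores[motif] = (observed - expected) / sqrt(variance)
    return scores


def top_motifs(sequences, background, k, n=5):
    """Return the n words with the greatest z-scores, best first."""
    scores = z_scores(sequences, background, k)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)[:n]
```

Because every word of length k is scored, the best-scoring word under this simplified model is found by construction rather than by a search that may stall in a local optimum. For example, `top_motifs(promoters, genome_sample, k=6)` would rank all 4096 hexamers, where `promoters` and `genome_sample` are placeholder lists of DNA strings.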