Published online 23 September 2004
Nucleic Acids Research, Vol. 32 No. 17 © Oxford University Press 2004; all rights reserved
Bipartite pattern discovery by entropy minimization-based multiple local alignment
1 Laboratory of Human Molecular Genetics, Children's Mercy Hospital & Clinics, 2401 Gillham Road, Kansas City, MO 64108, USA, 2 School of Computer Science and Engineering and 3 School of Medicine, University of Missouri-Kansas City, Kansas City, MO 64110, USA
* To whom correspondence should be addressed. Tel: +1 816 983 6511; Fax: +1 816 983 6515; Email: progan{at}cmh.edu
Received June 25, 2004; Revised August 11, 2004; Accepted August 26, 2004
Many multimeric transcription factors recognize DNA sequence patterns by cooperatively binding to bipartite elements composed of half sites separated by a flexible spacer. We developed a novel algorithm, bipartite pattern discovery (Bipad), which builds a mathematical model for the discovery of bipartite sequence patterns based on the principle of information maximization, or equivalently Shannon entropy minimization. Bipad is a C++ program that applies greedy methods to search the bipartite alignment space and examines the upstream or downstream regions of co-regulated genes, looking for cis-regulatory bipartite patterns. An input sequence file with zero or one site per locus is required, and the left and right motif widths and a range of possible gap lengths must be specified. Bipad can run in either single-block or bipartite pattern search mode, and it can comprehensively search all four orientations of half-site patterns. Simulation studies showed that the accuracy of this motif discovery algorithm depends on sample size and motif conservation level, but results were independent of background composition. Bipad performed as well as or better than other pattern search algorithms in correctly identifying Escherichia coli cyclic AMP receptor protein and Bacillus subtilis sigma factor binding site sequences based on experimentally defined benchmarks. Finally, a new bipartite information weight matrix for vitamin D3 receptor/retinoid X receptor (VDR/RXR) binding sites was derived that comprehensively models the natural variability inherent in these sequence elements.
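The quantity that such a search tries to maximize can be illustrated with a small Python sketch. This is not Bipad's implementation: it only scores one candidate bipartite alignment, it assumes a uniform background (2 bits maximum per DNA column), it applies no small-sample correction, and it ignores any contribution from the gap-length distribution. The function names (`column_information`, `block_information`, `bipartite_information`) and the toy sequences are hypothetical.

```python
import math
from collections import Counter


def column_information(column):
    """Information (bits) of one DNA alignment column.

    Assumption: uniform background, so information = 2 - Shannon entropy.
    """
    n = len(column)
    counts = Counter(column)
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values())
    return 2.0 - entropy


def block_information(rows):
    """Sum of column information over an ungapped block of equal-length rows."""
    return sum(column_information(col) for col in zip(*rows))


def bipartite_information(seqs, starts, gaps, left_width, right_width):
    """Score a bipartite alignment: left block + right block.

    starts[i] is the left half-site start in seqs[i]; gaps[i] is the
    spacer length between the two half sites in that sequence.
    """
    left = [s[p:p + left_width] for s, p in zip(seqs, starts)]
    right = [
        s[p + left_width + g:p + left_width + g + right_width]
        for s, p, g in zip(seqs, starts, gaps)
    ]
    return block_information(left) + block_information(right)


# Toy data: half sites TGA and TCA separated by spacers of length 2 and 1.
seqs = ["CCTGAAATCAGG", "TGACTCAAAAAA"]
aligned = bipartite_information(seqs, [2, 0], [2, 1], 3, 3)
shifted = bipartite_information(seqs, [2, 1], [2, 1], 3, 3)
print(aligned, shifted)
```

With these toy sequences the correctly aligned half sites score higher than the shifted alignment, which is the signal a greedy search over start positions and gap lengths would follow.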


