Published online 10 March 2005
NestedMICA: sensitive inference of over-represented motifs in nucleic acid sequence
Wellcome Trust Sanger Institute, Hinxton, Cambridge CB10 1SA, UK
*To whom correspondence should be addressed. Tel: +44 1223 834244; Fax: +44 1223 494919; Email: td2{at}sanger.ac.uk
Received December 14, 2004. Revised January 24, 2005. Accepted February 15, 2005.
NestedMICA is a new, scalable pattern-discovery system for finding transcription factor binding sites and similar motifs in biological sequences. Like several previous methods, NestedMICA tackles this problem by optimizing a probabilistic mixture model to fit a set of sequences. However, the use of a newly developed inference strategy called Nested Sampling means that NestedMICA is able to find optimal solutions without the need for a problematic initialization or seeding step. We investigate the performance of NestedMICA in a range of scenarios, on synthetic data and on a well-characterized set of muscle regulatory regions, and compare it with the popular MEME program. We show that the new method is significantly more sensitive than MEME: in one case, it successfully extracted a target motif from background sequence four times longer than could be handled by the existing program. It also performs robustly on synthetic sequences containing multiple significant motifs. When tested on a real set of regulatory sequences, NestedMICA produced motifs that were good predictors for all five abundant classes of annotated binding sites.
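To make the phrase "probabilistic mixture model" concrete, the following is a minimal sketch of how a sequence can be scored under a motif-plus-background mixture. It is not NestedMICA's actual model and it does not implement Nested Sampling. The weight matrix values, the uniform background, the zero-or-one-occurrence assumption, and the names `window_log_odds` and `sequence_log_likelihood` are all illustrative assumptions.

```python
import math

# Hypothetical 4-column weight matrix resembling "TATA". Each column is a
# probability distribution over A, C, G, T. Values are illustrative only.
MOTIF = [
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
]

# Assumed uniform background; real methods typically fit a richer model.
BACKGROUND = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}


def window_log_odds(window, motif=MOTIF, background=BACKGROUND):
    """Natural-log odds of one window under the motif versus the background."""
    return sum(
        math.log(column[base] / background[base])
        for column, base in zip(motif, window)
    )


def sequence_log_likelihood(seq, motif=MOTIF, background=BACKGROUND, p_motif=0.5):
    """Log-likelihood of seq under a zero-or-one-occurrence mixture.

    With probability 1 - p_motif the whole sequence is background. With
    probability p_motif exactly one motif instance replaces the background
    at a start position chosen uniformly at random.
    """
    width = len(motif)
    n_starts = len(seq) - width + 1
    log_bg = sum(math.log(background[base]) for base in seq)
    if n_starts < 1:
        # Sequence is shorter than the motif: score it as pure background.
        return log_bg
    # Sum of likelihood ratios (motif / background) over all start positions.
    ratio_sum = sum(
        math.exp(window_log_odds(seq[i:i + width], motif, background))
        for i in range(n_starts)
    )
    return log_bg + math.log((1.0 - p_motif) + p_motif * ratio_sum / n_starts)


if __name__ == "__main__":
    for seq in ("GGTATAGG", "GGCGCGGG"):
        print(seq, sequence_log_likelihood(seq))
```

In a motif-discovery setting, the weight matrix columns would be the unknowns, and an inference procedure would search for values that give the input sequences a high likelihood. The sketch only evaluates that likelihood for a fixed, hand-written matrix.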