Nucleic Acids Research, 2003, Vol. 31, No. 13 3540-3545
© 2003 Oxford University Press
PromH: promoters identification using orthologous genomic sequences
Softberry Inc., 116 Radio Circle, Suite 400, Mount Kisco, NY 10549, USA 1 Institute of Botany, Azerbaijan National Academy of Sciences, 370073 Baku, Azerbaijan
*To whom correspondence should be addressed. Tel: +1 914 242 3592; Fax: +1 914 242 3593; Email: victor{at}softberry.com
Present address: I. A. Shahmuradov, Royal Holloway, University of London, Egham, Surrey TW20 0EX, UK
Accurate prediction of promoters is fundamental for understanding gene expression patterns, cell specificity and development. In studies of conserved features of regulatory regions of orthologous genes, it was observed that major functional components of promoters, such as transcription start points, TATA-boxes and regulatory motifs, are significantly more conserved than the sequences around them (70-100% compared with 30-50%). To improve promoter identification accuracy, we employed these findings in a new program, PromH, created by extending the feature set of the TSSW program. PromH uses linear discriminant functions that take into account conservation features and nucleotide sequences of promoter regions in pairs of orthologous genes. The program was tested on two sets of pairs of orthologous, mostly human and rodent, sequences with known transcription start sites (TSS), annotated to have TATA (21 genes, 11 orthologous pairs) and TATA-less (38 genes, 19 pairs) promoters, respectively. The program correctly predicted the TSS for all 21 genes of the first set, with a median deviation of 2 bp from the true site location. For only two genes was there a significant discrepancy (46 and 105 bp) between predicted and annotated TSS positions. For the 38 TATA-less promoters of the second set, a TSS was predicted for 27 genes: in 14 cases within 10 bp of the annotated TSS, and in 21 cases within 100 bp. Despite more discrepancies between predicted and annotated TSS for genes from the second set, these results are consistent with observations of a much higher occurrence of multiple TSS in TATA-less promoters. In any case, our results show that PromH identifies TSS positions significantly more accurately than any other published promoter prediction method. The PromH program is available at http://www.softberry.com/berry.phtml?topic=promh.
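The accuracy figures quoted in the abstract (median deviation, number of predictions within 10 bp and within 100 bp of the annotated TSS) can be computed from paired positions in a few lines. The Python sketch below is only an illustration of that bookkeeping, not PromH's own evaluation code: the function name `tss_deviation_summary`, the gene names and all coordinates are hypothetical, and treating "within 10 bp" as an absolute distance of at most 10 is an assumption.

```python
from statistics import median


def tss_deviation_summary(predicted, annotated, thresholds=(10, 100)):
    """Summarize distances between predicted and annotated TSS positions.

    predicted: dict mapping gene name -> predicted TSS position, or None
               when no TSS was predicted for that gene.
    annotated: dict mapping gene name -> annotated TSS position.
    thresholds: distances (bp) for the "within N bp" counts; the inclusive
                comparison (<=) is an assumption of this sketch.
    """
    # Absolute deviation for every gene that received a prediction.
    deviations = [
        abs(predicted[gene] - annotated[gene])
        for gene in annotated
        if predicted.get(gene) is not None
    ]
    summary = {
        "genes": len(annotated),
        "predicted": len(deviations),
        "median_deviation": median(deviations) if deviations else None,
    }
    for t in thresholds:
        summary[f"within_{t}bp"] = sum(d <= t for d in deviations)
    return summary


# Hypothetical positions, for illustration only (not data from the paper).
annotated = {"geneA": 1000, "geneB": 2500, "geneC": 400, "geneD": 7200}
predicted = {"geneA": 1002, "geneB": 2546, "geneC": None, "geneD": 7195}

summary = tss_deviation_summary(predicted, annotated)
print(summary)
```

Genes without a prediction are counted in `genes` but excluded from the median, which mirrors how the abstract reports the TATA-less set (38 genes, 27 with a predicted TSS).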



