Nucleic Acids Research, 2000, Vol. 28, No. 1 302-303
© 2000 Oxford University Press
The Eukaryotic Promoter Database (EPD)
Swiss Institute of Bioinformatics and Swiss Institute for Experimental Cancer Research, Ch. des Boveresses 155, 1066-Epalinges s/Lausanne, Switzerland
Received October 6, 1999; Accepted October 8, 1999.
| ABSTRACT |
|---|
The Eukaryotic Promoter Database (EPD) is an annotated non-redundant collection of eukaryotic POL II promoters for which the transcription start site has been determined experimentally. Access to promoter sequences is provided by pointers to positions in nucleotide sequence entries. The annotation part of an entry includes a description of the initiation site mapping data, exhaustive cross-references to the EMBL nucleotide sequence database, SWISS-PROT, TRANSFAC and other databases, as well as bibliographic references. EPD is structured in a way that facilitates dynamic extraction of biologically meaningful promoter subsets for comparative sequence analysis. WWW-based interfaces have been developed that enable the user to view EPD entries in different formats, to select and extract promoter sequences according to a variety of criteria, and to navigate to related databases exploiting different cross-references. The EPD web site also features yearly updated base frequency matrices for major eukaryotic promoter elements. EPD can be accessed at http://www.epd.isb-sib.ch
| DATABASE DESCRIPTION |
|---|
The term promoter has two different meanings in biology: (i) a gene region immediately upstream of a transcription initiation site, and (ii) a cis-acting genetic element controlling the rate of transcription initiation of a gene. The Eukaryotic Promoter Database (EPD) is a database of promoters in the former sense. Information about promoters in the latter sense can be found in other databases such as TRANSFAC (1), ooTFD (2), TRRD (3), PlantCARE (4) and PLACE (5).
EPD was originally designed as a resource for comparative sequence analysis and, as such, has played an instrumental role in the characterization of eukaryotic transcription control elements (6,7), as well as in the development of eukaryotic promoter prediction algorithms (8). The main purpose of the database is to keep track of experimental data that define transcription initiation sites of eukaryotic genes. This type of functional information is linked to promoter sequences via machine-readable pointers to positions within sequences of the EMBL nucleotide sequence database (9).
EPD is a rigorously selected, curated and quality-controlled database. In order to be included, a promoter must fulfill a number of conditions laid down in the user manual. Most importantly, the transcription start site must be mapped experimentally with an estimated precision of ±5 bp or better. All information in EPD originates from a critical examination and independent interpretation of the experimental data presented in the cited research publications. Published conclusions and feature table annotations in EMBL entries are never blindly relied upon. At present, EPD is confined to promoters recognized by the RNA POL II system of higher eukaryotes (multicellular plants and animals). Note that this restriction does not a priori exclude viral promoters.
EPD is also a strictly non-redundant database. The general rule is that one entry corresponds to one transcription initiation site in a genome. Organisms are distinguished at the taxonomic level of the species. According to this policy, data from different literature sources pertaining to the same transcription initiation site are represented by the same entry. Likewise, promoters belonging to different alleles of the same gene, or to the same gene in different subspecies, are covered by the same entry regardless of whether they differ in sequence. The user manual provides more details about how certain non-trivial cases, such as promoters of tandemly repeated genes or retrotransposable elements, are handled.
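The non-redundancy rule can be illustrated with a minimal sketch. It assumes that each literature report carries an organism name, a curator-assigned site label and a reference; the field names `organism`, `site` and `reference`, the helper names and the sample records are all hypothetical and are not taken from the EPD format. Collapsing a subspecies name to its first two words is a simplification used here only to mimic the species-level policy.

```python
from collections import defaultdict

def species_of(organism):
    # Collapse a subspecies name to the binomial species name (first two words).
    # This is a simplification for illustration, not a taxonomic rule.
    return " ".join(organism.split()[:2])

def merge_reports(reports):
    """Group mapping reports so that each (species, site) pair yields one entry."""
    entries = defaultdict(list)
    for report in reports:
        key = (species_of(report["organism"]), report["site"])
        entries[key].append(report["reference"])
    return dict(entries)

# Hypothetical reports: two sources for the same site in one species,
# plus the equivalent site in a different species.
reports = [
    {"organism": "Mus musculus", "site": "geneX_P1", "reference": "ref A"},
    {"organism": "Mus musculus domesticus", "site": "geneX_P1", "reference": "ref B"},
    {"organism": "Rattus norvegicus", "site": "geneX_P1", "reference": "ref C"},
]
merged = merge_reports(reports)
```

In this sketch the two mouse reports end up in one entry with two references, while the rat report forms its own entry.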
A comprehensive description of the contents and format of EPD has been published earlier (10). User interfaces and software support for local installations have been previously described (11).
| RECENT DEVELOPMENTS |
|---|
Database
The objective of exhaustive cross-referencing between EPD promoters and EMBL sequences is being given high priority at the moment, especially with regard to genomes that are complete (Caenorhabditis elegans) or at an advanced stage of sequencing (Arabidopsis, Drosophila, human). As a consequence, the number of EMBL cross-references has increased by >1000 since last year (Table 1). Moreover, the internal EPD cross-references have been revised. Until now, such links were only used to connect alternative promoters of the same gene. In future releases, promoters of different genes occurring at a short distance from each other (<1000 bp) will also be cross-referenced. Such pairs of promoters usually promote transcription in opposite directions and often share upstream regulatory elements. As a new format feature, a keyword (KW) line type has been introduced and has so far been populated with keywords imported from SWISS-PROT (12). This feature is intended to enhance the query capabilities of various access tools. Additional keywords relating to properties of the promoter rather than to properties of the corresponding gene product will be added in the near future.
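The planned cross-referencing of neighbouring promoters could be sketched as follows. This is only an illustration under stated assumptions: the `Promoter` record, its field names and the sample data are hypothetical, and both promoters of a pair are assumed to be positioned on the same sequence entry. The 1000 bp threshold is the one given in the text.

```python
from typing import NamedTuple

class Promoter(NamedTuple):
    entry_id: str   # hypothetical entry identifier
    gene: str       # gene the promoter belongs to
    seq_acc: str    # accession of the sequence entry carrying the site
    tss: int        # position of the transcription start site
    strand: str     # "+" or "-"

def neighbouring_pairs(promoters, max_dist=1000):
    """Return (id_a, id_b, divergent) for promoters of different genes
    whose start sites lie less than max_dist bp apart on one sequence."""
    by_seq = {}
    for p in promoters:
        by_seq.setdefault(p.seq_acc, []).append(p)
    pairs = []
    for group in by_seq.values():
        group.sort(key=lambda p: p.tss)
        for i, a in enumerate(group):
            for b in group[i + 1:]:
                if b.tss - a.tss >= max_dist:
                    break  # sorted by position, so later sites are farther away
                if a.gene != b.gene:
                    pairs.append((a.entry_id, b.entry_id, a.strand != b.strand))
    return pairs

# Hypothetical data: P1 and P3 are alternative promoters of the same gene.
promoters = [
    Promoter("P1", "geneA", "S1", 100, "+"),
    Promoter("P2", "geneB", "S1", 600, "-"),
    Promoter("P3", "geneA", "S1", 150, "+"),
    Promoter("P4", "geneC", "S1", 5000, "+"),
    Promoter("P5", "geneD", "S2", 300, "+"),
]
pairs = neighbouring_pairs(promoters)
```

Alternative promoters of the same gene are skipped here because, according to the text, they are already connected by the existing internal cross-references.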
Documentation
The user manual has been extensively revised. Bibliographic references have been added to the section explaining the representation of transcript mapping data. Some of them are accompanied by direct hyperlinks to figures in online journals exemplifying a particular technique. Several additional documents have recently been made available over the web. One contains a list of all homology groups defined in EPD. Such groups consist of homologous promoters exhibiting significant sequence similarity in the -79 to +20 region among themselves. Another document presents the hierarchical promoter classification system of EPD.
Promoter element descriptions
Weight matrix descriptions of four major eukaryotic promoter elements (TATA-box, initiator, GC-box and CCAAT-box) have previously been derived from EPD release 17 (7). We have now decided to make updated versions of such matrices available on a yearly basis from the EPD web pages. The latest versions were produced from EPD release 60 using a Baum-Welch hidden Markov model training algorithm (program buildmodel of SAM release 1.3.3, Hughey & Krogh 1998, http://www.cse.ucsc.edu/research/compbio/sam.html).
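To make the notion of a base frequency matrix concrete, the following sketch derives one by simple column counting from pre-aligned sites and converts it to log-odds weights. This is not the Baum-Welch training performed with SAM that is described above; it is a much simpler stand-in. The function names, the uniform background of 0.25, the pseudocount value and the example sites are all assumptions made for illustration, and the sites are not taken from EPD.

```python
import math

BASES = "ACGT"

def base_frequency_matrix(sites):
    """Column-wise base frequencies for equal-length, pre-aligned sites."""
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for i in range(length):
        column = [s[i] for s in sites]
        matrix.append({b: column.count(b) / len(sites) for b in BASES})
    return matrix

def log_odds_matrix(freqs, background=0.25, pseudo=0.01):
    """Convert frequencies to log2 odds against a uniform background.
    The pseudocount keeps unobserved bases from producing log(0)."""
    weights = []
    for column in freqs:
        weights.append({
            b: math.log2(((column[b] + pseudo) / (1 + 4 * pseudo)) / background)
            for b in BASES
        })
    return weights

def score(weights, sequence):
    """Sum of position-specific weights for a sequence of matching length."""
    return sum(col[base] for col, base in zip(weights, sequence))

# Illustrative aligned sites (hypothetical, TATA-like).
sites = ["TATAAA", "TATAAA", "TATATA", "TATAAA"]
freqs = base_frequency_matrix(sites)
weights = log_odds_matrix(freqs)
```

A sequence resembling the aligned sites receives a higher `score` than an unrelated one, which is the property that makes such matrices useful for locating candidate elements.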
| ACCESS |
|---|
FTP
The following files are available from ftp.epd.isb-sib.ch/pub/databases/epd
Flat-files containing the EPD database in the new and in the old format.
EPD user manual.
Sequence libraries in EMBL and FASTA format containing promoter sequences from -499 to +100 relative to the transcription start site.
A slightly reduced version of EPD in ASN.1 format designed for import into the GenBank-Entrez data environment (13), including a formal data description in ASN.1.
Icarus scripts for indexing EPD by SRS (14).
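The sequence libraries listed above cover a fixed window around the transcription start site. The sketch below shows how such a window might be cut from a longer sequence given a start site position and strand. It rests on assumptions that are not stated in the text: positions are 1-based, the start site itself is numbered +1, and there is no position 0. The function names are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def reverse_complement(seq):
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]

def promoter_window(sequence, tss, strand="+", upstream=499, downstream=100):
    """Return the region from -upstream to +downstream around the start site.

    Assumptions: tss is a 1-based position in sequence, the start site is +1
    and there is no position 0, so the window holds upstream + downstream bases.
    """
    if strand == "+":
        start = tss - upstream          # 1-based position of -upstream
        end = tss + downstream - 1      # 1-based position of +downstream
    elif strand == "-":
        start = tss - downstream + 1    # +downstream lies at the lower coordinate
        end = tss + upstream            # -upstream lies at the higher coordinate
    else:
        raise ValueError("strand must be '+' or '-'")
    if start < 1 or end > len(sequence):
        raise ValueError("window extends beyond the sequence")
    window = sequence[start - 1:end]
    return window if strand == "+" else reverse_complement(window)

# Small worked example with a short window of -3 to +2.
example = "AAACCCGGGTTT"
plus = promoter_window(example, tss=7, strand="+", upstream=3, downstream=2)
minus = promoter_window(example, tss=6, strand="-", upstream=3, downstream=2)
```

If the database instead counts the start site as position 0, the window arithmetic shifts by one base, so the convention should be checked against the user manual before relying on exact coordinates.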
WWW
The following services are offered at http://www.epd.isb-sib.ch
Access to EPD entries by ID or accession number. The following formats are available: text only, HTML and HTML combined with a graphic representation of sequence objects by a Java applet (15).
A page for downloading promoter sequence subsets defined in EPD.
Access to EPD entries and corresponding promoter sequences via a query form.
Access to EPD via SRS is provided by the Swiss EMBNet node at http://www.ch.embnet.org/
| SUPPLEMENTARY MATERIAL |
|---|
Relevant URL links are available at NAR Online.
| ACKNOWLEDGEMENT |
|---|
EPD is funded in part by grant 31-54782.98 from the Swiss National Science Foundation.
| FOOTNOTES |
|---|
* To whom correspondence should be addressed. Tel: +41 21 692 5892; Fax: +41 21 692 5945; Email: philipp.bucher@isrec.unil.ch
| REFERENCES |
|---|
1 Heinemeyer,T., Chen,X., Karas,H., Kel,A.E., Kel,O.V., Liebich,I., Meinhardt,T., Reuter,I., Schacherer,F. and Wingender,E. (1999) Nucleic Acids Res., 27, 318-322. Updated article in this issue: Nucleic Acids Res. (2000), 28, 316-319.
2 Ghosh,D. (1999) Nucleic Acids Res., 27, 315-317. Updated article in this issue: Nucleic Acids Res. (2000), 28, 308-310.
3 Kolchanov,N.A., Ananko,E.A., Podkolodnaya,O.A., Ignatieva,E.V., Stepanenko,I.L., Kel-Margoulis,O.V., Kel,A.E., Merkulova,T.I., Goryachkovskaya,T.N., Busygina,T.V., Kolpakov,F.A., Podkolodny,N.L., Naumochkin,A.N. and Romashchenko,A.G. (1999) Nucleic Acids Res., 27, 303-306. Updated article in this issue: Nucleic Acids Res. (2000), 28, 298-301.
4 Rombauts,S., Dehais,P., Van Montagu,M. and Rouze,P. (1999) Nucleic Acids Res., 27, 295-296.
5 Higo,K., Ugawa,Y., Iwamoto,M. and Korenaga,T. (1999) Nucleic Acids Res., 27, 297-300.
6 Bucher,P. and Trifonov,E.N. (1986) Nucleic Acids Res., 22, 10009-10026.
7 Bucher,P. (1990) J. Mol. Biol., 212, 563-578.
8 Fickett,J.W. and Hatzigeorgiou,A.G. (1997) Genome Res., 7, 861-878.
9 Stoesser,G., Tuli,M.A., Lopez,R. and Sterk,P. (1999) Nucleic Acids Res., 27, 18-24. Updated article in this issue: Nucleic Acids Res. (2000), 28, 19-23.
10 Cavin Périer,R., Junier,T. and Bucher,P. (1998) Nucleic Acids Res., 26, 353-357.
11 Cavin Périer,R., Junier,T., Bonnard,C. and Bucher,P. (1999) Nucleic Acids Res., 27, 307-309.
12 Bairoch,A. and Apweiler,R. (1999) Nucleic Acids Res., 27, 49-54. Updated article in this issue: Nucleic Acids Res. (2000), 28, 45-48.
13 Benson,D.A., Boguski,M., Lipman,D.J. and Ostell,J. (1994) Nucleic Acids Res., 22, 3441-3444.
14 Etzold,T., Ulyanov,A. and Argos,P. (1996) Methods Enzymol., 266, 114-128.
15 Junier,T. and Bucher,P. (1998) In Silico Biol., 1, 13-20.
16 The Flybase Consortium (1999) Nucleic Acids Res., 27, 85-88.
17 Pearson,P., Francomano,C., Foster,P., Bocchini,C., Li,P. and McKusick,V. (1994) Nucleic Acids Res., 22, 3470-3473.
18 Blake,J.A., Richardson,J.E., Davisson,M.T. and Eppig,J.T. (1999) Nucleic Acids Res., 27, 95-98. Updated article in this issue: Nucleic Acids Res. (2000), 28, 108-111.