| Nucleic Acids Research | Pages |
The Eukaryotic Promoter Database (EPD): recent developments
Database Description
Recent Developments
Cross-references to other databases
Improvement of the access procedures
SRS support for EPD
Access
Acknowledgement
References
ABSTRACT
DATABASE DESCRIPTION
EPD is a database of gene function which keeps track of experimental evidence defining the initiation sites of eukaryotic RNA POL II genes. This information is linked to promoter sequence data via machine-readable pointers to the corresponding positions in entries of the EMBL nucleotide sequence database (1). Note that EPD does not provide information on promoters in the sense of genetically defined transcription regulatory elements. Such information can be found in TRANSFAC and COMPEL (2) and in other databases described in this issue.
EPD was originally designed as a resource for comparative sequence analysis and as such has played an instrumental role in the development of eukaryotic promoter prediction algorithms (3). Recently, its scope has been expanded to meet the requirements of the TRADAT project (see http://www.itba.mi.cnr.it/tradat/ ), a European consortium effort to develop integrated tools for the interpretation of genomic DNA sequences with emphasis on regulatory regions. The extensions comprise many cross-references to data collections covering other aspects of genes and promoters.
EPD is a rigorously selected, curated, and quality-controlled database. In order to be included, a promoter must fulfill a number of conditions laid down in the user manual. For instance, its transcription start site must be mapped with a certain accuracy and certainty, the corresponding gene must be functional, and corresponding sequence data must be available in the public databases. EPD is further confined to promoters recognized by the RNA POL II systems of higher eukaryotes, excluding fungi, algae and protists. However, since promoters are viewed as physiological elements dependent on the correct interpretation by a trans-acting environment, many viral promoters are included and classified with their host species.
A strict non-redundancy policy is applied based on the principle that one entry should correspond to one biological entity. Data from different literature sources pertaining to the same transcription initiation sites are thus always combined in a single entry. Orthologous promoters from different species are linked by internal cross-references, as are alternative promoters of the same gene. All information in EPD originates from a critical examination and independent interpretation of the experimental data presented in the cited research publications. Published conclusions and feature table annotations in EMBL sequence entries are never blindly relied upon.
A more detailed description of the contents and format of EPD was published in last year's database issue (4).
RECENT DEVELOPMENTS
Cross-references to other databases
To facilitate the development of integrated tools within the TRADAT consortium, a concerted effort has been made to add many new cross-references to related data collections (Table 1). This added value was made possible by the major format revision accomplished last year. In a break with the former tradition that one promoter entry refers to a single EMBL sequence, an initiative was started to systematically cross-reference all EPD entries to all corresponding EMBL entries. The decision to change the former policy was taken when it was realized that many potential links between EPD and TRANSFAC were being missed because the two databases referred to different EMBL entries containing the same promoter sequence. With one database comprehensively linked to EMBL, such omissions should no longer occur.
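The effect of the policy change can be illustrated with a minimal Python sketch. It pairs entries of two databases whenever they cite a common EMBL accession. The function name `link_via_embl`, the record layout, and all entry identifiers and accession numbers are invented for illustration; they are not taken from EPD or TRANSFAC, and this is not a description of how the actual links were generated.

```python
from collections import defaultdict


def link_via_embl(epd_xrefs, other_xrefs):
    """Pair entries of two databases that cite at least one common EMBL accession.

    Both arguments map an entry ID to the set of EMBL accessions it cites.
    Returns a sorted list of (epd_id, other_id) pairs.
    """
    # Invert the second mapping: EMBL accession -> entries citing it.
    by_accession = defaultdict(set)
    for other_id, accessions in other_xrefs.items():
        for acc in accessions:
            by_accession[acc].add(other_id)

    links = set()
    for epd_id, accessions in epd_xrefs.items():
        for acc in accessions:
            for other_id in by_accession.get(acc, ()):
                links.add((epd_id, other_id))
    return sorted(links)


# Hypothetical data: the same promoter sequence occurs in two EMBL entries.
site_db = {"SITE_0001": {"X00002"}}            # the other database cites X00002 only
epd_single = {"EP_0001": {"X00001"}}           # former policy: one EMBL entry per promoter
epd_full = {"EP_0001": {"X00001", "X00002"}}   # new policy: all corresponding entries

missed = link_via_embl(epd_single, site_db)    # no shared accession, link is missed
found = link_via_embl(epd_full, site_db)       # shared accession, link is recovered
```

With the single-entry mapping the two records share no accession and the link is lost; with the comprehensive mapping the pair is recovered without any change on the side of the second database.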
Table 1.
| Database | Number of links |
| --- | --- |
| EPD internal | 162 |
| EMBL (1) | 1849 |
| TRANSFAC (2) | 1157 |
| SWISS-PROT (7) | 929 |
| FLYBASE (8) | 127 |
| MIM (9) | 222 |
| MGD (10) | 44 |
| MEDLINE | 2126 |
Improvement of the access procedures
Figure 1. EPD sequence download form. The user can select promoter subsets from various species and higher-order taxonomic groups. If the `Representative set' option at the bottom is activated, a maximal set of representative sequences containing no pair with more than 50% sequence identity is retrieved.
Figure 2. EPD query form and entry access procedures. (a) Query form. The design was inspired by the SRS query form at EBI. In the example shown, the user tries to find EPD promoter entries corresponding to human oncogenes. (b) Query results. The lower part shows the beginning of the list of entries satisfying the specified criteria, from which the user can select those of interest. The upper part offers several alternative output options. The selected entries can either be viewed in the browser, or the corresponding sequences can be downloaded to a local file. Here, the user attempts to download promoter sequences from the human c-fos and c-myc oncogenes extending from positions -99 to 20 relative to the transcription start site. (c) Contents of the sequence file downloaded by the previous operation. Note that the sequences are in FASTA format as requested by the user.
The WWW-based user interfaces have been improved in several ways. The sequence download page, which can be used, for instance, to retrieve training sets for promoter prediction algorithms, is shown in Figure
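The `Representative set' option mentioned in the caption of Figure 1 can be pictured with the following Python sketch. It is only one possible reading: the text does not say how sequence identity is computed or how the set is chosen, so the ungapped position-by-position identity, the greedy first-come selection, and the names `identity` and `representative_set` are all assumptions made for illustration. The example sequences are invented.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length sequences.

    Assumption: sequences are compared without gaps, position by position.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        return 0.0
    matches = sum(x == y for x, y in zip(a, b))
    return matches / len(a)


def representative_set(seqs, max_identity=0.5):
    """Greedily keep sequences whose identity to every kept one is <= max_identity.

    `seqs` is a list of (name, sequence) pairs. The result is maximal in the
    sense that no rejected sequence could be added without violating the
    threshold, but it depends on the input order.
    """
    kept = []
    for name, seq in seqs:
        if all(identity(seq, other) <= max_identity for _, other in kept):
            kept.append((name, seq))
    return kept


# Invented toy sequences: p2 is 90% identical to p1, p3 is 20% identical to p1.
toy = [("p1", "ACGTACGTAC"), ("p2", "ACGTACGTAA"), ("p3", "TTTTTTTTTT")]
names = [name for name, _ in representative_set(toy)]
```

A greedy pass of this kind yields a set to which nothing more can be added, which is not necessarily the largest possible such set.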
SRS support for EPD
The existing Icarus configuration files used by SRS version 5 (5) were extensively modified in order to exploit the features of the new EPD format introduced last year. The new versions of these scripts are available from the ftp address given below (files epd.i, epd.is and epd.it). The SRS indexing system takes advantage of many of the recently introduced new fields, especially the cross-references to other databases.
ACCESS
EPD is distributed and maintained as a single ASCII flat file which can be obtained via anonymous ftp from ftp.epd.isb-sib.ch/pub/databases/epd. The following additional files are available:
(i) Sequence-containing views in EMBL and FASTA format. These files contain promoter sequences in a range from -499 to +100 relative to the transcription start site, plus excerpts from the promoter annotation.
(ii) A slightly reduced version of EPD in ASN.1 format designed for import into the GenBank-Entrez data environment (6).
(iii) Documentation files including the EPD user manual and a formal data description of the ASN.1 version.
(iv) Icarus scripts for indexing EPD by SRS.
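As an illustration of how the FASTA views listed above might be processed locally, the following Python sketch parses FASTA text and cuts out a sub-range given in coordinates relative to the transcription start site. The header line and sequence are invented, and the functions `parse_fasta` and `subrange` are hypothetical helpers. The sketch assumes that positions are numbered consecutively, so that a sequence running from -499 to +100 would hold 600 bases; the EPD user manual should be consulted for the actual numbering convention.

```python
def parse_fasta(text):
    """Parse FASTA text into a dict mapping header (without '>') to sequence."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {h: "".join(parts) for h, parts in records.items()}


def subrange(seq, first_pos, start, end):
    """Return the bases from position `start` to `end`, both inclusive.

    `first_pos` is the position of the first base of `seq` relative to the
    transcription start site. Assumption: positions are consecutive integers.
    """
    if start < first_pos or end < start or end - first_pos >= len(seq):
        raise ValueError("requested range lies outside the sequence")
    return seq[start - first_pos:end - first_pos + 1]


# Invented example: a 12-base sequence whose first base is at position -5.
text = ">demo_promoter hypothetical header\nACGTAC\nGTTTGA\n"
records = parse_fasta(text)
window = subrange(records["demo_promoter hypothetical header"], -5, -2, 1)
```

The same `subrange` call with `first_pos=-499` would select, for example, the segment from -99 to 20 from one of the distributed sequences, under the numbering assumption stated above.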
The URL for online access to EPD is: http://www.epd.isb-sib.ch . This site offers the following services:
(i) Access to EPD entries by ID or accession number. The following formats are offered: text only, HTML, and HTML combined with a graphic representation of sequence objects by a Java applet (Junier & Bucher 1998, http://www.bioinfo.de/isb/1998/01/0003/ ).
(ii) A page for downloading promoter sequence subsets defined in EPD (for instance all human promoters from 100 bases upstream to 100 bases downstream of the initiation site).
(iii) Access to EPD entries or corresponding promoter sequences via a query form allowing for field-restricted character string searches.
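The idea of a field-restricted character string search, as offered by the query form, can be sketched in Python as follows. The entry layout, the field names `species` and `keywords`, the identifiers, and the function `search` are hypothetical; the sketch does not reproduce the EPD format or the query engine behind the web form.

```python
def search(entries, **criteria):
    """Return the entries in which every named field contains the given string.

    Matching is a case-insensitive substring test, restricted to the field
    named by each keyword argument. Entries lacking a field do not match it.
    """
    hits = []
    for entry in entries:
        if all(text.lower() in entry.get(field, "").lower()
               for field, text in criteria.items()):
            hits.append(entry)
    return hits


# Invented entries, loosely modelled on the query shown in Figure 2.
entries = [
    {"id": "DEMO_1", "species": "Homo sapiens", "keywords": "oncogene; nuclear protein"},
    {"id": "DEMO_2", "species": "Homo sapiens", "keywords": "Oncogene; DNA-binding"},
    {"id": "DEMO_3", "species": "Mus musculus", "keywords": "oncogene"},
    {"id": "DEMO_4", "species": "Homo sapiens", "keywords": "histone"},
]
ids = [e["id"] for e in search(entries, species="homo sapiens", keywords="oncogene")]
```

Restricting each search string to one field avoids false hits, such as a species name that happens to occur in a free-text description.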
SRS access to EPD is available at the Swiss EMBNet node: http://www.ch.embnet.org/
ACKNOWLEDGEMENT
EPD is funded in part by grant 95.0236-1 from the Swiss Federal Office for Education and Research.
REFERENCES
This page is run by Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, as part of the OUP Journals
Last modification: 9 Dec 1998
Copyright©Oxford University Press, 1998.