Nucleic Acids Research Pages 307-309  


The Eukaryotic Promoter Database (EPD): recent developments

Rouaïda Cavin Périer, Thomas Junier, Claude Bonnard and Philipp Bucher*

Swiss Institute of Bioinformatics & Swiss Institute for Experimental Cancer Research, Ch. des Boveresses 155, 1066-Epalinges s/Lausanne, Switzerland

Received October 8, 1998; Revised October 13, 1998; Accepted October 22, 1998

ABSTRACT

The Eukaryotic Promoter Database (EPD) is an annotated, non-redundant collection of eukaryotic POL II promoters for which the transcription start site has been determined experimentally. Access to promoter sequences is provided by pointers to positions in nucleotide sequence entries. The annotation part of an entry includes a description of the initiation site mapping data, cross-references to other databases, and bibliographic references. EPD is structured in a way that facilitates dynamic extraction of biologically meaningful promoter subsets for comparative sequence analysis. Recent efforts have focused on exhaustive cross-referencing to the EMBL nucleotide sequence database, and on the improvement of the WWW-based user interfaces and data retrieval mechanisms. EPD can be accessed at http://www.epd.isb-sib.ch

DATABASE DESCRIPTION

EPD is a database of gene function which keeps track of experimental evidence defining the initiation sites of eukaryotic RNA POL II genes. This information is linked to promoter sequence data via machine-readable pointers to the corresponding positions in entries of the EMBL nucleotide sequence database (1). Note that EPD does not provide information on promoters in the sense of genetically defined transcription regulatory elements. Such information can be found in TRANSFAC and COMPEL (2) and in other databases described in this issue.

EPD was originally designed as a resource for comparative sequence analysis and as such has played an instrumental role in the development of eukaryotic promoter prediction algorithms (3). Recently, its scope has been expanded to meet the requirements of the TRADAT project (see http://www.itba.mi.cnr.it/tradat/ ), a European consortium effort to develop integrated tools for the interpretation of genomic DNA sequences with emphasis on regulatory regions. The extensions comprise many cross-references to data collections covering other aspects of genes and promoters.

EPD is a rigorously selected, curated, and quality-controlled database. In order to be included, a promoter must fulfill a number of conditions laid down in the user manual. For instance, its transcription start site must be mapped with a certain accuracy and certainty, the corresponding gene must be functional, and corresponding sequence data must be available in the public databases. EPD is further confined to promoters recognized by the RNA POL II systems of higher eukaryotes, excluding fungi, algae and protists. However, since promoters are viewed as physiological elements dependent on the correct interpretation by a trans-acting environment, many viral promoters are included and classified with their host species.

A strict non-redundancy policy is applied based on the principle that one entry should correspond to one biological entity. Data from different literature sources pertaining to the same transcription initiation sites are thus always combined in a single entry. Orthologous promoters from different species are linked by internal cross-references, as are alternative promoters of the same gene. All information in EPD originates from a critical examination and independent interpretation of the experimental data presented in the cited research publications. Published conclusions and feature table annotations in EMBL sequence entries are never blindly relied upon.

A more detailed description of the contents and format of EPD was published in last year's database issue (4).

RECENT DEVELOPMENTS

Cross-references to other databases

In order to facilitate the development of integrated tools within the TRADAT consortium, a concentrated effort has been made to add many new cross-references to related data collections (Table 1). This value-addition was made possible by the major format revision accomplished last year. Breaking with the former tradition that one promoter entry refers to a single EMBL sequence, an initiative was started to systematically cross-reference all EPD entries to all corresponding EMBL entries. The decision to change the former policy was taken when it was realized that many potential links between EPD and TRANSFAC were missed because the two databases referred to different EMBL entries containing the same promoter sequence. With one database comprehensively linked to EMBL, such omissions should no longer occur.

Table 1. Database cross-references in EPD

Database          Number of links
EPD internal       162
EMBL (1)          1849
TRANSFAC (2)      1157
SWISS-PROT (7)     929
FLYBASE (8)        127
MIM (9)            222
MGD (10)            44
MEDLINE           2126

Improvement of the access procedures


Figure 1. EPD sequence download form. The user can select promoter subsets from various species and higher order taxonomic groups. If the `Representative set' option at the bottom is activated, a maximal set of representative sequences including no pair with more than 50% sequence identity will be retrieved.
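The selection of a representative set can be illustrated with a minimal sketch. It is not the procedure used by EPD, which is not described here. It assumes equal-length, ungapped promoter windows, measures identity as the fraction of matching positions, and uses a greedy pass that does not guarantee a maximal set. The function names `identity` and `representative_set` are hypothetical.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length sequences.

    Assumption: ungapped, position-by-position comparison. A real pipeline
    would more likely derive identity from a pairwise alignment.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def representative_set(seqs, max_identity=0.5):
    """Greedily keep sequences so that no kept pair exceeds max_identity.

    seqs is a list of (name, sequence) tuples. The result depends on input
    order and is not guaranteed to be the largest possible set.
    """
    kept = []
    for name, seq in seqs:
        if all(identity(seq, other) <= max_identity for _, other in kept):
            kept.append((name, seq))
    return kept


if __name__ == "__main__":
    demo = [("a", "AAAAAAAA"), ("b", "AAAAAACC"), ("c", "CCCCGGGG")]
    for name, seq in representative_set(demo):
        print(name, seq)
```

In this toy input, the second sequence shares 6 of 8 positions with the first and is therefore dropped, while the third is kept.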


Figure 2. EPD query form and entry access procedures. (a) Query form. The design was inspired by the SRS query form at EBI. In the example shown, the user tries to find EPD promoter entries corresponding to human oncogenes. (b) Query results. The lower part shows the beginning of the list of entries satisfying the specified criteria, from which the user can select those of interest. The upper part offers several alternative output options. The selected entries can either be viewed in the browser, or the corresponding sequences can be downloaded to a local file. Here, the user attempts to download promoter sequences from the human c-fos and c-myc oncogenes extending from positions -99 to 20 relative to the transcription start site. (c) Contents of the sequence file downloaded by the previous operation. Note that the sequences are in FASTA format as requested by the user.

The WWW-based user interfaces have been improved in several ways. The sequence download page, which can be used, for instance, for retrieving training sets for promoter prediction algorithms, is shown in Figure 1. Note that the sequence range relative to the transcription initiation site is fully flexible because the sequences are extracted from the corresponding EMBL entries on the fly. The revised EPD query form (Fig. 2) now supports string searches with wildcard characters. The following options have been implemented for field-restricted searches: All Text, ID, AccNumber, Description, Organism, Authors, Title, Citation, Homology group number, FLYBASE number, MIM number, TRANSFAC ID, SWISS-PROT ID, EMBL ID. Furthermore, the query results can now be used for retrieving promoter sequences using the same mechanism as the sequence download page.
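On-the-fly extraction of a promoter window from a pointer into a longer sequence can be sketched as follows. This is an illustration under stated assumptions, not EPD's implementation: the pointer is taken to be a 0-based index plus a strand, offset 0 is taken to be the initiation site itself (the actual numbering convention should be checked against the EPD user manual), and the names `promoter_window`, `reverse_complement` and `to_fasta` are hypothetical.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def promoter_window(seq, tss_index, strand, start, end):
    """Extract the window [start, end] (inclusive) relative to the TSS.

    Assumptions: tss_index is the 0-based index of the initiation site in
    seq, offset 0 denotes the initiation site itself, and negative offsets
    are upstream on the transcribed strand.
    """
    if start > end:
        raise ValueError("start must not exceed end")
    if strand == "+":
        lo, hi = tss_index + start, tss_index + end
    elif strand == "-":
        # On the minus strand, upstream corresponds to higher indices.
        lo, hi = tss_index - end, tss_index - start
    else:
        raise ValueError("strand must be '+' or '-'")
    if lo < 0 or hi >= len(seq):
        raise ValueError("window extends beyond the sequence entry")
    window = seq[lo:hi + 1]
    return window if strand == "+" else reverse_complement(window)


def to_fasta(name, seq, width=60):
    """Format one sequence as a FASTA record with fixed-width lines."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + name + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    entry = "AACCGGTTAC"  # toy stand-in for a nucleotide sequence entry
    print(to_fasta("demo_plus", promoter_window(entry, 4, "+", -2, 1)), end="")
    print(to_fasta("demo_minus", promoter_window(entry, 4, "-", -2, 1)), end="")
```

Because only the pointer and the strand are stored, any range can be requested without keeping precomputed sequence windows, which is the property the text attributes to on-the-fly extraction.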

SRS support for EPD

The existing Icarus configuration files used by SRS version 5 (5) were extensively modified in order to exploit the features of the new EPD format introduced last year. The new versions of these scripts are available from the ftp address given below (files epd.i, epd.is and epd.it). The SRS indexing system takes advantage of many of the newly introduced fields, especially the cross-references to other databases.

ACCESS

EPD is distributed and maintained as a single ASCII flat file which can be obtained via anonymous ftp from ftp.epd.isb-sib.ch/pub/databases/epd. The following additional files are available:

(i) Sequence-containing views in EMBL and FASTA format. These files contain promoter sequences in a range from -499 to +100 relative to the transcription start site plus excerpts from the promoter annotation.

(ii) A slightly reduced version of EPD in ASN.1 format designed for import into the GenBank-Entrez data environment (6).

(iii) Documentation files including the EPD user manual and a formal data description of the ASN.1 version.

(iv) Icarus scripts for indexing EPD by SRS.

The URL for online access to EPD is: http://www.epd.isb-sib.ch . This site offers the following services:

(i) Access to EPD entries by ID or accession number. The following formats are offered: text only, HTML, and HTML combined with a graphic representation of sequence objects by a Java applet (Junier & Bucher 1998, http://www.bioinfo.de/isb/1998/01/0003/ ).

(ii) A page for downloading promoter sequence subsets defined in EPD (for instance all human promoters from 100 bases upstream to 100 bases downstream of the initiation site).

(iii) Access to EPD entries or corresponding promoter sequences via a query form allowing for field-restricted character string searches.
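A field-restricted wildcard search of the kind offered by the query form can be sketched as below. This is a hedged illustration only: the record layout, the example entries and the function name `search_entries` are hypothetical, and shell-style wildcards from Python's `fnmatch` module stand in for whatever pattern syntax the EPD server actually uses.

```python
from fnmatch import fnmatchcase


def search_entries(entries, pattern, field=None):
    """Return entries whose field (or any field) matches a wildcard pattern.

    entries is a list of dicts mapping field names to strings. If field is
    None, every field is searched, mimicking an 'All Text' option. Matching
    is case-insensitive; '*' matches any run of characters, '?' exactly one.
    """
    pat = pattern.lower()
    hits = []
    for entry in entries:
        values = [entry.get(field, "")] if field else list(entry.values())
        if any(fnmatchcase(str(v).lower(), pat) for v in values):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    # Hypothetical records; IDs and descriptions are made up for the demo.
    records = [
        {"ID": "DEMO_FOS", "Organism": "Homo sapiens",
         "Description": "c-fos proto-oncogene"},
        {"ID": "DEMO_MYC", "Organism": "Homo sapiens",
         "Description": "c-myc proto-oncogene"},
        {"ID": "DEMO_ACT", "Organism": "Mus musculus",
         "Description": "actin, cytoplasmic"},
    ]
    for hit in search_entries(records, "*oncogene*", field="Description"):
        print(hit["ID"])
```

The IDs returned by such a search could then be passed to a sequence extraction step, mirroring the way query results feed the sequence download mechanism.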

SRS access to EPD is available at the Swiss EMBNet node: http://www.ch.embnet.org/

ACKNOWLEDGEMENT

EPD is funded in part by grant 95.0236-1 from the Swiss Federal Office for Education and Research.

REFERENCES

1. Stoesser,G., Moseley,M.A., Sleep,J., McGowran,M., Garcia-Pastor,M. and Sterk,P. (1998) Nucleic Acids Res., 26, 8-15.

2. Heinemeyer,T., Chen,X., Karas,H., Kel,A.E., Kel,O.V., Liebich,I., Meinhardt,T., Reuter,I., Schacherer,F. and Wingender,E. (1999) Nucleic Acids Res., 27, 318-322.

3. Fickett,J.W. and Hatzigeorgiou,A.G. (1997) Genome Res., 7, 861-878.

4. Cavin-Périer,R., Junier,T. and Bucher,P. (1998) Nucleic Acids Res., 26, 353-357.

5. Etzold,T., Ulyanov,A. and Argos,P. (1996) Methods Enzymol., 266, 114-128.

6. Benson,D.A., Boguski,M., Lipman,D.J. and Ostell,J. (1994) Nucleic Acids Res., 22, 3441-3444.

7. Bairoch,A. and Apweiler,R. (1997) Nucleic Acids Res., 25, 31-36.

8. Gelbart,W.M., Crosby,M., Matthews,B., Rindone,W.P., Chillemi,J., Russo Twombly,S., Emmert,D., Ashburner,M., Drysdale,R.A., Whitfield,E., Millburn,G.H., de Grey,A., Kaufman,T., Matthews,K., Gilbert,D., Strelets,V. and Tolstoshev,C. (1997) Nucleic Acids Res., 25, 63-66.

9. Pearson,P., Francomano,C., Foster,P., Bocchini,C., Li,P. and McKusick,V. (1994) Nucleic Acids Res., 22, 3470-3473.

10. Blake,J.A., Richardson,J.E., Davisson,M.T., Eppig,J.T. and the Mouse Genome Informatics Group (1997) Nucleic Acids Res., 25, 85-91.


*To whom correspondence should be addressed. Tel: +41 21 692 5892; Fax: +41 21 652 6933; Email: philipp.bucher@isrec.unil.ch


Copyright©Oxford University Press, 1998.
