Nucleic Acids Research Pages 10-11  


DBcat: a catalog of biological databases

Claude Discala (1,2), Marion Ninnin (1), Frédéric Achard (1), Emmanuel Barillot (1,3,*) and Guy Vaysseix (1,3)

(1) GIS Infobiogen, 7 rue Guy Môquet BP 8, 94801 Villejuif cedex, France; (2) CNS, 2 rue Gaston Crémieux BP 191, 91006 Évry cedex, France; (3) Généthon, 1 rue de l'Internationale BP 60, 91002 Évry cedex, France

Received August 31, 1998; Accepted October 13, 1998

ABSTRACT

The DBcat (http://www.infobiogen.fr/services/dbcat) is a comprehensive catalog of biological databases, maintained and curated on a daily basis at GIS Infobiogen. It contains more than 400 databases classified by application domain. The DBcat is a structured flat file library that can be searched by means of an SRS server or a dedicated Web interface. The files are available for download from the Infobiogen anonymous ftp server.

INTRODUCTION

As a result of scientific, historical and political factors, biological data are scattered across dozens of independent databases. Moreover, these databases are heterogeneous in format and content. To help researchers identify the resources pertinent to their information needs, we developed the DBcat, a comprehensive catalog of biological databases. GIS Infobiogen also uses the DBcat on a daily basis to fulfill its mission of providing resources for French research in biology. The catalog supplies material for (i) carrying out technology watch and systematic analysis of the sources of biological data, (ii) providing efficient access to the data and (iii) performing daily mirroring of the more important databases.

ORGANIZATION

The DBcat is structured as a flat file library. Each entry describes one database, with separate fields for the different types of information: the name (NAME), the database domain drawn from a controlled vocabulary (DOMAIN), the description (DESCRIPTION), the names of the authors and Email contacts (AUTHOR, CONTACT), bibliographic references (RA, RT, RL), the site where the database is produced (ORIGINAL-SITE), an Email address or URL for submitting an entry to the database (SUBMIT), and a list of Web or anonymous ftp sites from which the database can be queried or retrieved (URL-FTP, URL-WWW). As an illustration, the entry corresponding to the DBcat database itself is reproduced in Figure 1.


Figure 1. A DBcat entry from the DBcat database.
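
The field-based organization described above lends itself to simple programmatic parsing. The following is a minimal sketch only: the field names (NAME, DOMAIN, DESCRIPTION, URL-WWW) are taken from the text, but the "FIELD: value" line layout, the "//" entry terminator, the function name parse_entries and the sample databases are assumptions made for illustration and are not taken from an actual DBcat file.

```python
from collections import defaultdict

# Hypothetical sample: the field names come from the text, but the
# "FIELD: value" layout, the "//" terminator and both databases are assumptions.
SAMPLE = """\
NAME: ExampleDB
DOMAIN: Protein
DESCRIPTION: A made-up database used only for this sketch
URL-WWW: http://example.org/exampledb/
URL-WWW: http://mirror.example.org/exampledb/
//
NAME: OtherDB
DOMAIN: Mapping
//
"""


def parse_entries(text):
    """Return one {field: [values]} dict per entry in the flat-file text."""
    entries = []
    current = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if line == "//":
            # End of entry: store it and start a fresh one.
            if current:
                entries.append(dict(current))
            current = defaultdict(list)
        elif ":" in line:
            # Split on the first colon only, so URLs in the value survive.
            field, value = line.split(":", 1)
            current[field.strip()].append(value.strip())
    if current:  # tolerate a missing final terminator
        entries.append(dict(current))
    return entries


entries = parse_entries(SAMPLE)
for entry in entries:
    print(entry["NAME"][0], entry["DOMAIN"][0], entry.get("URL-WWW", []))
```

Each field maps to a list of values because fields such as URL-WWW may occur several times in one entry.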

DATA ACQUISITION

The DBcat is an ongoing effort that originated in 1994, when Généthon built, for in-house use, a list of programs and data sources available via the Internet. This list gave birth to the BioCatalog (a software directory of general interest in molecular biology and genetics, maintained at the EBI, http://www.ebi.ac.uk/biocat/) (1) and to the DBcat. The latter is now produced at GIS Infobiogen. New databases are sought on the Web, by means of either general-purpose search engines or biology-oriented Web sites. Journals, such as the Nucleic Acids Research Database Issue, are also consulted. The producers of each database are asked by Email to complete a form. There is also a Web form for submitting a new database entry to the DBcat (http://www.infobiogen.fr/services/dbcat/file/dbcat_form.html). Once the author of a database has validated the corresponding entry, the entry is marked as CHECKED.

ACCESS

The DBcat contains more than 400 database entries, available in one flat file. To reflect the areas of interest of the users, the database entries are also grouped into eight application domains: DNA, RNA, Protein, Genomics, Mapping, Protein structure, Literature and Miscellaneous. The number of databases listed in each domain is given in Table 1.

Table 1. DBcat entries grouped by application domain

Domain               No. of records
DNA                              60
RNA                              21
Protein                          74
Genomics                         57
Mapping                          28
Protein structure                18
Literature                       37
Miscellaneous                   113
Total                           408
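
As a quick consistency check, the total in Table 1 and the share of each domain can be recomputed in a few lines. This is an illustrative sketch: the counts are copied from the table, while the names DOMAIN_COUNTS and domain_shares are hypothetical and not part of the DBcat.

```python
# Record counts per application domain, copied from Table 1.
DOMAIN_COUNTS = {
    "DNA": 60,
    "RNA": 21,
    "Protein": 74,
    "Genomics": 57,
    "Mapping": 28,
    "Protein structure": 18,
    "Literature": 37,
    "Miscellaneous": 113,
}


def domain_shares(counts):
    """Return (total, {domain: percentage of the total})."""
    total = sum(counts.values())
    shares = {domain: 100.0 * n / total for domain, n in counts.items()}
    return total, shares


total, shares = domain_shares(DOMAIN_COUNTS)
print("Total records:", total)
# List domains from largest to smallest share.
for domain, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{domain:<18}{share:5.1f}%")
```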

The DBcat offers users several modes of access: (i) download of the flat files from ftp://ftp.infobiogen.fr/pub/db/dbcat ; (ii) a Web interface whose home page provides a simple query-by-name form: http://www.infobiogen.fr/services/dbcat/ ; (iii) SRS servers: a number of SRS sites index the DBcat catalog, e.g., GIS Infobiogen (http://www.infobiogen.fr/srs/) in France or Seqnet (http://www.seqnet.dl.ac.uk:80/srs5/) in the UK.

CONCLUSION

We plan to introduce two fields for better characterization of a DBcat entry: (i) DR: a list of databases having cross-references to the database described by this entry; (ii) Access: a list of the interfaces available for accessing the data, e.g., SQL, ORB, SRS.

The idea is to facilitate the work of researchers who need to interact with a number of heterogeneous databases simultaneously, typically bioinformaticians working on data integration and interoperation (2-4). We strongly encourage the submission of new data to the DBcat, as well as updates. Feedback is crucial to the success of the DBcat.

ACKNOWLEDGEMENTS

The authors wish to thank Patricia Rodriguez-Tomé and the GREG for supporting the first version of the DBcat. We also wish to thank the authors/curators of the 408 databases listed in this catalog for providing their work and expertise to the community.

REFERENCES

1. Rodriguez-Tome,P. (1998) Bioinformatics, 14, 469-470.

2. Achard,F., Cussat-Blanc,C., Viara,E. and Barillot,E. (1998) Bioinformatics, 14, 342-348.

3. Karp,P.D. (1995) J. Computat. Biol., 2, 573-586.

4. Fasman,K.H. (1994) J. Computat. Biol., 1, 165-171.


*To whom correspondence should be addressed at: GIS Infobiogen, 7 rue Guy Môquet BP 8, 94801 Villejuif cedex, France. Tel: +33 1 49 58 36 82; Fax: +33 1 45 59 52 50; Email: manu@infobiogen.fr


Copyright©Oxford University Press, 1998.
