Nucleic Acids Research
DBcat: a catalog of biological databases
Introduction
Organization
Data Acquisition
Access
Conclusion
Acknowledgements
References
ABSTRACT
INTRODUCTION
As a result of scientific, historical and political factors, biological data are scattered across dozens of independent databases. Moreover, these databases have heterogeneous formats and contents. To help researchers identify the resources pertinent to their information needs, we developed the DBcat, a comprehensive catalog of biological databases. The DBcat is also used on a daily basis by GIS Infobiogen to fulfill its mission of offering resources for French research in biology. It provides material for (i) carrying out a technology watch and systematic analysis of the sources of biological data, (ii) providing efficient access to the data and (iii) performing daily mirroring of the more important databases.
ORGANIZATION
The DBcat is structured as a flat file library. Each entry describes one database, using separate fields to record the various types of data: the name (NAME), the database domain in a controlled vocabulary (DOMAIN), the description (DESCRIPTION), the names of the authors and their Email contacts (AUTHOR, CONTACT), bibliographic references (RA, RT, RL), the site where the database is produced (ORIGINAL-SITE), the Email address or URL for submitting an entry to the database (SUBMIT), and a list of Web or anonymous ftp sites from which the database can be queried or retrieved (URL-FTP, URL-WWW). As an illustration, the entry corresponding to the DBcat database itself is reproduced in Figure 1.
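The field-based layout described above lends itself to simple line-oriented parsing. The following is a minimal sketch only: the exact flat-file syntax is not reproduced here, so the `FIELD: value` line format, the `//` entry terminator, the `parse_entries` helper and the sample entries are all assumptions made for illustration. Only the field names (NAME, DOMAIN, DESCRIPTION, URL-WWW) come from the text.

```python
from collections import Counter

# Assumed layout (NOT confirmed by the source): one "FIELD: value" pair per
# line, with a line containing only "//" closing each entry.
# Both entries below are made up for illustration.
SAMPLE = """\
NAME: ExampleDB
DOMAIN: DNA
DESCRIPTION: A made-up entry used only for illustration
URL-WWW: http://example.org/exampledb/
URL-WWW: http://mirror.example.org/exampledb/
//
NAME: OtherDB
DOMAIN: Protein
DESCRIPTION: Another made-up entry
//
"""


def parse_entries(text):
    """Return a list of entries, each mapping a field name to a list of values."""
    entries, current = [], {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "//":  # hypothetical end-of-entry marker
            if current:
                entries.append(current)
            current = {}
            continue
        # Split at the first colon only, so URLs in the value stay intact.
        field, sep, value = line.partition(":")
        if not sep:
            continue  # skip lines that do not look like "FIELD: value"
        # Repeatable fields (e.g. several URL-WWW lines) accumulate in a list.
        current.setdefault(field.strip(), []).append(value.strip())
    if current:  # tolerate a final entry with no terminator
        entries.append(current)
    return entries


entries = parse_entries(SAMPLE)
# Tally entries per application domain, using the first DOMAIN value.
domain_counts = Counter(e["DOMAIN"][0] for e in entries if "DOMAIN" in e)
```

Storing every field as a list keeps repeatable fields such as URL-WWW and single-valued fields such as NAME under one uniform type, at the cost of an extra `[0]` when reading single values.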
Figure 1. A DBcat entry from the DBcat database.
DATA ACQUISITION
The DBcat is an ongoing effort that originated in 1994, when Généthon built, for in-house use, a list of programs and data sources available via the Internet. This list gave birth to the BioCatalog (a software directory of general interest in molecular biology and genetics, maintained at the EBI, http://www.ebi.ac.uk/biocat/) (1) and the DBcat. The latter is now produced at GIS Infobiogen. New databases are sought on the Web, by means of either general-purpose Web search engines or biology-oriented Web sites. Journals, such as the Nucleic Acids Research Database Issue, are also consulted. The producers of each database are asked by Email to complete a form. There is also a Web form for submitting a new database entry to the DBcat (http://www.infobiogen.fr/services/dbcat/file/dbcat_form.html). If the author of a database has validated the corresponding entry, the entry is marked as CHECKED.
The DBcat contains more than 400 database entries, available in one flat file. To reflect the areas of interest of the users, the database entries are also grouped into eight application domains: DNA, RNA, Protein, Genomics, Mapping, Protein structure, Literature and Miscellaneous. The number of databases listed in each domain is given in Table 1.
Table 1. DBcat grouped by application domains
| Domain | No. of records |
| --- | --- |
| DNA | 60 |
| RNA | 21 |
| Protein | 74 |
| Genomic | 57 |
| Mapping | 28 |
| Protein structure | 18 |
| Literature | 37 |
| Miscellaneous | 113 |
| Total | 408 |
ACCESS
The DBcat provides users with several modes of access: (i) download of the flat files: ftp://ftp.infobiogen.fr/pub/db/dbcat ; (ii) a Web interface with a simple query-by-name form: http://www.infobiogen.fr/services/dbcat/ ; (iii) SRS servers: a number of SRS sites index the DBcat catalog, e.g., GIS Infobiogen (http://www.infobiogen.fr/srs/ ) in France or Seqnet (http://www.seqnet.dl.ac.uk:80/srs5/) in the UK.
CONCLUSION
We plan to introduce two fields for better characterization of a DBcat entry: (i) DR: a list of databases having cross-references to the database described by this entry; (ii) Access: a list of the interfaces available for accessing the data, e.g., SQL, ORB, SRS. The idea is to facilitate the work of researchers who need to interact with a number of heterogeneous databases simultaneously, typically bioinformaticians working on data integration and interoperation (2-4). We strongly encourage the submission of new data to the DBcat, as well as updates. Feedback is crucial to the success of the DBcat.
ACKNOWLEDGEMENTS
The authors wish to thank Patricia Rodriguez-Tomé and the GREG for supporting the first version of the DBcat. We also wish to thank the authors/curators of the 408 databases listed in this catalog for providing their work and expertise to the community.
REFERENCES
This page is run by Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, as part of the OUP Journals
Comments and feedback: www-admin{at}oup.co.uk
Last modification: 9 Dec 1998
Copyright©Oxford University Press, 1998.
Articles by Discala, C.
Articles by Vaysseix, G.