Nucleic Acids Research, 2002, Vol. 30, No. 1 1-12
© 2002 Oxford University Press

The Molecular Biology Database Collection: 2002 update

Andreas D. Baxevanis*

Genome Technology Branch, National Human Genome Research Institute, National Institutes of Health, Building 50, Room 5222, Bethesda, MD 20892-8002, USA

Received October 9, 2001; Accepted November 20, 2001.


    ABSTRACT
 
The Molecular Biology Database Collection is an online resource listing key databases of value to the biological community. This Collection is intended to bring fellow scientists’ attention to high-quality databases that are available throughout the world, rather than simply to serve as a lengthy listing of all available databases. As such, this up-to-date listing is intended to serve as the initial point from which to find specialized databases that may be of use in biological research. The databases included in this Collection provide new value to the underlying data by virtue of curation, new data connections or other innovative approaches. Short, searchable summaries and updates for each of the databases included in the Collection are available through the Nucleic Acids Research Web site at http://nar.oupjournals.org.

One of the most significant scientific events in the year 2001 was the publication of the initial sequence and analysis of the human genome resulting from both public (1) and private sector (2) efforts. With these publications, we have entered a new era for modern biology, one where the majority of biological and biomedical research being conducted will use sequence data as its basic underpinning. Having such a rich source of information will prove invaluable for basic researchers whose findings will, in time, lead to improved strategies for the diagnosis, treatment and prevention of diseases having a genetic basis. In short, the stage has been set for genetic medicine to have a prominent role in the delivery of healthcare in the future (3).

A number of significant insights have already been gained into the secrets hidden within the 3 billion bases that comprise the human genome (1). There is marked variation in the distribution of features such as genes, transposable elements, GC content, CpG islands and recombination rate; this uneven distribution may provide important clues about the functions of these features and how they may be involved in regulation. There is a preferential retention of Alu elements in GC-rich regions, correlating them (in a loose sense) with actively transcribed genes. These elements may turn out not to be just ‘junk DNA’, instead providing a tangible benefit to their human hosts. In general, repetitive elements may not have a direct function per se, but may influence chromosome structure. Probably the most telling finding is that the total number of genes in the human genome is only on the order of 30 000 to 35 000. Previously, numbers in the 80 000 range (and as high as 140 000) had been put forward. While the new estimate gives humans about twice the number of genes seen in Caenorhabditis elegans or in Drosophila, the genes themselves have a more complex structure. This sharp downward revision in the number of genes immediately brings into question the one gene–one protein hypothesis: we are now finding more and more examples of alternative splicing generating a larger number of protein products (consistent with a more complex gene structure), as well as cases where identical proteins can be used for different functions, depending on their compartmentalization (4).

While the near-completion of human genome sequencing marks a significant milestone, there are many other sequence-based efforts currently under way that will have just as much impact on the scientific and medical community. The most eagerly anticipated model organism map is that of the mouse. The most recent physical map released on the Ensembl web site (http://mouse.ensembl.org, September 2001) provides an estimated 95% coverage of the mouse genome, with 15 694 genes confirmed over 361 Mb. Turning to human health, single nucleotide polymorphisms (SNPs) continue to be identified at a breakneck pace. Over 1 million SNPs have already been identified, and a random sampling chosen for validation shows that 95% of these are indeed both polymorphic and unique (http://snp.cshl.org/data/). SNP alleles can be used as genetic markers, and often the SNP itself is the variant that causes or contributes to the risk of developing a particular genetic disorder. To increase the power of using SNPs as markers for human disease, efforts are currently under way to develop a haplotype map, where ‘blocks’ of SNPs (rather than individual SNPs) could be used to find chromosomal regions associated with disease.

The sequence data that has been generated by these and other systematic sequencing projects can be browsed and downloaded from a variety of Web sites, with the major portals being located at NCBI (http://www.ncbi.nlm.nih.gov), Ensembl (http://www.ensembl.org) and UCSC (http://genome.cse.ucsc.edu). The problem that many investigators encounter, however, is that these larger databases often do not contain specialized information that would be of interest to specific groups within the scientific community. Many smaller, specialized databases have emerged to fill the void, and these often provide not just sequence-based information, but data such as phenotypes, experimental conditions, strain crosses and map features, data that might not fit neatly onto a large physical map of a genome. Most importantly, data in these smaller databases tend to be curated by experts in a particular speciality and are often experimentally verified, meaning that they represent the best state of knowledge in that particular area. The savvy user will, therefore, make use of both types of databases in their experimental planning and design. This journal has devoted its first issue over the last several years to documenting the availability and features of these specialized databases in order to better serve its readership and to promote the use of these resources in the design and analysis of experiments. These reviewed databases are collectively listed in the Molecular Biology Database Collection.

The databases included in the current version of the Collection are shown in Table 1. This year, the total number of databases listed is 335, up from 281 the year before. Several new databases have been added to the Collection, while others that are no longer actively curated or no longer available have been removed. These databases all distinguish themselves by their approach to presenting the underlying data: for example, by adding new value through curation, by providing new types of data connections or by implementing other innovative approaches that facilitate biological discovery. The individual entries are classified by type, but the reader should recognize that the distinctions between these classes are often arbitrary, and that many of these databases provide more than one type of information to the user.


Table 1. Molecular Biology Database Collection
 
In addition to the list presented in this paper, an electronic version of the Database Issue and Collection can be accessed online and is freely available to everyone, regardless of subscription status, at http://nar.oupjournals.org. While the list contains the databases described in the papers comprising the current issue, it should be immediately apparent to the reader that there are simply not enough pages in this journal to accommodate full-length, printed descriptions of all of the 335 databases featured here. To address this, the online version of the Collection now includes short summaries of many of the databases, the summaries having been provided directly by the investigators responsible for the individual databases. We have also asked contributors to point out new features of their databases in the Recent Developments section of their entry. It is hoped that this approach will provide the reader with an additional source of information that will facilitate finding and selecting the sources of data that would be of most value in addressing a specific biological problem. Contributors will be encouraged to keep their entries up-to-date.

Suggestions for the inclusion of additional database resources in this collection are encouraged and may be directed to the author (andy@nhgri.nih.gov).


    ACKNOWLEDGEMENT
 
I wish to thank Yi-Chi Barash for designing the new Web-based submission tool for this Collection, as well as for her technical support.


    FOOTNOTES
 
* Tel: +1 301 496 8570; Fax: +1 301 402 6858; Email: andy@nhgri.nih.gov


    REFERENCES
 

    1 International Human Genome Sequencing Consortium (2001) Initial sequencing and analysis of the human genome. Nature, 409, 860–921.

    2 Venter,J.C., Adams,M.D., Myers,E.W., Li,P.W., Mural,R.J., Sutton,G.G., Smith,H.O., Yandell,M., Evans,C.A., Holt,R.A. et al. (2001) The sequence of the human genome. Science, 291, 1304–1351.

    3 Collins,F.S. and McKusick,V.A. (2001) Implications of the Human Genome Project for medical science. J. Am. Med. Assoc., 285, 540–544.

    4 Jeffery,C.J. (1999) Moonlighting proteins. Trends Biochem. Sci., 24, 8–11.




