Nucleic Acids Research, 2004, Vol. 32, Database issue D23-D26
© 2004 Oxford University Press
GenBank: update
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
*To whom correspondence should be addressed. Tel: +1 301 435 5950; Fax: +1 301 480 9241; Email: wheeler{at}ncbi.nlm.nih.gov
Received September 16, 2003; Revised and Accepted September 22, 2003
ABSTRACT
GenBank® is a comprehensive database that contains publicly available DNA sequences for more than 140 000 named organisms, obtained primarily through submissions from individual laboratories and batch submissions from large-scale sequencing projects. Most submissions are made using the web-based BankIt or the stand-alone Sequin program, and accession numbers are assigned by GenBank staff upon receipt. Daily data exchange with the EMBL Data Library in the UK and the DNA Data Bank of Japan helps ensure worldwide coverage. GenBank is accessible through NCBI's retrieval system, Entrez, which integrates data from the major DNA and protein sequence databases along with taxonomy, genome mapping, protein structure and domain information, and the biomedical journal literature via PubMed. BLAST provides sequence similarity searches of GenBank and other sequence databases. Complete bimonthly releases and daily updates of the GenBank database are available by FTP. To access GenBank and its related retrieval and analysis services, go to the NCBI home page at: http://www.ncbi.nlm.nih.gov.
INTRODUCTION
GenBank (1) is a comprehensive public database of nucleotide sequences and supporting bibliographic and biological annotation, built and distributed by the National Center for Biotechnology Information (NCBI), a division of the National Library of Medicine (NLM), located on the campus of the US National Institutes of Health (NIH) in Bethesda, MD, USA.
NCBI builds GenBank primarily from the submission of sequence data from authors and from the bulk submission of expressed sequence tag (EST), genome survey sequence (GSS) and other high-throughput data from sequencing centers. The US Patent and Trademark Office (USPTO) also contributes sequences from issued patents. GenBank incorporates sequences submitted to the EMBL Data Library (2) in the UK and the DNA Data Bank of Japan (DDBJ) (3) as part of a long-standing international collaboration between the three databases, in which data are exchanged daily to ensure a uniform and comprehensive collection of sequence information. NCBI makes the GenBank data available at no cost over the internet, via FTP access and a wide range of web-based retrieval and analysis services that operate on the GenBank data (4).
ORGANIZATION OF THE DATABASE
GenBank continues to grow at an exponential rate with 9 million new sequences added over the past 12 months. As of Release 137 in August 2003, GenBank contained over 33.9 billion nucleotide bases from 27.2 million individual sequences. Complete genomes (http://www.ncbi.nlm.nih.gov/Genomes/index.html) represent a growing portion of the database, with over 40 of more than 130 complete microbial genomes in GenBank deposited over the past year. The number of eukaryote genomes for which coverage and assembly are good continues to increase with over a dozen such assemblies now available, including that of the reference human genome.
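Two derived quantities follow from the figures quoted above: the mean record length and the relative growth in record count over the year. The short Python sketch below uses only the rounded numbers given in the text, so the results are approximate; the variable names are illustrative and not part of any NCBI tool.

```python
# Rounded figures quoted in the text for Release 137 (August 2003).
total_bases = 33.9e9       # "over 33.9 billion nucleotide bases"
total_sequences = 27.2e6   # "27.2 million individual sequences"
new_sequences = 9.0e6      # "9 million new sequences" in the past 12 months

# Mean length of a record, in bases.
mean_length = total_bases / total_sequences

# Records present a year earlier, and the relative growth since then.
previous_sequences = total_sequences - new_sequences
growth = new_sequences / previous_sequences

print(f"mean record length: about {mean_length:.0f} bases")
print(f"growth in record count: about {growth:.0%}")
```

With these inputs the mean record is a little under 1250 bases long and the record count grew by just under half in 12 months. The short mean length is consistent with a database dominated by single-read records such as ESTs.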
Sequence-based taxonomy
Database sequences are classified and can be queried using a comprehensive sequence-based taxonomy (http://www.ncbi.nlm.nih.gov/Taxonomy/taxonomyhome.html) developed by NCBI in collaboration with EMBL and DDBJ and with the valuable assistance of external advisors and curators. Over 140 000 species are represented in GenBank and new species are being added at the rate of over 1700 per month. About 26% of the sequences in GenBank are of human origin and 20% of all sequences are human ESTs. After Homo sapiens, the top species in GenBank in terms of number of bases are Mus musculus, Rattus norvegicus, Danio rerio, Oryza sativa, Drosophila melanogaster, Zea mays, Arabidopsis thaliana and Gallus gallus.
GenBank records and divisions
Each GenBank entry includes a concise description of the sequence, the scientific name and taxonomy of the source organism, bibliographic references and a table of features (http://www.ncbi.nlm.nih.gov/collab/FT/index.html) listing areas of biological significance, such as coding regions and their protein translations, transcription units, repeat regions and sites of mutation or modification.
The files in the GenBank distribution have traditionally been divided into divisions that roughly correspond to taxonomic groups such as bacteria (BCT), viruses (VRL), primates (PRI) and rodents (ROD). In recent years, divisions have been added to support specific sequencing strategies. These include divisions for EST, GSS, high-throughput genomic (HTG) and high-throughput cDNA (HTC) sequences, making a total of 17 divisions. For convenience in file transfer, the larger divisions, such as EST and PRI, are partitioned into multiple files when the bimonthly GenBank releases are posted on NCBI's FTP site.
ESTs
ESTs continue to be the major source of new sequence records and gene sequences. Over the past year the number of ESTs has increased by over 45% to a total of 18.1 million sequences representing over 580 different organisms. The top five organisms represented in the EST division are H.sapiens (5.4 million records), M.musculus (3.8 million records), R.norvegicus (540 000 records), Triticum aestivum (500 000 records) and Ciona intestinalis (490 000 records). As part of its daily processing of GenBank EST data, the NCBI identifies through BLAST searches all homologies for new EST sequences and incorporates that information into the companion database, dbEST (http://www.ncbi.nlm.nih.gov/dbEST/index.html) (5). The data in dbEST is further processed to produce the UniGene database (http://www.ncbi.nlm.nih.gov/UniGene/) of gene-oriented sequence clusters described more fully in (4).
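The EST counts can be cross-checked against the database totals reported for Release 137. The sketch below assumes that the EST figures and the 27.2 million total refer to roughly the same point in time, which the text implies but does not state; all numbers are the rounded values quoted in the article.

```python
# Rounded figures from the text; variable names are illustrative.
total_sequences = 27.2e6     # all GenBank records, Release 137
est_records = 18.1e6         # all EST records
human_est_records = 5.4e6    # H.sapiens EST records

# Share of the whole database made up of ESTs, and of human ESTs.
est_share = est_records / total_sequences
human_est_share = human_est_records / total_sequences

print(f"ESTs as a share of all records: {est_share:.1%}")
print(f"human ESTs as a share of all records: {human_est_share:.1%}")
```

The human EST share comes out at roughly one-fifth of all records, in line with the statement that 20% of all sequences are human ESTs, and ESTs as a whole account for about two-thirds of the records.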
Sequence-tagged sites (STSs) and GSSs
The STS division of GenBank (http://www.ncbi.nlm.nih.gov/dbSTS/index.html) contains over 240 000 sequences including anonymous STSs based on genomic sequence as well as gene-based STSs derived from the 3' ends of genes and ESTs. These STS records usually include primer sequences, annotations and PCR conditions.
The GSS division of GenBank (http://www.ncbi.nlm.nih.gov/dbGSS/index.html) has grown over the past year by 73% to a total of 6.4 million records with over 2.0 billion nucleotides. GSS records represent random genomic sequences, and are predominantly single reads from bacterial artificial chromosomes (BAC-ends) used in a variety of genome sequencing projects. The most highly represented species in the GSS division are Z.mays (1.3 million records), M.musculus (952 000 records), H.sapiens (893 000 records) and Brassica oleracea (595 000 records). Human data have been used (http://www.ncbi.nlm.nih.gov/genome/clone) along with the STS records in tiling the BACs for the Human Genome Project (6).
HTG and HTC sequences
The HTG division of GenBank (http://www.ncbi.nlm.nih.gov/HTGS/) contains unfinished large-scale genomic records that are in transition to a finished state (7). These records are designated as Phase 0–3 depending on the quality of the data. Upon reaching Phase 3, the finished state, HTG records are moved into the appropriate organism division of GenBank. As of Release 137 of GenBank, the HTG division comprised some 12 billion bp of sequence.
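The phase rule described here can be written as a small function. This is a minimal sketch of the rule as stated in the text, not NCBI's processing code; the function name and its arguments are hypothetical.

```python
def division_for_genomic_record(phase, organism_division):
    """Return the division code for a large-scale genomic record.

    phase: integer from 0 to 3, as described in the text.
    organism_division: code of the organism division, e.g. "PRI" or "ROD".
    """
    if phase not in (0, 1, 2, 3):
        raise ValueError(f"unknown HTG phase: {phase!r}")
    # Phase 3 is the finished state: the record leaves the HTG division
    # and moves to the division of its source organism.
    return organism_division if phase == 3 else "HTG"


print(division_for_genomic_record(1, "PRI"))
print(division_for_genomic_record(3, "PRI"))
```

An unfinished primate record stays in HTG, while the same record at Phase 3 is reported under PRI.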
The HTC division of GenBank accommodates high-throughput cDNA sequences. HTCs are of draft quality, but may contain 5'-UTRs and 3'-UTRs, partial coding regions and introns. HTC sequences that are finished and of high quality are moved to the appropriate organism GenBank division. GenBank release 137 contained more than 148 000 HTC sequences totaling over 200 million bases. A recent project generating HTC data has been described (8) and other projects are listed at: http://www.ncbi.nlm.nih.gov/genome/flcdna/.
Sequence identifiers and accession numbers
Each GenBank record, consisting of both a sequence and its annotations, is assigned a stable and unique identifier, the accession number, which remains constant over the lifetime of the record even when there is a change to the sequence or annotation. The DNA sequence within a GenBank record is also assigned a unique identifier, called a GI, that appears on the VERSION line of GenBank flatfile records following the accession number. A third identifier of the form Accession.version, also displayed on the VERSION line of flatfile records, consolidates the information present in the GI and accession numbers. An entry appearing in the database for the first time has an Accession.version identifier equivalent to the ACCESSION number of the GenBank record followed by .1 to indicate the first version of the sequence for the record, e.g.
ACCESSION AF000001
VERSION AF000001.1 GI:987654321
When a change is made to a sequence given in a GenBank record, a new GI number is issued to the sequence and the version extension of the Accession.version identifier is incremented. The accession number for the record as a whole remains unchanged and the older sequence remains available under the old Accession.version identifier and GI.
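The relationship between the three identifiers can be illustrated by parsing a VERSION line. The Python sketch below assumes the layout shown in the example above (the keyword VERSION, then Accession.version, then an optional GI field); the regular expression and function names are illustrative and are not taken from any NCBI library.

```python
import re

# Assumed layout: "VERSION  <accession>.<version>  GI:<number>"
VERSION_RE = re.compile(
    r"^VERSION\s+(?P<accession>[A-Z0-9_]+)\.(?P<version>\d+)"
    r"(?:\s+GI:\s*(?P<gi>\d+))?\s*$"
)


def parse_version_line(line):
    """Split a flatfile VERSION line into accession, version and GI."""
    match = VERSION_RE.match(line)
    if match is None:
        raise ValueError(f"not a VERSION line: {line!r}")
    gi = match.group("gi")
    return {
        "accession": match.group("accession"),
        "version": int(match.group("version")),
        "gi": int(gi) if gi is not None else None,
    }


def next_accession_version(accession_version):
    """Return the identifier a record would carry after a sequence change."""
    accession, version = accession_version.rsplit(".", 1)
    return f"{accession}.{int(version) + 1}"


print(parse_version_line("VERSION     AF000001.1  GI:987654321"))
print(next_accession_version("AF000001.1"))
```

Only the version suffix can be predicted after a sequence change; the new GI is issued by the database and cannot be derived from the old one.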
A similar system tracks changes in the corresponding protein translations using Accession.version identifiers composed of a protein accession number, e.g. AAA00001, followed by a version number. These identifiers appear as qualifiers for CDS features in the FEATURES table portion of a GenBank entry, e.g. /protein_id=AAA00001.1. Protein sequence translations also receive their own unique GI number, which appears as a second qualifier on the CDS feature: /db_xref=GI:1233445.
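A CDS qualifier of this kind is a simple key and value pair. The sketch below splits one qualifier into its parts; it accepts the unquoted form shown in the text and also a double-quoted value, since flatfiles normally quote qualifier values. The function name is hypothetical.

```python
def parse_qualifier(text):
    """Split a feature qualifier such as '/protein_id=AAA00001.1'."""
    text = text.strip()
    if not text.startswith("/"):
        raise ValueError(f"not a qualifier: {text!r}")
    key, _, value = text[1:].partition("=")
    # Remove surrounding double quotes if the value is quoted.
    return key, value.strip('"')


key, value = parse_qualifier("/protein_id=AAA00001.1")
accession, version = value.rsplit(".", 1)
print(key, accession, version)

print(parse_qualifier('/db_xref="GI:1233445"'))
```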
BUILDING THE DATABASE
The data in GenBank, and the collaborating databases EMBL and DDBJ, are submitted primarily by individual authors to one of the three databases, or by sequencing centers as batches of ESTs, STSs, GSSs, HTCs or HTGs. Data are exchanged daily with DDBJ and EMBL so that the daily updates from NCBI servers incorporate the most recently available sequence data from all sources.
Direct submission
Virtually all records enter GenBank as direct electronic submissions (http://www.ncbi.nlm.nih.gov/Genbank/index.html), with the majority of authors using the BankIt or Sequin program. Many journals require authors with sequence data to submit the data to a public database as a condition of publication.
GenBank staff can usually assign an accession number to a sequence submission within two working days of receipt, and do so at a rate of almost 700 per day. The accession number serves as confirmation that the sequence has been submitted and allows readers of articles in which the sequence is cited to retrieve the relevant data. Direct submissions receive a quality assurance review that includes checks for vector contamination, proper translation of coding regions, correct taxonomy and correct bibliographic citations. A draft of the GenBank record is passed back to the author for review before it enters the database. Authors may ask that their sequences be kept confidential until the time of publication. Since GenBank policy requires that deposited sequence data be made public when the sequence or accession number is published, authors are instructed to inform GenBank staff of the publication date of the article in which the sequence is cited in order to ensure timely release of the data. Although only the submitting scientist is permitted to modify sequence data or annotations, all users are encouraged to report lags in releasing data or possible errors or omissions to GenBank at update{at}ncbi.nlm.nih.gov.
The NCBI works closely with sequencing centers to ensure timely incorporation of bulk data into GenBank for public release. GenBank offers special batch procedures for large-scale sequencing groups to facilitate data submission, including the program fa2htgs and other tools (9).
Third party annotation
Third party annotation (TPA) refers to the annotation by third party authors of nucleotide sequences derived or assembled from public primary sequence data found in the DDBJ/EMBL/GenBank International Nucleotide Sequence Collaboration Databases. Examples of TPA submissions include an mRNA sequence assembled from overlapping ESTs, or the annotation of exons, introns and coding regions on an unannotated genomic sequence. Trace data sequences or whole genome shotgun (WGS) sequences in DDBJ/EMBL/GenBank may also be used as the basis of a TPA submission, but data from secondary sources such as NCBI Reference sequences, or primary data from proprietary databases may not be used.
The format of a TPA record (e.g. BK000016) is similar to that of a conventional GenBank record but includes the label 'TPA:' at the beginning of each definition line and the keywords 'Third Party Annotation; TPA' in the Keywords field. The Comment field of TPA records lists all primary sequences used to assemble the TPA sequence; the Primary field provides the base ranges of the primary sequences that contribute to the TPA sequence.
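The two markers described above are enough to recognize a TPA record from its definition and keywords text. The following sketch is one way to test for them, assuming keywords are separated by semicolons and may end with a full stop; the function name and the sample definition line are hypothetical.

```python
def looks_like_tpa(definition, keywords):
    """Return True if a record carries both TPA markers described in the text."""
    has_label = definition.lstrip().startswith("TPA:")
    # Keywords are assumed to be semicolon-separated, possibly ending in '.'.
    keyword_set = {k.strip().rstrip(".") for k in keywords.split(";")}
    has_keywords = {"Third Party Annotation", "TPA"} <= keyword_set
    return has_label and has_keywords


print(looks_like_tpa("TPA: example mRNA, complete cds", "Third Party Annotation; TPA."))
print(looks_like_tpa("Example mRNA, complete cds", ""))
```

Requiring both markers avoids misclassifying a conventional record whose definition line merely happens to begin with the same letters.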
TPA submissions to GenBank may be made using either BankIt or Sequin, but TPA sequences are not released to the public until their accession numbers or sequence data and annotation appear in a peer-reviewed biological journal. For more information on TPA, see http://www.ncbi.nlm.nih.gov/Genbank/tpa.html.
BankIt
About a third of author submissions are received through NCBI's Web-based data submission tool, BankIt (http://www.ncbi.nlm.nih.gov/BankIt). Using BankIt, authors enter sequence information directly into a form, edit as necessary and add biological annotation such as coding regions or mRNA features. Free-form text boxes, list boxes and pull-down menus allow the submitter to further describe the sequence without having to learn formatting rules or use restricted vocabularies. BankIt validates submissions, flagging many common errors, and checks for vector contamination using a variant of BLAST called Vecscreen, before creating a draft record in GenBank flat file format for the submitter to review. BankIt is the tool of choice for simple submissions, especially when only one or a small number of records is to be submitted (7). BankIt can also be used by submitters to update their existing GenBank records.
Sequin
The NCBI has developed a stand-alone multi-platform submission program called Sequin (http://www.ncbi.nlm.nih.gov/Sequin/index.html) that can be used interactively with other NCBI sequence retrieval and analysis tools. Sequin handles simple sequences such as a cDNA, as well as segmented entries, phylogenetic studies, population studies, mutation studies, environmental samples and alignments for which BankIt and other web-based submission tools are not well suited. Sequin has convenient editing and complex annotation capabilities and contains a number of built-in validation functions for quality assurance. In addition, Sequin is able to accommodate large sequence records, such as the Escherichia coli genome of 5.6 Mb, and read in a full complement of annotations via simple tables. Versions for Macintosh, PC and Unix computers are available via anonymous FTP at ftp.ncbi.nih.gov in the sequin directory. Once a submission is completed, submitters can email the Sequin file to the address: gb-sub{at}ncbi.nlm.nih.gov.
RETRIEVING GENBANK DATA
The ENTREZ system
The sequence records in GenBank are accessible via Entrez (http://www.ncbi.nlm.nih.gov/Entrez/), a robust and flexible database retrieval system that accesses DNA and protein sequence data, genome mapping data, population sets, phylogenetic sets, environmental sample sets, gene expression data, the NCBI taxonomy, protein domain information, protein structures from the Molecular Modeling Database, MMDB (10) and MEDLINE references via PubMed. The Entrez sequence databases are taken from a variety of sources and therefore include more sequence data than are available within the GenBank DNA sequence database alone.
BLAST sequence-similarity searching
Sequence-similarity searches are the most frequent and basic type of analysis performed on the GenBank data. NCBI offers the BLAST (http://www.ncbi.nlm.nih.gov/BLAST/) family of programs to locate regions of similarity between a query sequence and database sequences (11,12). BLAST searches may be performed on NCBI's website, or using a set of stand-alone programs distributed by FTP. BLAST is discussed in more detail in a separate article in this issue (4).
Obtaining GenBank by FTP
NCBI distributes the GenBank releases in the traditional flat-file format as well as in the Abstract Syntax Notation (ASN.1) format used for internal maintenance. The full bimonthly GenBank release and the daily updates, which also incorporate sequence data from EMBL and DDBJ, are available by anonymous FTP from the NCBI at ftp.ncbi.nih.gov as well as from two mirror sites, at the San Diego SuperComputer Center (ftp://genbank.sdsc.edu/pub/) and at the University of Indiana (ftp://bio-mirror.net/biomirror/genbank/). The full release in flat-file format is available as compressed files in the directory 'genbank', with a non-cumulative set of updates contained in 'daily-nc'. A script is provided on the FTP site to convert a set of daily updates into a cumulative update.
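Listing the release files over anonymous FTP can be sketched with Python's standard ftplib module. The host and directory names below are those given in the text and may since have changed; the function names are hypothetical, and the layout of the directory is not assumed beyond what the article states.

```python
from ftplib import FTP

HOST = "ftp.ncbi.nih.gov"   # host named in the text; not verified as current
RELEASE_DIR = "genbank"     # directory named in the text


def list_directory(ftp, directory):
    """Return the sorted file names in `directory` on a connected FTP session."""
    ftp.cwd(directory)
    return sorted(ftp.nlst())


def list_release_files(host=HOST, directory=RELEASE_DIR):
    """Open an anonymous FTP session and list the release directory."""
    with FTP(host) as ftp:
        ftp.login()  # no arguments means anonymous login
        return list_directory(ftp, directory)
```

Calling list_release_files() performs the network request. Keeping list_directory separate allows the listing logic to be exercised against any object that provides cwd and nlst methods, without contacting the server.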
MAILING ADDRESS
GenBank, National Center for Biotechnology Information, Building 38A, Room 8S-803, 8600 Rockville Pike, Bethesda, MD 20894, USA. Tel: +1 301 496 2475; Fax: +1 301 480 9241.
ELECTRONIC ADDRESSES
http://www.ncbi.nlm.nih.gov/ (NCBI home page), gb-sub{at}ncbi.nlm.nih.gov (submission of sequence data to GenBank), update{at}ncbi.nlm.nih.gov (revisions to GenBank entries and notification of release of confidential entries), info{at}ncbi.nlm.nih.gov (general information about NCBI and services).
CITING GENBANK
If you use GenBank as a tool in your published research, we ask that this paper be cited.
REFERENCES
1. Benson,D.A., Karsch-Mizrachi,I., Lipman,D.J., Ostell,J. and Wheeler,D.L. (2003) GenBank. Nucleic Acids Res., 31, 23–27.
2. Stoesser,G., Baker,W., van den Broek,A., Garcia-Pastor,M., Kanz,C., Kulikova,T., Leinonen,R., Lin,Q., Lombard,V., Lopez,R. et al. (2003) The EMBL nucleotide sequence database. Nucleic Acids Res., 31, 17–22.
3. Miyazaki,S., Sugawara,H., Gojobori,T. and Tateno,Y. (2003) DNA Data Bank of Japan (DDBJ) for genome scale research in life science. Nucleic Acids Res., 31, 23–27.
4. Wheeler,D.L., Church,D.M., Federhen,S., Lash,A.E., Madden,T.L., Pontius,J.U., Schuler,G.D., Schriml,L.M., Sequeira,E., Tatusova,T.A. et al. (2003) Database resources of the National Center for Biotechnology. Nucleic Acids Res., 31, 28–33.
5. Boguski,M.S., Lowe,T.M. and Tolstoshev,C.M. (1993) dbEST–database for expressed sequence tags. Nature Genet., 4, 332–333.
6. Smith,M.W., Holmsen,A.L., Wei,Y.H., Peterson,M. and Evans,G.A. (1994) Genomic sequence sampling: a strategy for high resolution sequence-based physical mapping of complex genomes. Nature Genet., 7, 40–47.
7. Kans,J.A. and Ouellette,B.F.F. (2001) Submitting DNA sequences to the databases. In Baxevanis,A. and Ouellette,B.F.F. (eds), Bioinformatics: A Practical Guide to the Analysis of Genes and Proteins. John Wiley and Sons, Inc., New York, pp. 65–81.
8. Hayashizaki,Y. (2001) Functional annotation of a full-length mouse cDNA collection. Nature, 409, 685–690.
9. Ouellette,B.F.F. and Boguski,M.S. (1997) Database divisions and homology search files: a guide for the perplexed. Genome Res., 7, 952–957.
10. Chen,J., Anderson,J.B., DeWeese-Scott,C., Fedorova,N.D., Geer,L.Y., He,S., Hurwitz,D.I., Jackson,J.D., Jacobs,A.R., Lanczycki,C.J. et al. (2003) MMDB: Entrez's 3D-structure database. Nucleic Acids Res., 31, 474–477.
11. Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Miller,W. and Lipman,D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.
12. Zhang,Z., Schaffer,A.A., Miller,W., Madden,T.L., Lipman,D.J., Koonin,E.V. and Altschul,S.F. (1998) Protein sequence similarity searches using patterns as seeds. Nucleic Acids Res., 26, 3986–3991.
||||
![]() |
C. Liu, B. Bai, G. Skogerbo, L. Cai, W. Deng, Y. Zhang, D. Bu, Y. Zhao, and R. Chen NONCODE: an integrated knowledge database of non-coding RNAs Nucleic Acids Res., January 1, 2005; 33(suppl_1): D112 - D115. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Heger, C. A. Wilton, A. Sivakumar, and L. Holm ADDA: a domain database with global coverage of the protein universe Nucleic Acids Res., January 1, 2005; 33(suppl_1): D188 - D191. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. J. Roberts, T. Vincze, J. Posfai, and D. Macelis REBASE--restriction enzymes and DNA methyltransferases Nucleic Acids Res., January 1, 2005; 33(suppl_1): D230 - D232. [Abstract] [Full Text] [PDF] |
||||
![]() |
V. Giudicelli, D. Chaume, and M.-P. Lefranc IMGT/GENE-DB: a comprehensive database for human and mouse immunoglobulin and T cell receptor genes Nucleic Acids Res., January 1, 2005; 33(suppl_1): D256 - D261. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Kersey, L. Bower, L. Morris, A. Horne, R. Petryszak, C. Kanz, A. Kanapin, U. Das, K. Michoud, I. Phan, et al. Integr8 and Genome Reviews: integrated views of complete genomes and proteomes Nucleic Acids Res., January 1, 2005; 33(suppl_1): D297 - D302. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. Chen, J. Yang, J. Yu, Z. Yao, L. Sun, Y. Shen, and Q. Jin VFDB: a reference database for bacterial virulence factors Nucleic Acids Res., January 1, 2005; 33(suppl_1): D325 - D328. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Wang, Q. Xia, X. He, M. Dai, J. Ruan, J. Chen, G. Yu, H. Yuan, Y. Hu, R. Li, et al. SilkDB: a knowledgebase for silkworm biology and genomics Nucleic Acids Res., January 1, 2005; 33(suppl_1): D399 - D402. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Wang, X. He, J. Ruan, M. Dai, J. Chen, Y. Zhang, Y. Hu, C. Ye, S. Li, L. Cong, et al. ChickVD: a sequence variation database for the chicken genome Nucleic Acids Res., January 1, 2005; 33(suppl_1): D438 - D441. [Abstract] [Full Text] [PDF] |
||||
![]() |
G. Petersen, P. Johnson, L. Andersson, K. Klinga-Levan, P. M. Gomez-Fabre, and F. Stahl RatMap--rat genome tools and data Nucleic Acids Res., January 1, 2005; 33(suppl_1): D492 - D494. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. Roth, M. J. Betts, P. Steffansson, G. Saelensminde, and D. A. Liberles The Adaptive Evolution Database (TAED): a phylogeny based tool for comparative genomics Nucleic Acids Res., January 1, 2005; 33(suppl_1): D495 - D497. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. J. Kang, K. O. Choi, B.-D. Kim, S. Kim, and Y. J. Kim FESD: a Functional Element SNPs Database in human Nucleic Acids Res., January 1, 2005; 33(suppl_1): D518 - D522. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. D. Gonzales, E. Archuleta, A. Farmer, K. Gajendran, D. Grant, R. Shoemaker, W. D. Beavis, and M. E. Waugh The Legume Information System (LIS): an integrated information resource for comparative legume biology Nucleic Acids Res., January 1, 2005; 33(suppl_1): D660 - D665. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. Bielinska, J. Lu, D. Sturgill, and B. Oliver Core Promoter Sequences Contribute to ovo-B Regulation in the Drosophila melanogaster Germline Genetics, January 1, 2005; 169(1): 161 - 172. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. T. Sylvester, S. K. R. Karnati, Z. Yu, M. Morrison, and J. L. Firkins Development of an Assay to Quantify Rumen Ciliate Protozoal Biomass in Cows Using Real-Time PCR J. Nutr., December 1, 2004; 134(12): 3378 - 3384. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. Nelson Phage Taxonomy: We Agree To Disagree J. Bacteriol., November 1, 2004; 186(21): 7029 - 7031. [Full Text] [PDF] |
||||
![]() |
K. Becker, D. Harmsen, A. Mellmann, C. Meier, P. Schumann, G. Peters, and C. von Eiff Development and Evaluation of a Quality-Controlled Ribosomal Sequence Database for 16S Ribosomal DNA-Based Identification of Staphylococcus Species J. Clin. Microbiol., November 1, 2004; 42(11): 4988 - 4995. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Qi, H. Luo, and B. Hao CVTree: a phylogenetic tree reconstruction tool based on whole genomes Nucleic Acids Res., July 1, 2004; 32(suppl_2): W45 - W47. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Casillas and A. Barbadilla PDA: a pipeline to explore and estimate polymorphism in large DNA databases Nucleic Acids Res., July 1, 2004; 32(suppl_2): W166 - W169. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Lenffer, P. Lai, W. El Mejaber, A. M. Khan, J. L. Y. Koh, P. T. J. Tan, S. H. Seah, and V. Brusic CysView: protein classification based on cysteine pairing patterns Nucleic Acids Res., July 1, 2004; 32(suppl_2): W350 - W355. [Abstract] [Full Text] [PDF] |
||||
![]() |
V. Giudicelli, D. Chaume, and M.-P. Lefranc IMGT/V-QUEST, an integrated software program for immunoglobulin and T cell receptor V-J and V-D-J rearrangement analysis Nucleic Acids Res., July 1, 2004; 32(suppl_2): W435 - W440. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. B. Patel, R. J. Wallace Jr., B. A. Brown-Elliott, T. Taylor, C. Imperatrice, D. G. B. Leonard, R. W. Wilson, L. Mann, K. C. Jost, and I. Nachamkin Sequence-Based Identification of Aerobic Actinomycetes J. Clin. Microbiol., June 1, 2004; 42(6): 2530 - 2540. [Abstract] [Full Text] [PDF] |
||||