Nucleic Acids Research Pages 292-292  


Codon usage tabulated from the international DNA sequence databases; its status 1999

Yasukazu Nakamura*, Takashi Gojobori1 and Toshimichi Ikemura1

Laboratory of Gene Structure 2, Kazusa DNA Research Institute, 1532-3 Yana, Kisarazu, Chiba 292-0812, Japan and 1National Institute of Genetics, 1111 Yata, Mishima, Shizuoka 411-8540, Japan

Received October 5, 1998; Accepted October 8, 1998

ABSTRACT

Codon usage frequencies for each of 206 526 complete protein-coding genes (CDSs) have been compiled from the taxonomical divisions of the GenBank DNA sequence database. The sum of codon use has also been calculated for each of 7434 organisms. These data files can be obtained from the anonymous ftp sites of DDBJ, DISC and EBI. The list of codon usage for the genes of an organism, as well as the sum of codon usage for that organism, is searchable by organism name through the web site http://www.dna.affrc.go.jp/~nakamura/CUTG.html

INTRODUCTION

The choice among synonymous codons within a genome is not random. Among bacterial genes, codon choice follows one major trend. Measurements of the transfer RNA content of cells have shown that this trend is highly correlated with the population of isoaccepting transfer RNAs in each organism. The extent of codon bias in each gene has also been found to be related to the production level of the protein it encodes (1,2).

In higher organisms, such as mammals, codon usage varies widely among genes. Codon choice patterns mainly reflect the G+C content of the whole genome or of local regions, namely GC mosaics or isochores (1,3). Research into intra-species variation in codon usage may provide an interesting line of investigation into the evolution of the genome.

To evaluate codon usage for each gene and/or codon choice trend(s) for each genome, in 1986 we began to compile the codon usage of protein genes contained in the international DNA sequence databases (4). We named the database CUTG (codon usage tabulated from GenBank). The basic aim of the database is to provide an electronic dataset for codon usage-based analyses. Since the codon usage of each protein-coding gene is compiled as a simple two-line entry, it is easy to import into worksheets or to parse and process with computer languages such as C or Perl.
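As a rough illustration of how such entries might be read, the sketch below pairs up consecutive lines of a file. It is written in Python rather than C or Perl, and it assumes that the first line of an entry is a header and the second is a whitespace-separated list of integer codon counts; that layout, the function name `read_entries` and the sample data are assumptions made for illustration, and the CODON_LABEL file remains the authoritative description.

```python
from typing import Iterable, Iterator, List, Tuple


def read_entries(lines: Iterable[str]) -> Iterator[Tuple[str, List[int]]]:
    """Yield (header, counts) pairs from two-line entries.

    Assumption: line 1 of each entry is a header and line 2 is a
    whitespace-separated list of integer codon counts.
    """
    # Drop blank lines so that stray empty lines do not shift the pairing.
    kept = [line.rstrip("\n") for line in lines if line.strip()]
    if len(kept) % 2 != 0:
        raise ValueError("expected an even number of non-blank lines")
    for i in range(0, len(kept), 2):
        header = kept[i]
        counts = [int(field) for field in kept[i + 1].split()]
        yield header, counts


# Hypothetical two-gene sample with four counts per gene; a real entry
# would carry one count per codon.
sample = [
    "gene A header\n",
    "3 0 12 7\n",
    "gene B header\n",
    "1 5 0 2\n",
]
entries = list(read_entries(sample))
```

Because the function accepts any iterable of lines, an open file object can be passed in directly in place of the sample list.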

DESCRIPTION OF THE DATABASE

CUTG consists of lists of the codon usage of genes and the sum of codon use for each organism. As of September 1998, CUTG contains 206 526 genes for 7434 organisms. The database has been compiled from the nucleotide sequences in the latest major release of the GenBank sequence database (5), using the divisions that represent taxonomical collections.
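The per-organism sum can be pictured as an element-wise addition of the count vectors of all genes from one organism. The following Python sketch shows that idea under simplifying assumptions: each gene is represented as an (organism name, list of counts) pair, and the function name `sum_by_organism` and the three-element sample vectors are hypothetical. It is not the procedure used to build the database.

```python
from typing import Dict, Iterable, List, Tuple


def sum_by_organism(
    genes: Iterable[Tuple[str, List[int]]]
) -> Dict[str, List[int]]:
    """Add up codon count vectors gene by gene for each organism."""
    totals: Dict[str, List[int]] = {}
    for organism, counts in genes:
        if organism not in totals:
            # First gene seen for this organism: start from a copy.
            totals[organism] = list(counts)
            continue
        running = totals[organism]
        if len(running) != len(counts):
            raise ValueError("count vectors must have the same length")
        totals[organism] = [a + b for a, b in zip(running, counts)]
    return totals


# Hypothetical counts, shortened to three codons per gene for readability.
genes = [
    ("Escherichia coli", [1, 2, 3]),
    ("Arabidopsis thaliana", [4, 0, 1]),
    ("Escherichia coli", [0, 5, 2]),
]
totals = sum_by_organism(genes)
```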

To select protein-coding sequences we used the annotations in the feature tables of the GenBank flat files. Partially sequenced protein genes were not included in the compilation. Codons containing one or more letters that represent ambiguous bases were excluded from the count. The data structure of each file is the same as in the previous compilation and is described in the CODON_LABEL file on the distribution sites.
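The exclusion rule for ambiguous bases can be sketched in a few lines of Python. This is a minimal illustration, not the compilation program itself: it assumes the coding sequence is already extracted and in frame, treats any letter other than A, C, G or T as ambiguous, counts every in-frame triplet including a terminal stop codon, and ignores a trailing partial codon. The function name `count_codons` is hypothetical.

```python
from collections import Counter

# Letters treated as unambiguous bases (an assumption of this sketch).
UNAMBIGUOUS = set("ACGT")


def count_codons(cds: str) -> Counter:
    """Count in-frame codons, skipping any codon with an ambiguous base."""
    seq = cds.upper()
    counts: Counter = Counter()
    # Step through in-frame triplets; a trailing partial codon is ignored.
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if set(codon) <= UNAMBIGUOUS:
            counts[codon] += 1
    return counts


# The second codon, GCN, contains the ambiguity code N and is not counted.
usage = count_codons("ATGGCNGCTTAA")
```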

DISTRIBUTION AND ACCESS

A complete form of the database is available from the following URLs:

(i)   DDBJ   ftp://ftp.nig.ac.jp/pub/db/codon/current/
(ii)   DISC   ftp://ftp.dna.affrc.go.jp/pub/codon/current/
(iii)   EBI   ftp://ftp.ebi.ac.uk/pub/databases/cutg/

Files named gb***.codon, where `***' is a division name in lower-case letters (e.g. bct; pri1 and pri2 are combined as pri), list the codon use of each gene registered in the GenBank flat files. An entry for a gene has two lines. The first line consists of fields delimited by backslashes, which are extracted from the feature table that defines each protein-coding sequence. The `species' directory contains codon usage files collected for each organism. Each file name consists of the Latin name of the species with its words joined by underscores, followed by a dot and the division name (e.g. for Arabidopsis thaliana, the species file is named `Arabidopsis_thaliana.pln').
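The naming rules and the backslash-delimited first line can be expressed as small helper functions. The Python sketch below is illustrative only: the function names are hypothetical, the meaning and number of the header fields are not assumed, and the sample header is invented for the example.

```python
from typing import List


def division_file_name(division: str) -> str:
    """Build a per-division file name such as gbbct.codon."""
    return f"gb{division.lower()}.codon"


def species_file_name(latin_name: str, division: str) -> str:
    """Join the words of the Latin name with underscores and append
    a dot and the division name."""
    return "_".join(latin_name.split()) + "." + division.lower()


def split_header(first_line: str) -> List[str]:
    """Split the first line of an entry on backslashes."""
    return [field.strip() for field in first_line.rstrip("\n").split("\\")]


species_file = species_file_name("Arabidopsis thaliana", "pln")
division_file = division_file_name("bct")
# Invented header; real field contents come from the feature table.
fields = split_header("field one\\field two\\field three\n")
```

With the example from the text, `species_file_name("Arabidopsis thaliana", "pln")` gives `Arabidopsis_thaliana.pln`.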

The most user-friendly way to use CUTG interactively is through the World-Wide Web server at DISC. The dataset for each organism is searchable through the site: http://www.dna.affrc.go.jp/~nakamura/CUTG.html

ACKNOWLEDGEMENTS

We wish to thank Dr Y. Ugawa at the DNA Information and Stock Center, National Institute of Agrobiological Resources for his help in constructing and distributing the database. This work was supported in part by a grant-in-aid for databases from the Ministry of Education, Science, Sports and Culture of Japan. Y.N. is supported by the Kazusa DNA Research Institute Foundation.

REFERENCES

1. Ikemura,T. (1985) Mol. Biol. Evol., 2, 13-34.

2. Ikemura,T. (1981) J. Mol. Biol., 146, 1-21.

3. Bernardi,G., Olofsson,B., Filipski,J., Zerial,M., Salinas,J., Cuny,G., Meunier-Rotival,M. and Rodier,F. (1985) Science, 228, 953-958.

4. Maruyama,T., Gojobori,T., Aota,S. and Ikemura,T. (1986) Nucleic Acids Res., 14, r151-197.

5. Benson,D.A., Boguski,M.S., Lipman,D.J., Ostell,J., Ouellette,B.F.F., Rapp,B.A. and Wheeler,D.L. (1999) Nucleic Acids Res., 27, 12-17.


*To whom correspondence should be addressed. Tel: +81 438 52 3935; Fax: +81 438 52 3934; Email: ynakamu@kazusa.or.jp


Copyright©Oxford University Press, 1998.
