Nucleic Acids Research
Codon usage tabulated from the international DNA sequence databases; its status 1999
ABSTRACT
INTRODUCTION
The choice among synonymous codons within a genome is not random. Among bacterial genes, there is a major trend in codon choice. Measurements of the transfer RNA content of cells have shown that this codon usage trend is highly correlated with the isoaccepting transfer RNA population of each organism. It has also been found that the extent of codon bias in each gene is related to the production level of the protein it encodes (1,2).
In higher organisms, such as mammals, codon usage among genes is highly variable. Codon choice patterns mainly reflect the G+C content of the whole genome or of local regions, namely the GC mosaic or isochore structure (1,3). Research into intra-species variation in codon usage may provide an interesting line of investigation into the evolution of the genome.
To evaluate codon usage for each gene and/or codon choice trend(s) for each genome, in 1986 we began to compile the codon usage of protein genes contained in the international DNA sequence databases (4). We named the database CUTG (codon usage tabulated from GenBank). The basic aim of the database is to provide an electronic dataset for codon usage-based analyses. Since the codon usage of each protein-coding gene is compiled as a simple two-line entry, it is easy to import into worksheets or to parse and process with programming languages such as C or Perl.
DESCRIPTION OF THE DATABASE
CUTG consists of lists of the codon usage of individual genes and the sum of codon use for each organism. As of September 1998, CUTG contains 206 526 genes for 7434 organisms. The database has been compiled from the nucleotide sequences in the latest major release of the GenBank sequence database (5). The GenBank divisions that represent taxonomic collections were used.
In selecting protein-coding sequences we used the annotations in the feature tables of the GenBank flat files. Partially sequenced protein genes were not included in the compilation. Codons that contained one or more letters representing ambiguous bases were excluded from the count. The data structure of each file is the same as in the previous compilation and is described in the CODON_LABEL file on the distribution sites.
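The counting rules described above can be sketched in a few lines of Python. This is a minimal illustration, not the code used to build CUTG: the function names `count_codons` and `sum_codon_use` are hypothetical, and treating every letter other than A, C, G and T (with U read as T) as an ambiguous base is an assumption made for the example.

```python
from collections import Counter

# Assumption: only these four letters count as unambiguous bases.
UNAMBIGUOUS = set("ACGT")


def count_codons(cds):
    """Count codons in a complete coding sequence.

    Codons containing one or more ambiguous letters are skipped,
    mirroring the exclusion rule described in the text.
    """
    seq = cds.upper().replace("U", "T")
    counts = Counter()
    # Walk the sequence in frame, three letters at a time; a trailing
    # fragment shorter than three letters is ignored.
    for i in range(0, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if set(codon) <= UNAMBIGUOUS:
            counts[codon] += 1
    return counts


def sum_codon_use(per_gene_counts):
    """Add per-gene codon counts into one total for an organism."""
    total = Counter()
    for gene_counts in per_gene_counts:
        total.update(gene_counts)
    return total
```

Keeping the per-gene counts separate from the per-organism sum follows the two kinds of list the database provides, so either can be recomputed from the other direction only by re-reading the gene entries.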
DISTRIBUTION AND ACCESS
A complete form of the database is available from the following URLs:
| (i) | DDBJ | ftp://ftp.nig.ac.jp/pub/db/codon/current/ |
| (ii) | DISC | ftp://ftp.dna.affrc.go.jp/pub/codon/current/ |
| (iii) | EBI | ftp://ftp.ebi.ac.uk/pub/databases/cutg/ |
Files named gb***.codon, where `***' is a division name in lower case letters (e.g., bct; pri1 and pri2 are combined as pri), list the codon use of each gene registered in the GenBank flat files. An entry for a gene has two lines. The first line consists of information, delimited by backslashes, that is extracted from the feature table to define each protein-coding sequence. In the `species' directory, there are codon usage files collected for each organism. Each file name consists of the Latin name of the species, with its words joined by underscores, followed by a dot and the division name (e.g., for Arabidopsis thaliana the file name is `Arabidopsis_thaliana.pln').
The most user-friendly way to use CUTG interactively is through the World-Wide Web server at DISC. A dataset for each organism is searchable through the site: http://www.dna.affrc.go.jp/~nakamura/CUTG.html
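As a rough sketch of how these conventions might be handled in a script, the Python below builds a species file name and splits a two-line entry. The helper names `species_file_name` and `split_entry` are hypothetical. The sketch also assumes that the second line of an entry holds whitespace-separated integer codon counts, which is not stated here; the authoritative field layout is the one given in the CODON_LABEL file, so the header fields are returned uninterpreted.

```python
def species_file_name(latin_name, division):
    """Join the words of the Latin name with underscores, then add a dot
    and the lower-case division name."""
    return "_".join(latin_name.split()) + "." + division.lower()


def split_entry(header_line, counts_line):
    """Split one two-line entry into header fields and codon counts.

    Header fields are separated by backslashes and are returned as plain
    strings. Assumption: the second line is a list of integers separated
    by whitespace.
    """
    fields = header_line.rstrip("\n").split("\\")
    counts = [int(token) for token in counts_line.split()]
    return fields, counts


print(species_file_name("Arabidopsis thaliana", "pln"))  # Arabidopsis_thaliana.pln
```

Returning the header fields as an unlabelled list avoids guessing at their order; a caller can attach names once the CODON_LABEL file has been consulted.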
ACKNOWLEDGEMENTS
We wish to thank Dr Y. Ugawa at the DNA Information and Stock Center, National Institute of Agrobiological Resources for his help in constructing and distributing the database. This work was supported in part by a grant-in-aid for databases from the Ministry of Education, Science, Sports and Culture of Japan. Y.N. is supported by the Kazusa DNA Research Institute Foundation.
REFERENCES
This page is run by Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, as part of the OUP Journals
Last modification: 9 Dec 1998
Copyright©Oxford University Press, 1998.