Nucleic Acids Research
Codon usage tabulated from the international DNA sequence databases
ABSTRACT
CUTG consists of lists of the codon usage of individual genes and of the summed codon usage for each organism. In September 1997, CUTG contained 155 623 genes for 6048 organisms. The database has been compiled from the nucleotide sequences in the latest major release of the GenBank sequence database (1). The divisions used are pri (primate), rod (rodent), mam (other mammalian), vrt (other vertebrate), inv (invertebrate), pln (plant), bct (bacterial), vrl (viral) and phg (phage). Other divisions that do not represent a taxonomic grouping (such as sts for STS or syn for synthesized sequences) have been excluded from the compilation. In selecting protein coding sequences we relied on the feature tables of the GenBank flat files. Codons that contain one or more letters for ambiguous bases were excluded from the count. Partially sequenced protein genes were not compiled.
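The counting rule described above (codons containing an ambiguous base letter are excluded) can be illustrated with a minimal Python sketch. This is not the code used to build CUTG: the function name `count_codons` is hypothetical, and the choices to treat only A, C, G and T as unambiguous and to ignore a trailing incomplete codon are assumptions made for illustration.

```python
from collections import Counter

# Assumption: only these four letters count as unambiguous bases.
UNAMBIGUOUS = set("ACGT")

def count_codons(cds: str) -> Counter:
    """Count codons in a coding sequence, skipping ambiguous ones.

    Hypothetical helper: reads the sequence in frame from position 0
    and ignores any trailing bases that do not form a full codon.
    """
    seq = cds.upper()
    counts = Counter()
    usable = len(seq) - len(seq) % 3  # drop an incomplete last codon
    for i in range(0, usable, 3):
        codon = seq[i:i + 3]
        # Exclude codons with one or more ambiguous letters (e.g. N, R, Y).
        if set(codon) <= UNAMBIGUOUS:
            counts[codon] += 1
    return counts
```

For a made-up sequence such as `atgaantaa`, the sketch counts `ATG` and `TAA` once each and leaves `AAN` out of the tally.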
Files of the database are available by anonymous ftp. Files named gb***.codon, where '***' is a GenBank division name (e.g. bct), list the codon use of each gene registered in the GenBank flat files. An entry for a gene has two lines. The first line consists of information, separated by backslashes, that is extracted from the feature table entry defining each CDS (protein coding sequence). If a LOCUS contains more than one gene, the symbol # followed by a number is added after the LOCUS name; the number represents the order of the CDS in the feature table. The second line consists of the codon counts for the CDS. The order of the codons is the same as in the previous compilation (2) and is described in the CODON_LABEL file.
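The two-line entry layout of the gb***.codon files can be read with a short Python sketch such as the following. It rests on assumptions not stated in the text: that the counts on the second line are whitespace-separated integers, and that blank lines can be skipped. The names `parse_entries` and `split_locus` are hypothetical, and the meaning of each backslash-separated field is left to the caller because it is not listed here.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

def parse_entries(lines: Iterable[str]) -> Iterator[Tuple[List[str], List[int]]]:
    """Yield (header_fields, codon_counts) for each two-line entry.

    Assumption: counts are whitespace-separated integers, listed in the
    order given by the CODON_LABEL file (not reproduced here).
    """
    rows = [ln.rstrip("\n") for ln in lines if ln.strip()]
    # Pair each header line with the count line that follows it.
    for header, counts in zip(rows[0::2], rows[1::2]):
        yield header.split("\\"), [int(x) for x in counts.split()]

def split_locus(name: str) -> Tuple[str, Optional[int]]:
    """Split 'LOCUS#n' into (LOCUS, n); n is None when no '#' is present."""
    locus, sep, order = name.partition("#")
    return locus, (int(order) if sep else None)
```

With a made-up locus name, `split_locus("AB000001#2")` returns the locus name and the position of the CDS within that LOCUS. Checking the length of each count list against the labels in CODON_LABEL would be a sensible extra validation step.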
To show the characteristics of codon use across a wide range of species, as well as viruses and organelles, the codon use in each organism was summed. Files named gb***.spsum list the summed codon use for each organism. An entry for an organism has two lines. The first line consists of the Latin name of the organism and the number of CDSs used in the sum. The second line consists of the summed codon counts for the organism, in the same codon order as in the gb***.codon files.
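The per-organism summation behind the gb***.spsum files amounts to an element-wise sum of count vectors plus a tally of CDSs. A minimal Python sketch, assuming the caller has already paired each gene's counts with its organism name (the function name `sum_by_organism` is hypothetical, and this is not the code used to build the database):

```python
from typing import Dict, Iterable, List, Sequence, Tuple

def sum_by_organism(
    entries: Iterable[Tuple[str, Sequence[int]]]
) -> Dict[str, Tuple[int, List[int]]]:
    """Map organism name -> (number of CDSs, element-wise summed counts).

    Assumption: every count vector uses the same codon order, so that
    position i always refers to the same codon.
    """
    totals: Dict[str, Tuple[int, List[int]]] = {}
    for organism, counts in entries:
        if organism not in totals:
            totals[organism] = (1, list(counts))
            continue
        n_cds, acc = totals[organism]
        if len(acc) != len(counts):
            raise ValueError(f"count vectors differ in length for {organism}")
        # Add this gene's counts to the running total, codon by codon.
        totals[organism] = (n_cds + 1, [a + b for a, b in zip(acc, counts)])
    return totals
```

Each value in the returned dictionary carries the two pieces of information an spsum entry holds: the number of CDSs summed and the summed codon counts.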
The complete form of the database is available from the following URLs:
(i) DDBJ (DNA Data Bank of Japan, National Institute of Genetics, Mishima Japan) ftp://ftp.nig.ac.jp/pub/db/codon/current/
(ii) DISC (DNA Information and Stock Center, National Institute of Agrobiological Resources, Tsukuba, Japan) ftp://ftp.dna.affrc.go.jp/pub/codon/current/
(iii) EBI (European Bioinformatics Institute, Cambridge, UK) ftp://ftp.ebi.ac.uk/pub/databases/cutg/
A split file for each organism is searchable by the Latin name of the organism at http://www.dna.affrc.go.jp/~nakamura/CUTG.html. Comments on the database can be sent to cutg@lab.nig.ac.jp.
ACKNOWLEDGEMENTS
We wish to thank Dr Y.Ugawa at the DNA Information and Stock Center, National Institute of Agrobiological Resources for help in constructing and distributing the database. This work was supported in part by a grant-in-aid for databases from the Ministry of Education, Science, Sports and Culture of Japan. Y.N. is supported by the Kazusa DNA Research Institute Foundation.
REFERENCES
Last modification: 17 Dec 1997
Copyright© Oxford University Press, 1998.