Nucleic Acids Research Advance Access originally published online on December 17, 2007
Nucleic Acids Research 2008 36(3):861-871; doi:10.1093/nar/gkm1102
Nucleic Acids Research, 2008, Vol. 36, No. 3 861-871
© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Genomics
Diversity of preferred nucleotide sequences around the translation initiation codon in eukaryote genomes
1Department of Systems Biology, School of Biomedical Science, 2Department of Bioinformatics, Medical Research Institute, Tokyo Medical and Dental University, Yushima, Tokyo, 3Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Mishima, Shizuoka, 4Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, Aomi, Tokyo and 5Department of Medical Genome Science, Graduate School of Frontier Science, University of Tokyo, Kashiwa, Chiba, Japan
*To whom correspondence should be addressed. Tel: +81 3 5803 5839; Fax: +81 3 5803 0247; Email: htanaka{at}bioinfo.tmd.ac.jp
Received August 21, 2007. Revised November 2, 2007. Accepted November 27, 2007.
Understanding regulatory mechanisms of protein synthesis in eukaryotes is essential for the accurate annotation of genome sequences. Kozak reported that the nucleotide sequence GCCGCC(A/G)CCAUGG (AUG is the initiation codon) was frequently observed in vertebrate genes and that this consensus sequence enhanced translation initiation. However, later studies using invertebrate, fungal and plant genes reported different consensus sequences. In this study, we conducted extensive comparative analyses of nucleotide sequences around the initiation codon by using genomic data from 47 eukaryote species including animals, fungi, plants and protists. The analyses revealed that preferred nucleotide sequences are quite diverse among different species, but differences between patterns of nucleotide bias roughly reflect the evolutionary relationships of the species. We also found strong biases of A/G at position –3, A/C at position –2 and C at position +5 that were commonly observed in all species examined. Genes with higher expression levels showed stronger signals, suggesting that these nucleotides are responsible for the regulation of translation initiation. The diversity of preferred nucleotide sequences around the initiation codon might be explained by differences in relative contributions from two distinct patterns, GCCGCCAUG and AAAAAAAUG, which implies the presence of multiple molecular mechanisms for controlling translation initiation.
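The position-specific biases described above (for example, A/G at position –3 or C at position +5) can be illustrated with a small frequency tabulation. The following is only a minimal sketch, not the authors' pipeline: the function names `position_label` and `position_frequencies`, the toy sequences and the choice of six upstream nucleotides are all assumptions made for illustration. Positions are numbered with the A of AUG as +1 and the nucleotide immediately upstream as –1, with no position 0.

```python
from collections import Counter


def position_label(index, upstream):
    """Map a 0-based string index to a position relative to the A of AUG.

    The A of AUG is +1 and the preceding nucleotide is -1 (no position 0).
    `upstream` is the number of nucleotides that precede AUG in each sequence.
    """
    offset = index - upstream
    return offset if offset < 0 else offset + 1


def position_frequencies(seqs, upstream):
    """Return {position: {nucleotide: relative frequency}} for aligned sequences.

    Hypothetical helper: every sequence must carry exactly `upstream`
    nucleotides before its initiation codon. DNA input (T) is converted to U.
    """
    counts = {}
    for seq in seqs:
        seq = seq.upper().replace("T", "U")
        if seq[upstream:upstream + 3] != "AUG":
            raise ValueError(f"no AUG at expected offset in {seq!r}")
        for i, nt in enumerate(seq):
            counts.setdefault(position_label(i, upstream), Counter())[nt] += 1
    return {
        pos: {nt: n / sum(c.values()) for nt, n in c.items()}
        for pos, c in counts.items()
    }


# Toy input (invented for illustration): six upstream nucleotides, AUG,
# then positions +4 and +5.
toy_seqs = ["GCCGCCAUGGC", "AAAAAAAUGGC", "GCCACCAUGGC"]
freqs = position_frequencies(toy_seqs, upstream=6)

for pos in sorted(freqs):
    print(pos, freqs[pos])
```

Applied to real data, the same table would be built from annotated start codons of one species at a time, and the per-position frequencies would then be compared against the background nucleotide composition before any bias is claimed.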