Nucleic Acids Research, 2003, Vol. 31, No. 3 1033-1037
© 2003 Oxford University Press
Identification of pseudogenes in the Drosophila melanogaster genome
Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, PO Box 208114, New Haven, CT 06520-8114, USA
*To whom correspondence should be addressed. Tel: +1 203 432 5065; Fax: +1 509 691 6906; Email: harrison{at}csb.yale.edu
Pseudogenes are copies of genes that cannot produce a protein. They can be detected from disruptions to their apparent coding sequence, caused by frameshifts and premature stop codons. They are classed as either processed pseudogenes (made by reverse transcription from an mRNA) or duplicated pseudogenes, which arise from duplication in the genomic DNA and subsequent disablement. Historically, there is anecdotal evidence that the fruit fly (Drosophila melanogaster) has few pseudogenes. Investigators have linked this to a high deletion rate of genomic DNA, for which there is evidence from genetic experiments on genome size. Here, we apply to the fruit fly a homology-based pipeline that was developed previously to identify pseudogenes in other eukaryotic genomes, so as to derive the first complete survey of its pseudogene population. We find approximately 100 pseudogenes, with at least a sixth of these as candidate processed pseudogenes. This gives a much lower proportion of pseudogenes (relative to the size of the proteome) than in the genomes of other eukaryotes for which data are available (human, nematode and budding yeast). The closest matching proteins to Drosophila pseudogenes are significantly longer than the average protein in its proteome (up to 60% more than the average protein length), in contrast to the situation in the three other eukaryotic genomes. This may be due to the persistence of fragments of longer genes. In the fly pseudogene population, we found most pseudogenes for serine proteases (which are more abundant in the Drosophila lineage than in the other eukaryotes), immunoglobulin-motif-containing proteins and cytochromes P450. Data on the sequences and positions of the putative pseudogenes are available at: http://www.pseudogene.org/fly. The detection of a small number of pseudogenes in the Drosophila genome and the higher mean length for the closest matching proteins to pseudogenes (possibly because remnants of genes encoding longer proteins are more likely to persist) are further evidence for a high deletion rate of genomic DNA in the fruit fly. The data are useful for studies of molecular evolution in Drosophila.
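As an illustration of one kind of disablement mentioned above, the following minimal Python sketch scans a nucleotide sequence in a given reading frame and reports in-frame stop codons that occur before the final codon. It is not the homology-based pipeline used in this study, which identifies disablements from matches to known proteins; the function name `premature_stops` and the example sequences are hypothetical, and detecting frameshifts would additionally require an alignment to the parent protein.

```python
# Standard stop codons in the universal genetic code (DNA alphabet).
STOP_CODONS = {"TAA", "TAG", "TGA"}


def premature_stops(seq, frame=0):
    """Return 0-based nucleotide positions of in-frame stop codons
    that occur before the last complete codon of the sequence.

    Hypothetical helper for illustration only: `seq` is a DNA string
    and `frame` (0, 1 or 2) is the offset of the first codon.
    """
    seq = seq.upper()
    # Split the sequence into complete codons, keeping each start position.
    codons = [(i, seq[i:i + 3]) for i in range(frame, len(seq) - 2, 3)]
    # The last complete codon may legitimately be a stop, so skip it.
    return [pos for pos, codon in codons[:-1] if codon in STOP_CODONS]


if __name__ == "__main__":
    intact = "ATGGCCAAATAA"       # ATG GCC AAA TAA: only the terminal stop
    disabled = "ATGTAAGCCAAATAA"  # ATG TAA GCC AAA TAA: stop at position 3
    print(premature_stops(intact))    # []
    print(premature_stops(disabled))  # [3]
```

A candidate sequence with a non-empty result in the frame of its closest matching protein would be flagged as carrying a premature stop; an intact coding sequence returns an empty list.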






