Nucleic Acids Research, 2003, Vol. 31, No. 18 5338-5348
© 2003 Oxford University Press
Patterns of nucleotide substitution, insertion and deletion in the human genome inferred from pseudogenes
Department of Molecular Biophysics and Biochemistry, Yale University, 266 Whitney Avenue, New Haven, CT 06520-8114, USA
*To whom correspondence should be addressed. Tel: +1 203 432 6105; Fax: +1 360 838 7861; Email: mark.gerstein{at}yale.edu
Nucleotide substitution, insertion and deletion (indel) events are the major driving forces that have shaped genomes. Using the recently identified human ribosomal protein (RP) pseudogene sequences, we have thoroughly studied DNA mutation patterns in the human genome. We analyzed a total of 1726 processed RP pseudogene sequences, comprising more than 700 000 bases. To differentiate the sequence changes that occurred in the functional genes during evolution from those that occurred in pseudogenes after they were fixed in the genome, we used only pseudogene sequences originating from parts of RP genes that are identical in human and mouse. Overall, we found that nucleotide transitions are more common than transversions, by roughly a factor of two. Moreover, the substitution rates amongst the 12 possible nucleotide pairs are not homogeneous, as they are affected by the type of immediately neighboring nucleotides and the overall local G+C content. Finally, our dataset is large enough to contain many indels, allowing, for the first time, a statistically robust analysis of these events. Overall, we found that deletions are about three times more common than insertions (3740 versus 1291). The frequencies of both these events follow characteristic power-law behavior associated with the size of the indel. However, unexpectedly, the frequency of 3 bp deletions (in contrast to 3 bp insertions) violates this trend, being considerably higher than that of 2 bp deletions. The possible biological implications of such a 3 bp bias are discussed.
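As an illustration of the quantities named in the abstract, the following is a minimal sketch, not the authors' pipeline. It classifies substitutions in a gapped pairwise alignment as transitions (purine to purine, or pyrimidine to pyrimidine) or transversions, and computes the deletion-to-insertion ratio from the counts reported above. The function names and the toy alignment are hypothetical; the alignment is assumed to be two equal-length strings with `-` marking gaps.

```python
from itertools import permutations

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_substitution(ref: str, alt: str) -> str:
    """Return 'transition' or 'transversion' for a single-base change."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES:
        raise ValueError(f"not a nucleotide pair: {ref!r}, {alt!r}")
    if ref == alt:
        raise ValueError("ref and alt are identical; no substitution")
    # A transition stays within one chemical class; a transversion crosses it.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def count_substitutions(gene: str, pseudogene: str) -> dict:
    """Count transitions and transversions in a gapped pairwise alignment.

    Hypothetical helper: columns containing a gap or a non-ACGT symbol
    are skipped, as are columns where the two bases match.
    """
    if len(gene) != len(pseudogene):
        raise ValueError("aligned sequences must have equal length")
    counts = {"transition": 0, "transversion": 0}
    for g, p in zip(gene.upper(), pseudogene.upper()):
        if g not in BASES or p not in BASES or g == p:
            continue
        counts[classify_substitution(g, p)] += 1
    return counts


# Of the 12 ordered nucleotide pairs, 4 are transitions and 8 are transversions.
ALL_PAIRS = list(permutations(sorted(BASES), 2))
N_TRANSITIONS = sum(classify_substitution(a, b) == "transition" for a, b in ALL_PAIRS)

# Indel counts as reported in the abstract.
DELETIONS, INSERTIONS = 3740, 1291
DEL_INS_RATIO = DELETIONS / INSERTIONS  # close to 2.9

# Toy alignment (made up for illustration only).
toy_counts = count_substitutions("ACGTAC", "GCTTA-")
```

Because only 4 of the 12 possible changes are transitions, an observed transition excess of about twofold is a stronger bias than the raw ratio alone suggests.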