
Nucleic Acids Research, 2000, Vol. 28, No. 21 4364-4375
© 2000 Oxford University Press

Analysis of canonical and non-canonical splice sites in mammalian genomes

M. Burset, I. A. Seledtsov and V. V. Solovyev*

Informatic Division, The Sanger Centre, Hinxton, Cambridge, CB10 1SA, UK

A set of 43 337 splice junction pairs was extracted from mammalian GenBank annotated genes. Expressed sequence tag (EST) sequences support 22 489 of them. Of these, 98.71% contain canonical dinucleotides GT and AG for donor and acceptor sites, respectively; 0.56% hold non-canonical GC-AG splice site pairs; and the remaining 0.73% occur in many small groups (each no larger than 0.05%). Studying these groups, we observe that many of them contain splicing dinucleotides shifted from the annotated splice junction by one position. After close examination of such cases we present a new classification consisting of only eight observed types of splice site pairs (out of 256 a priori possible combinations). EST alignments allow us to verify the exonic part of the splice sites, but many non-canonical cases may be due to intron sequencing errors. This idea is given substantial support when we compare the sequences of human genes having non-canonical splice sites with the sequences deposited in GenBank by high throughput genome sequencing projects (HTG). A high proportion (156 out of 171) of the human non-canonical and EST-supported splice site sequences had a clear match in the human HTG. After corrections, they can be classified as: 79 GC-AG pairs (of which one was an error that corrected to GC-AG); 61 errors that were corrected to GT-AG canonical pairs; six AT-AC pairs (of which two were errors that corrected to AT-AC); one case produced from a non-existent intron; seven cases found in HTG that were deposited to GenBank; and, finally, only two remaining cases of supported non-canonical splice sites. If we assume that approximately the same situation holds for the whole set of annotated mammalian non-canonical splice sites, then 99.24% of splice site pairs should be GT-AG, 0.69% GC-AG, 0.05% AT-AC and only 0.02% could consist of other types of non-canonical splice sites.
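The extrapolated percentages can be reproduced from the counts given above with a short calculation. The following Python sketch is one possible reading, not the authors' stated procedure: it assumes that the 1.29% of EST-supported pairs that are non-canonical (0.56% + 0.73%) is redistributed in proportion to the HTG-corrected counts, and that the eight cases that do not resolve to a splice site pair type (one non-existent intron, seven found in HTG) are left out of the denominator. The function and variable names are hypothetical.

```python
# Observed percentages among the 22 489 EST-supported pairs (from the abstract).
CANONICAL_GT_AG = 98.71
NON_CANONICAL = 0.56 + 0.73  # GC-AG pairs plus all remaining small groups

# Outcome of checking 156 human non-canonical pairs against HTG sequences.
htg_outcomes = {
    "GT-AG": 61,  # annotation errors that corrected to canonical pairs
    "GC-AG": 79,
    "AT-AC": 6,
    "other": 2,   # remaining supported non-canonical pairs
    "non-existent intron": 1,
    "found in HTG deposited to GenBank": 7,
}

# ASSUMPTION: only outcomes that resolve to a splice site pair type enter
# the denominator (61 + 79 + 6 + 2 = 148); the other eight are excluded.
CLASSIFIED = ("GT-AG", "GC-AG", "AT-AC", "other")


def extrapolate(canonical, non_canonical, outcomes, classified=CLASSIFIED):
    """Redistribute the non-canonical percentage by HTG-corrected share."""
    total = sum(outcomes[k] for k in classified)
    shares = {k: non_canonical * outcomes[k] / total for k in classified}
    shares["GT-AG"] += canonical  # corrected errors join the canonical pool
    return {k: round(v, 2) for k, v in shares.items()}


estimates = extrapolate(CANONICAL_GT_AG, NON_CANONICAL, htg_outcomes)
print(estimates)
```

Under these assumptions the result matches the figures quoted in the abstract (99.24% GT-AG, 0.69% GC-AG, 0.05% AT-AC, 0.02% other). Dividing by all 156 matched cases instead gives slightly lower values, which is why the 148-case denominator is assumed here.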
We analyze several characteristics of EST-verified splice sites and build weight matrices for the major groups, which can be incorporated into gene prediction programs. We also present a set of EST-verified canonical splice sites larger by two orders of magnitude than the current one (22 199 entries versus ~600) and, finally, a set of 290 EST-supported non-canonical splice sites. Both sets should be significant for future investigations of the splicing mechanism.
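To illustrate what a weight matrix for a splice site group looks like, here is a minimal Python sketch. The abstract does not say how the matrices were built, so the details are assumptions: a log-odds score against a uniform background, a pseudocount of 1 per base, and a handful of invented donor site sequences standing in for real EST-verified sites. The names `weight_matrix`, `score` and `donor_sites` are hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"


def weight_matrix(sites, pseudocount=1.0):
    """Return one {base: log2 odds} dict per position of the aligned sites."""
    if not sites:
        raise ValueError("need at least one aligned site")
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    background = 1.0 / len(BASES)  # ASSUMPTION: uniform base composition
    total = len(sites) + pseudocount * len(BASES)
    matrix = []
    for pos in range(length):
        counts = Counter(s[pos] for s in sites)
        matrix.append({
            b: math.log2((counts[b] + pseudocount) / total / background)
            for b in BASES
        })
    return matrix


def score(matrix, candidate):
    """Sum the per-position log-odds; candidate may contain only A, C, G, T."""
    if len(candidate) != len(matrix):
        raise ValueError("candidate length must match the matrix")
    return sum(col[base] for col, base in zip(matrix, candidate))


# Toy donor sites (3 exonic + 6 intronic bases), invented for illustration.
donor_sites = ["CAGGTAAGT", "AAGGTGAGT", "CAGGTAAGA", "TAGGTAAGT"]
donor_matrix = weight_matrix(donor_sites)

# A site with the GT dinucleotide should outscore one where it is replaced.
print(score(donor_matrix, "CAGGTAAGT") > score(donor_matrix, "CAGCCAAGT"))
```

A gene prediction program could use such a score as one signal among several when ranking candidate donor or acceptor positions. With real data, a separate matrix would be built for each major group (for example GT-AG and GC-AG donors).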

* To whom correspondence should be addressed at present address: EOS Biotechnology, 225A Gateway Boulevard, South San Francisco, CA 94080, USA. Tel: +1 650 246 2331; Fax: +1 650 583 3881; Email: solovyev@eosbiotech.com
Present addresses:
M. Burset, Institut Municipal d’Investigació Mèdica (IMIM), C/Dr Aiguader 80, 08003 Barcelona, Spain
I. A. Seledtsov, Institute of Cytology and Genetics, Novosibirsk, 630090, Russia




