Nucleic Acids Research, 2001, Vol. 29, No. 1 255-259
© 2001 Oxford University Press
SpliceDB: database of canonical and non-canonical mammalian splice sites
The Sanger Centre, Hinxton, Cambridge CB10 1SA, UK and ¹Softberry Inc., 108 Corporate Park Drive, Suite 120, White Plains, NY 10604, USA
A database (SpliceDB) of known mammalian splice site sequences has been developed. We extracted 43 337 splice pairs from the mammalian divisions of the gene-centered Infogene database, including sites from incomplete or alternatively spliced genes. Known EST sequences supported 22 815 of them. After discarding sequences with putative errors and ambiguous locations of splice junctions, the verified dataset includes 22 489 entries. Of these, 98.71% contain canonical GTAG junctions (22 199 entries) and 0.56% have non-canonical GCAG splice site pairs. The remainder (0.73%) is spread over many small groups (each at most 0.05% of the entries). We paid particular attention to non-canonical splice sites, which comprise 3.73% of GenBank annotated splice pairs. EST alignments allowed us to verify only the exonic part of splice sites. To check the conserved dinucleotides, we compared sequences of human non-canonical splice sites with sequences from the high throughput genome sequencing project (HTG). Out of 171 human non-canonical and EST-supported splice pairs, 156 (91.23%) had a clear match in the human HTG. After sequence analysis, they can be classified as: 79 GCAG pairs (one of which was an error that was corrected to GCAG), 61 errors that were corrected to canonical GTAG pairs, six ATAC pairs (two of which were errors that were corrected to ATAC), one case produced from a non-existent intron, seven cases found in HTG that were deposited to GenBank, and finally only two other cases of supported non-canonical splice pairs. The information about verified splice site sequences for canonical and non-canonical sites is presented in SpliceDB together with the supporting evidence. We also built weight matrices for the major splice groups, which can be incorporated into gene prediction programs. SpliceDB is available at the computational genomic Web server of the Sanger Centre: http://genomic.sanger.ac.uk/spldb/SpliceDB.html and at http://www.softberry.com/spldb/SpliceDB.html.
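As an illustration of the grouping described in the abstract, the following minimal Python sketch labels an intron by its terminal dinucleotides (GTAG, GCAG, ATAC and so on), tallies group frequencies, and builds a simple weight matrix from aligned sites. It is not the SpliceDB code: the function names, the toy sequences, the uniform background and the pseudocount are all assumptions made for this example, and the abstract does not state how the authors' weight matrices were computed.

```python
import math
from collections import Counter


def classify_intron(intron: str) -> str:
    """Label an intron by its first two and last two bases, e.g. 'GTAG'."""
    seq = intron.upper()
    if len(seq) < 4:
        raise ValueError("intron must be at least 4 bases long")
    return seq[:2] + seq[-2:]


def summarize(introns):
    """Return {label: (count, percent)} ordered from most to least frequent."""
    counts = Counter(classify_intron(s) for s in introns)
    total = sum(counts.values())
    return {label: (n, 100.0 * n / total) for label, n in counts.most_common()}


def weight_matrix(sites, pseudocount=1.0):
    """Log2-odds matrix for aligned, equal-length sites.

    Assumptions (not from the source): a uniform 0.25 background,
    an additive pseudocount, and sequences containing only A/C/G/T.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    denom = len(sites) + 4 * pseudocount
    matrix = []
    for pos in range(length):
        column = Counter(s[pos].upper() for s in sites)
        matrix.append({
            base: math.log2(((column[base] + pseudocount) / denom) / 0.25)
            for base in "ACGT"
        })
    return matrix


# Toy introns, invented for illustration only.
toy_introns = ["GTAAGTCCCAG", "GCAAGTTTTAG", "gtaaaacag", "ATATCCTTAC"]
summary = summarize(toy_introns)
```

A gene prediction program could score a candidate site by summing the matrix entries for its bases, position by position; that scoring step is likewise a common convention rather than something the abstract specifies.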
* To whom correspondence should be addressed at present address: EOS Biotechnology, 225A Gateway Boulevard, South San Francisco, CA 94080, USA. Tel: +1 650 246 2331; Fax: +1 650 583 3881; Email: solovyev{at}eosbiotech.com
Present address: M. Burset, Institut Municipal d'Investigació Mèdica (IMIM), C/Dr Aiguader 80, 08003 Barcelona, Spain