Nucleic Acids Research, Vol 24, Issue 17 3439-3452, Copyright © 1996 by Oxford University Press
SM Hebsgaard, PG Korning, N Tolstrup, J Engelbrecht, P Rouze and S Brunak
Artificial neural networks have been combined with a rule-based system to
predict intron splice sites in the dicot plant Arabidopsis thaliana. A
two-step prediction scheme, where a global prediction of the coding potential
regulates a cutoff level for a local prediction of splice sites, is refined
by rules based on splice site confidence values, prediction scores, coding
context and distances between potential splice sites. In this approach, the
predictions of splice sites mutually affect each other in a non-local
manner. The combined approach drastically reduces the large number of false
positive splice sites normally haunting splice site prediction. An analysis
of the errors made by the networks in the first step of the method revealed
a previously unknown feature, a frequent T-tract prolongation containing
cryptic acceptor sites in the 5' end of exons. The method presented here
has been compared with three other approaches, GeneFinder, GeneMark and
Grail. Overall, the method presented here is an order of magnitude better.
We show that the new method is able to find a donor site in the coding
sequence for the jellyfish Green Fluorescent Protein, exactly at the
position that was experimentally observed in A. thaliana transformants.
Predictions for alternatively spliced genes are also presented, together
with examples of genes from other dicots, monocots and algae. The method
has been made available through electronic mail (NetPlantGene@cbs.dtu.dk),
or the WWW at http://www.cbs.dtu.dk/NetPlantGene.html
Splice site prediction in Arabidopsis thaliana pre-mRNA by combining local and global sequence information
Center for Biological Sequence Analysis, Technical University of Denmark, Lyngby, Denmark.
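
The first step of the scheme summarized in the abstract, where a global coding-potential signal regulates the cutoff applied to local splice site scores, might be sketched roughly as follows. This is only an illustration and not the NetPlantGene implementation: the function names, the linear cutoff formula, and the parameters `base_cutoff`, `slope` and `window` are all assumptions, and the rule-based refinement step is not covered.

```python
from statistics import mean


def coding_drop(coding, i, window):
    """Mean coding potential just upstream of i minus the mean from i onward."""
    upstream = coding[max(0, i - window):i]
    downstream = coding[i:i + window]
    if not upstream or not downstream:
        return 0.0  # no context available at the sequence ends
    return mean(upstream) - mean(downstream)


def predict_sites(local_scores, coding, kind,
                  base_cutoff=0.8, slope=0.5, window=3):
    """Return positions whose local score exceeds a context-dependent cutoff.

    Assumption (hypothetical rule): the cutoff is lowered where the global
    coding potential changes in the direction expected for the site type,
    i.e. a drop for donors (exon -> intron) and a rise for acceptors
    (intron -> exon), and raised where it changes the other way.
    """
    if kind not in ("donor", "acceptor"):
        raise ValueError("kind must be 'donor' or 'acceptor'")
    sign = 1.0 if kind == "donor" else -1.0
    hits = []
    for i, score in enumerate(local_scores):
        transition = sign * coding_drop(coding, i, window)
        cutoff = base_cutoff - slope * transition
        if score > cutoff:
            hits.append(i)
    return hits


# Toy input: coding potential falls from 1 to 0 at position 4.
coding = [1, 1, 1, 1, 0, 0, 0, 0]
local_scores = [0.5, 0.0, 0.0, 0.0, 0.5, 0.0, 0.0, 0.0]
donors = predict_sites(local_scores, coding, "donor")
acceptors = predict_sites(local_scores, coding, "acceptor")
```

In this toy input the two candidate positions have the same local score, but only the one sitting on a coding to non-coding transition passes as a donor, which shows how a global signal can suppress locally plausible false positives.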