Nucleic Acids Research, 2000, Vol. 28, No. 3 744-754
© 2000 Oxford University Press
Positional characterisation of false positives from computational prediction of human splice sites
European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
The performance of computational tools that can predict human splice sites is reviewed using a test set of EST-confirmed splice sites. The programs (namely HMMgene, NetGene2, HSPL, NNSPLICE, SpliceView and GeneID-3) differ from one another in the degree of discriminatory information used for prediction. The results indicate that, as expected, HMMgene and NetGene2 (which use global as well as local coding information and splice signals), followed by HSPL (which uses local coding information and splice signals), performed better than the other three programs (which use only splice signals). For the former three programs, one in every three false positive splice sites was predicted in the vicinity of true splice sites, whereas only one in every 12 was expected to occur in such a region by chance. Whether this observation persists for programs that can predict all the potential exons, including optimal and sub-optimal ones (namely FEXH, GRAIL2, MZEF, GeneID-3, HMMgene and GENSCAN), was then assessed. In a high proportion (>50%) of the partially correct predicted exons, the incorrect exon ends were located in the vicinity of the real splice sites. Analysis of the distribution of proximal false positives indicated that the splice signals used by the algorithms are not strong enough to discriminate false predictions, particularly those that occur within ±25 nt of the real sites. It is therefore suggested that specialised statistics that can discriminate real splice sites from proximal false positives be incorporated in gene prediction programs.
* Tel: +44 1223 494650; Fax: +44 1223 494468; Email: thanaraj@ebi.ac.uk
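The comparison in the abstract between the observed share of proximal false positives (one in three) and the share expected by chance (one in 12) can be illustrated with a minimal sketch. It is not the method used in the paper. It assumes that "vicinity" means within ±25 nt of a true site (the window mentioned in the abstract) and that, by chance, false positives fall uniformly along the sequence. The function names `proximal_fraction` and `chance_fraction` and all example coordinates are hypothetical.

```python
from bisect import bisect_left


def proximal_fraction(false_positives, true_sites, window=25):
    """Fraction of false positive positions lying within +/- window nt
    of any true splice site (assumed definition of 'vicinity')."""
    if not false_positives:
        return 0.0
    sites = sorted(true_sites)

    def near(pos):
        # Only the true sites immediately before and after pos can be closest.
        i = bisect_left(sites, pos)
        neighbours = sites[max(i - 1, 0): i + 1]
        return any(abs(pos - s) <= window for s in neighbours)

    return sum(near(p) for p in false_positives) / len(false_positives)


def chance_fraction(true_sites, seq_length, window=25):
    """Fraction of positions 1..seq_length covered by the windows around
    true sites, i.e. the proximal share expected under uniform placement.
    Simplification: the true site positions themselves are counted too."""
    covered = 0
    last_end = 0  # right edge of the region already counted
    for s in sorted(true_sites):
        start = max(s - window, 1, last_end + 1)  # skip overlap with earlier windows
        end = min(s + window, seq_length)
        if end >= start:
            covered += end - start + 1
            last_end = end
    return covered / seq_length


# Hypothetical 600 nt sequence with two true sites and six false positives.
observed = proximal_fraction([90, 130, 310, 500, 20, 126], [100, 300])
expected = chance_fraction([100, 300], seq_length=600)
print(observed, expected)
```

An observed fraction well above the chance fraction, as reported in the abstract, would indicate that false positives cluster around real sites rather than scattering uniformly.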


