GeneAlign: a coding exon prediction tool based on phylogenetical comparisons
1 Department of Computer Science, National Tsing Hua University, Hsinchu, Taiwan 300, ROC
2 Institute of Molecular and Cellular Biology and Department of Life Science, National Tsing Hua University, Hsinchu, Taiwan 300, ROC
*To whom correspondence should be addressed. Tel: 886 3 5731077; Fax: 886 3 5723694; Email: cytang{at}cs.nthu.edu.tw
Received February 14, 2006. Revised March 4, 2006. Accepted April 11, 2006.
GeneAlign is a coding exon prediction tool that predicts protein coding genes by measuring the homology between a genomic sequence and related, previously annotated sequences from other genomes. Identifying protein coding genes is one of the most important tasks in analyzing newly sequenced genomes. As the number of experimentally verified gene annotations grows, it becomes feasible to identify genes in a newly sequenced genome by comparing it with the annotated genes of phylogenetically close organisms. GeneAlign applies CORAL, a heuristic linear time alignment tool, to determine whether regions flanked by candidate signals (initiation codon-GT, AG-GT and AG-STOP codon) are similar to annotated coding exons. Exploiting both the conservation of gene structures and the sequence homology between protein coding regions increases prediction accuracy. GeneAlign was tested on the Projector dataset of 491 human-mouse homologous sequence pairs. At the gene level, the average sensitivity and the average specificity of GeneAlign are both 81%; at the exon level, both exceed 96%. The rates of missing exons and wrong exons are both below 1%. GeneAlign is a free tool available at http://genealign.hccvs.hc.edu.tw.
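To make the three candidate signal pairs named in the abstract concrete, the following minimal Python sketch enumerates regions flanked by initiation codon-GT, AG-GT and AG-STOP codon on the forward strand. It is an illustration only, not GeneAlign's implementation: the function names `find_all` and `candidate_regions` and the `min_len` parameter are hypothetical, ATG is assumed as the initiation codon and TAA/TAG/TGA as the stop codons, and reading-frame checks are omitted. The step that scores each candidate against annotated exons (done with CORAL in GeneAlign) is not shown.

```python
START = "ATG"                      # assumed initiation codon
STOPS = ("TAA", "TAG", "TGA")      # assumed stop codons


def find_all(seq, motif):
    """Return every 0-based start index of motif in seq (overlaps allowed)."""
    hits = []
    i = seq.find(motif)
    while i != -1:
        hits.append(i)
        i = seq.find(motif, i + 1)
    return hits


def candidate_regions(seq, min_len=3):
    """Enumerate (kind, start, end) half-open regions flanked by candidate signals.

    initial : initiation codon ... GT donor   -> [atg, donor)
    internal: AG acceptor ... GT donor        -> [ag + 2, donor)
    terminal: AG acceptor ... stop codon      -> [ag + 2, stop + 3)
    Forward strand only; no reading-frame or in-frame-stop checks (assumption).
    """
    seq = seq.upper()
    starts = find_all(seq, START)
    donors = find_all(seq, "GT")
    # A region that follows an acceptor begins at the first base after "AG".
    acceptors = [i + 2 for i in find_all(seq, "AG")]
    stops = sorted(i for codon in STOPS for i in find_all(seq, codon))

    regions = []
    for a in starts:
        # An initial region must at least contain the whole initiation codon.
        regions += [("initial", a, d) for d in donors if d - a >= max(min_len, 3)]
    for a in acceptors:
        regions += [("internal", a, d) for d in donors if d - a >= min_len]
        regions += [("terminal", a, s + 3) for s in stops
                    if s >= a and s + 3 - a >= min_len]
    return regions


if __name__ == "__main__":
    for kind, start, end in candidate_regions("ATGCCGTAAAGCCCGTAAAGCCTGA"):
        print(kind, start, end)
```

Because every signal pair is enumerated, the number of candidates grows quickly with sequence length; this is why a fast comparison against annotated exons, as the abstract describes, is needed to filter them.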


