Nucleic Acids Research, 2003, Vol. 31, No. 19 5654-5666
© 2003 Oxford University Press
Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies
The Institute for Genomic Research, 9712 Medical Center Drive, Rockville, MD 20850, USA, 1 Department of Cell Biology and Molecular Genetics, University of Maryland, College Park, MD 20742, USA and 2 The Center for Advancement of Genomics, 1901 Research Boulevard, Rockville, MD 20850, USA
*To whom correspondence should be addressed. Tel: +1 301 610 5988; Fax: +1 301 838 0208; Email: bhaas{at}tigr.org
The spliced alignment of expressed sequence data to genomic sequence has proven a key tool in the comprehensive annotation of genes in eukaryotic genomes. A novel algorithm was developed to assemble clusters of overlapping transcript alignments (ESTs and full-length cDNAs) into maximal alignment assemblies, thereby comprehensively incorporating all available transcript data and capturing subtle splicing variations. Complete and partial gene structures identified by this method were used to improve The Institute for Genomic Research Arabidopsis genome annotation (TIGR release v.4.0). The alignment assemblies permitted the automated modeling of several novel genes and >1000 alternative splicing variations as well as updates (including UTR annotations) to nearly half of the 27 000 annotated protein coding genes. The algorithm of the Program to Assemble Spliced Alignments (PASA) tool is described, as well as the results of automated updates to Arabidopsis gene annotations.
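The idea of assembling overlapping, mutually compatible transcript alignments can be illustrated with a minimal Python sketch. This is not the PASA algorithm itself, whose details are not given in the abstract. It is a simplified, order-dependent pass under several assumptions: each alignment is a sorted list of inclusive exon coordinates on a single strand, two alignments count as compatible when they overlap and share exactly the same introns within the overlapping region, and an alignment is merged into every assembly it is compatible with. The names `introns`, `compatible`, `merge` and `assemble` are hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Exon = Tuple[int, int]      # inclusive genomic coordinates (start, end)
Alignment = List[Exon]      # exons sorted by start, non-overlapping


def introns(aln: Alignment) -> List[Exon]:
    """Return the gaps between consecutive exons as (start, end) intervals."""
    return [(prev_end + 1, next_start - 1)
            for (_, prev_end), (next_start, _) in zip(aln, aln[1:])]


def compatible(a: Alignment, b: Alignment) -> bool:
    """True if a and b overlap and agree on every intron touching the overlap."""
    lo = max(a[0][0], b[0][0])
    hi = min(a[-1][1], b[-1][1])
    if lo > hi:                       # spans do not overlap at all
        return False

    def introns_in_overlap(aln: Alignment) -> set:
        return {(s, e) for s, e in introns(aln) if s <= hi and e >= lo}

    return introns_in_overlap(a) == introns_in_overlap(b)


def merge(a: Alignment, b: Alignment) -> Alignment:
    """Union the exons of two compatible alignments, fusing overlapping exons."""
    merged: Alignment = []
    for start, end in sorted(a + b):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def assemble(alignments: List[Alignment]) -> List[Alignment]:
    """Simplified pass: extend every compatible assembly, else start a new one."""
    assemblies: List[Alignment] = []
    for aln in sorted(alignments, key=lambda x: x[0][0]):
        placed = False
        for i, asm in enumerate(assemblies):
            if compatible(asm, aln):
                assemblies[i] = merge(asm, aln)
                placed = True
        if not placed:
            assemblies.append(list(aln))
    return assemblies


# Toy data (invented): est3 uses a different acceptor site than est1,
# and est2 extends both variants with a downstream exon.
est1 = [(100, 200), (300, 400)]
est2 = [(350, 400), (500, 600)]
est3 = [(100, 200), (320, 400)]
result = assemble([est1, est2, est3])
```

In this toy case the two splice variants are kept as separate assemblies and each is extended by the downstream exon from `est2`. Because the pass depends on input order and does not remove duplicate assemblies, it should be read only as an illustration of the compatibility-and-merge idea, not as a substitute for the published method.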