Nucleic Acids Research Advance Access published online on March 13, 2007
Nucleic Acids Research, doi:10.1093/nar/gkm081
Genomics
EST assembly supported by a draft genome sequence: an analysis of the Chlamydomonas reinhardtii transcriptome
1The Carnegie Institution, Department of Plant Biology, 260 Panama Street, Stanford, CA 94305, USA, 2Biology Department, Duke University, Durham, NC 27708, USA, 3Institut de Biologie Physico-Chimique, UMR7141 CNRS/Université Pierre et Marie Curie-Paris6, 13 Rue Pierre et Marie Curie, 75005 Paris, France and 4St. Edwards University, Department of Biology, Austin, TX 78704, USA
*To whom correspondence should be addressed. Tel: +33 1 5841 5058; Fax: +33 1 5841 5022; Email: ovallon{at}ibpc.fr
Received December 21, 2006. Accepted January 26, 2007.
Clustering and assembly of expressed sequence tags (ESTs) constitute the basis for most genomewide descriptions of a transcriptome. This approach is limited by the decline in sequence quality toward the end of each EST, impacting both sequence clustering and assembly. Here, we exploit the available draft genome sequence of the unicellular green alga Chlamydomonas reinhardtii to guide clustering and to correct errors in the ESTs. We have grouped all available EST and cDNA sequences into 12 063 ACEGs (assembly of contiguous ESTs based on genome) and generated 15 857 contigs of average length 934 nt. We predict that roughly 3000 of our contigs represent full-length transcripts. Compared to previous assemblies, ACEGs show extended contig length, increased accuracy and a reduction in redundancy. Because our assembly protocol also uses ESTs with no corresponding genomic sequences, it provides sequence information for genes interrupted by sequence gaps. Detailed analysis of randomly sampled ACEGs reveals several hundred putative cases of alternative splicing, many overlapping transcription units and new genes not identified by gene prediction algorithms. Our protocol, although developed for and tailored to the C. reinhardtii dataset, can be exploited by any eukaryotic genome project for which both a draft genome sequence and ESTs are available.
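To illustrate the idea of genome-guided clustering described above, here is a minimal sketch in Python. It is not the authors' protocol: it assumes each EST has already been aligned to the draft genome and reduced to a single span, and that two ESTs belong together whenever their spans overlap on the same scaffold and strand. The `Hit` record and the `cluster_by_genome` function are hypothetical names introduced only for this example. ESTs with no genomic match, which the abstract says are also used in the assembly, are outside the scope of this sketch.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """Hypothetical record: one genomic span per aligned EST."""
    est_id: str
    scaffold: str
    strand: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def cluster_by_genome(hits):
    """Group ESTs whose genomic spans overlap on the same scaffold and strand."""
    by_locus = defaultdict(list)
    for h in hits:
        by_locus[(h.scaffold, h.strand)].append(h)

    clusters = []
    for key in sorted(by_locus):
        # Sweep left to right, extending the current cluster while spans overlap.
        group = sorted(by_locus[key], key=lambda h: (h.start, h.end))
        current, current_end = [], None
        for h in group:
            if current and h.start < current_end:
                current.append(h.est_id)
                current_end = max(current_end, h.end)
            else:
                if current:
                    clusters.append(current)
                current, current_end = [h.est_id], h.end
        if current:
            clusters.append(current)
    return clusters


if __name__ == "__main__":
    example = [
        Hit("est_a", "scaffold_1", "+", 0, 100),
        Hit("est_b", "scaffold_1", "+", 50, 150),
        Hit("est_c", "scaffold_1", "+", 200, 300),
        Hit("est_d", "scaffold_1", "-", 60, 120),
    ]
    for cluster in cluster_by_genome(example):
        print(cluster)
```

Keeping the strand in the grouping key means that transcripts overlapping on opposite strands end up in separate clusters, which is one simple way to keep overlapping transcription units apart. A real pipeline would also have to handle spliced alignments, sequencing errors and ESTs falling in genome gaps.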

