Nucleic Acids Research, Vol 25, Issue 8 1626-1632, Copyright © 1997 by Oxford University Press
TG Wolfsberg and D Landsman
The Expressed Sequence Tag (EST) division of GenBank, dbEST, is a large
repository of the data being generated by human genome sequencing centers.
ESTs are short, single-pass cDNA sequences generated from randomly selected
library clones. The approximately 415 000 human ESTs represent a valuable,
low-priced, and easily accessible biological reagent. As many ESTs are
derived from as yet uncharacterized genes, dbEST is a prime starting point for
the identification of novel mRNAs. Conversely, other genes are represented
by hundreds of ESTs, a redundancy which may provide data about rare mRNA
isoforms. Here we present an analysis of >1000 ESTs generated by the
WashU-Merck EST project. These ESTs were collected by querying dbEST with
the genomic sequences of 15 human genes. When we aligned the matching ESTs
to the genomic sequences, we found that in one gene, 73% of the ESTs which
derive from spliced or partially spliced transcripts either contain intron
sequences or are spliced at previously unreported sites; other genes have
lower percentages of such ESTs, and some have none. This finding suggests
that ESTs could provide researchers with novel information about
alternative splicing in certain genes. In a related analysis of pairs of
ESTs which are reported to derive from a single gene, we found that as many
as 26% of the pairs do not both align with the sequence of the same gene.
We suspect that some of these unusual ESTs result from artifacts in EST
generation, and caution researchers that they may find such clones while
analyzing sequences in dbEST.
ARTICLES
A comparison of expressed sequence tags (ESTs) to human genomic sequences
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, Room 8N-807, Bethesda, MD 20894, USA.





