
Nucleic Acids Research, 2003, Vol. 31, No. 8, pp. 2217-2226
© 2003 Oxford University Press

Transcript identification by analysis of short sequence tags—influence of tag length, restriction site and transcript database

Per Unneberg, Anders Wennborg1 and Magnus Larsson

Department of Biotechnology, Royal Institute of Technology (KTH), Roslagsvägen 30B, S-106 91 Stockholm, Sweden and 1 Department of Biosciences, Karolinska Institute, Novum, S-141 57 Huddinge, Sweden

*To whom correspondence should be addressed. Tel: +46 8 5537 8347; Fax: +46 8 5537 8481; Email: peru{at}biotech.kth.se

There exist a number of gene expression profiling techniques that utilize restriction enzymes for generation of short expressed sequence tags. We have studied how the choice of restriction enzyme influences various characteristics of tags generated in an experiment. We have also investigated various aspects of in silico transcript identification that these profiling methods rely on. First, analysis of 14 248 mRNA sequences derived from the RefSeq transcript database showed that 1–30% of the sequences lack a given restriction enzyme recognition site. Moreover, 1–5% of the transcripts have recognition sites located less than 10 bases from the poly(A) tail. The uniqueness of 10 bp tags lies in the range 90–95%, which increases only slightly with longer tags, due to the existence of closely related transcripts. Furthermore, 3–30% of upstream 10 bp tags are identical to 3' tags, introducing a risk of misclassification if upstream tags are present in a sample. Second, we found that a sequence length of 16–17 bp, including the recognition site, is sufficient for unique transcript identification by BLAST based sequence alignment to the UniGene Human non-redundant database. Third, we constructed a tag-to-gene mapping for UniGene and compared it to an existing mapping database. The mappings agreed to 79–83%, where the selection of representative sequences in the UniGene clusters is the main cause of the disagreement. The results of this study may serve to improve the interpretation of sequence-based expression studies and the design of hybridization arrays, by identifying short tags that have a high reliability and separating them from tags that carry an inherent ambiguity in their capacity to discriminate between genes. To this end, supplementary information in the form of a web companion to this paper is located at http://biobase.biotech.kth.se/tagseq.
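The kind of in silico tag analysis described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' pipeline. It assumes the recognition site CATG (the NlaIII site commonly used in SAGE; the abstract does not name specific enzymes), that a tag consists of the 10 bases immediately downstream of the 3'-most site, and that input sequences have had their poly(A) tails removed. The function names `three_prime_tag` and `tag_statistics` are hypothetical.

```python
from collections import Counter


def three_prime_tag(seq, site="CATG", tag_len=10):
    """Return the tag following the 3'-most occurrence of `site`.

    Returns None if the site is absent or lies too close to the
    3' end to yield a full-length tag (assumed convention).
    """
    seq = seq.upper()
    pos = seq.rfind(site)  # 3'-most occurrence of the recognition site
    if pos == -1:
        return None
    start = pos + len(site)
    tag = seq[start:start + tag_len]
    return tag if len(tag) == tag_len else None


def tag_statistics(transcripts, site="CATG", tag_len=10):
    """Summarize tag coverage and uniqueness for a dict {name: sequence}."""
    tags = {name: three_prime_tag(seq, site, tag_len)
            for name, seq in transcripts.items()}
    tagged = {name: tag for name, tag in tags.items() if tag is not None}
    counts = Counter(tagged.values())
    # A tag is unique if exactly one transcript produces it.
    unique = sum(1 for tag in tagged.values() if counts[tag] == 1)
    return {
        "no_tag_fraction": 1 - len(tagged) / len(tags),
        "unique_fraction": unique / len(tagged) if tagged else 0.0,
    }


if __name__ == "__main__":
    # Toy sequences for illustration only, not real transcripts.
    toy = {
        "t1": "AAACATGACGTACGTACTT",
        "t2": "GGGGGGGGGGGG",
        "t3": "TTCATGACGTACGTACAA",
        "t4": "CATGTTTTTTTTTTCC",
    }
    print(tag_statistics(toy))
```

In this toy set, t2 lacks the site and t1 and t3 share a tag, so only t4 is uniquely identified. Note that the sketch counts "site absent" and "site too close to the 3' end" together, whereas the study reports them separately.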




