Nucleic Acids Research Advance Access originally published online on September 20, 2007
Nucleic Acids Research 2007 35(18):e122; doi:10.1093/nar/gkm648
Nucleic Acids Research, 2007, Vol. 35, No. 18 e122
© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Methods Online
SAGETTARIUS: a program to reduce the number of tags mapped to multiple transcripts and to plan SAGE sequencing stages
1Plate-forme Bioinformatique de Strasbourg, Institut de Génétique et de Biologie Moléculaire et Cellulaire (CNRS/INSERM/ULP) BP 163, 67404 Illkirch Cedex, 2Inserm U682, Strasbourg, Laboratoire de Biochimie - Biologie Moléculaire, CHU Strasbourg - Hôpital de Hautepierre and 3Laboratoire de Bioinformatique et de Génomique Intégratives, Institut de Génétique et de Biologie Moléculaire et Cellulaire (CNRS/INSERM/ULP) BP 163, 67404 Illkirch Cedex, France
*To whom correspondence should be addressed. Tel: +33 388653271; Fax: +33 388653201; Email: Laurent.Bianchetti@igbmc.u-strasbg.fr
Received May 18, 2007. Revised July 20, 2007. Accepted August 6, 2007.
SAGE (Serial Analysis of Gene Expression) experiments generate short nucleotide sequences called tags, which are assumed to map unambiguously to their original transcripts (one-tag-to-one-transcript mapping). Nevertheless, many of the tags generated either map to no transcript or map to multiple transcripts. Current bioinformatics resources, such as SAGEmap and TAGmapper, have focused on reducing the number of unmapped tags. Here, we describe SAGETTARIUS, a new high-throughput program that performs successive, precise Nla3 and Sau3A tag-to-transcript mapping based on specifically designed Virtual Tag (VT) libraries. First, SAGETTARIUS decreases the number of tags mapped to multiple transcripts. Among the various mapping resources compared, SAGETTARIUS performed best in this respect, reducing the number of multiply mapped tags by up to 11%. Second, SAGETTARIUS makes it possible to establish a guideline for SAGE sequencing efforts through efficient mapping of tags specific to CRT (Cytoplasmic Ribosomal protein Transcripts). Using all publicly available human and mouse Nla3 SAGE experiments, we show that sequencing 100 000 tags is sufficient to map almost all CRT-specific tags and that four sequencing stages can be identified when carrying out a human or mouse SAGE project. SAGETTARIUS has a web interface and is freely accessible to academic users.
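The idea of mapping tags through a virtual tag library can be illustrated with a minimal Python sketch. It is not the SAGETTARIUS implementation, whose VT library design is not detailed here. The sketch assumes that a virtual tag is the 10 bases immediately downstream of the 3'-most anchoring-enzyme site in a transcript (CATG for Nla3, GATC for Sau3A), and all function names, the tag length and the toy sequences are hypothetical.

```python
from collections import defaultdict

# Recognition sites of the two anchoring enzymes named in the abstract.
ANCHOR_SITES = {"Nla3": "CATG", "Sau3A": "GATC"}

# Assumption: 10-base tags, as in short SAGE; not stated in the abstract.
TAG_LENGTH = 10


def virtual_tag(transcript_seq, enzyme, tag_length=TAG_LENGTH):
    """Return the tag adjacent to the 3'-most enzyme site, or None."""
    site = ANCHOR_SITES[enzyme]
    pos = transcript_seq.rfind(site)  # 3'-most occurrence of the site
    if pos == -1:
        return None  # transcript has no site for this enzyme
    start = pos + len(site)
    tag = transcript_seq[start:start + tag_length]
    # Simplification: discard sites too close to the 3' end.
    return tag if len(tag) == tag_length else None


def build_vt_library(transcripts, enzyme):
    """Map each virtual tag to the set of transcripts that produce it."""
    library = defaultdict(set)
    for name, seq in transcripts.items():
        tag = virtual_tag(seq.upper(), enzyme)
        if tag is not None:
            library[tag].add(name)
    return dict(library)


def classify_tags(observed_tags, library):
    """Split observed tags into unmapped, uniquely and multiply mapped."""
    result = {"unmapped": [], "unique": [], "multiple": []}
    for tag in observed_tags:
        hits = library.get(tag, set())
        if not hits:
            result["unmapped"].append(tag)
        elif len(hits) == 1:
            result["unique"].append(tag)
        else:
            result["multiple"].append(tag)
    return result


# Toy transcripts (invented sequences, for illustration only).
TOY_TRANSCRIPTS = {
    "geneA": "AAACATGACGTACGTACTTT",
    "geneB": "GGGCATGACGTACGTACCCC",
    "geneC": "TTTCATGTTTTGGGGCCAAA",
}

if __name__ == "__main__":
    nla3_library = build_vt_library(TOY_TRANSCRIPTS, "Nla3")
    observed = ["ACGTACGTAC", "TTTTGGGGCC", "CCCCCCCCCC"]
    print(classify_tags(observed, nla3_library))
```

In this toy data, geneA and geneB share the same Nla3 virtual tag, so an observed tag with that sequence falls into the "multiple" class. A second library built with `build_vt_library(transcripts, "Sau3A")` could, under the same assumptions, be used to compare how the two enzymes partition the transcripts.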