Nucleic Acids Research, 2003, Vol. 31, No. 15 4663-4672
© 2003 Oxford University Press

Correcting errors in shotgun sequences

Martti T. Tammi*, Erik Arner, Ellen Kindlund and Björn Andersson

Center for Genomics and Bioinformatics, Karolinska Institutet, Berzelius väg 35, 171 77 Stockholm, Sweden

*To whom correspondence should be addressed. Tel: +46 8 728 3986; Fax: +46 8 311620; Email: martti.tammi{at}cgb.ki.se

Sequencing errors in combination with repeated regions cause major problems in shotgun sequencing, mainly because assembly programs fail to distinguish single base differences between repeat copies from erroneous base calls. This paper presents a new strategy for correcting errors in shotgun sequence data using defined nucleotide positions (DNPs). The method distinguishes single base differences from sequencing errors by analyzing multiple alignments, each consisting of a read and all its overlaps with other reads. The multiple alignments are constructed with a novel pattern matching algorithm that takes advantage of the symmetry between indices that can be computed for similar words of the same length. This allows rapid construction of multiple alignments without prior pair-wise matching of sequence reads. Results from a C++ implementation of the method show that up to 99% of sequencing errors can be corrected, while up to 87% of the single base differences remain and up to 80% of the corrected reads contain at most one error. The results also show that the method outperforms the error correction method used in the EULER assembler. The prototype software, MisEd, is freely available from the authors for academic use.
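
The column-wise idea behind separating true single base differences from sequencing errors can be illustrated with a small sketch. This is not the MisEd algorithm or its DNP statistics, which the abstract does not specify. It is a simplified heuristic under stated assumptions: the multiple alignment is already given as equal-length strings, a minority base seen in only one read is treated as a likely error, and a minority base seen in at least `min_support` reads is kept as a candidate difference between repeat copies. The names `classify_columns`, `correct_read` and `min_support` are hypothetical.

```python
from collections import Counter


def classify_columns(alignment, min_support=2):
    """Label minority bases in each column of a multiple alignment.

    alignment: list of equal-length strings, one per read. '-' marks a gap
    and ' ' marks a position the read does not cover.
    Returns {column: (consensus_base, {minority_base: label})}.
    Assumption (not from the paper): a minority base seen in at least
    `min_support` reads is a candidate repeat difference, otherwise an error.
    """
    result = {}
    for col in range(len(alignment[0])):
        # Count the bases of all reads that cover this column.
        counts = Counter(row[col] for row in alignment if row[col] != ' ')
        if len(counts) < 2:
            continue  # all covering reads agree, nothing to classify
        consensus = counts.most_common(1)[0][0]
        labels = {}
        for base, n in counts.items():
            if base == consensus:
                continue
            labels[base] = ('candidate_difference' if n >= min_support
                            else 'likely_error')
        result[col] = (consensus, labels)
    return result


def correct_read(alignment, read_index, min_support=2):
    """Replace likely errors in one read with the column consensus."""
    calls = classify_columns(alignment, min_support)
    read = list(alignment[read_index])
    for col, (consensus, labels) in calls.items():
        if labels.get(read[col]) == 'likely_error':
            read[col] = consensus
    return ''.join(read)


# Toy alignment: column 3 holds a difference supported by two reads,
# column 6 holds a base seen in a single read.
alignment = [
    "ACGTACGT",
    "ACGTACGT",
    "ACGAACGT",
    "ACGAACGT",
    "ACGTACCT",
]
corrected_last = correct_read(alignment, 4)
unchanged_third = correct_read(alignment, 2)
```

In this toy input the lone `C` in the last read is replaced by the consensus `G`, while the `A` shared by the third and fourth reads is left in place. A real implementation would also need base quality values and the coincidence of differences across several columns, which this sketch ignores.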




