Nucleic Acids Research, 1989, Vol. 17, No. 10 3951-3957
© 1989


MOLECULAR BIOLOGY

Sequence errors described in GenBank: a means to determine the accuracy of DNA sequence interpretation

Stephen A. Krawetz

Department of Molecular Biology and Genetics and The Center for Molecular Biology, Wayne State University, 4th Floor MCHT, Laboratory 13, 2727 2nd Avenue, Detroit, MI 48201, USA

Received November 14, 1988. Revised April 18, 1989. Accepted April 18, 1989.

The accuracy of nucleic acid sequence data interpretation was determined by assessing and quantifying the discrepancies reported in the GenBank database. This permitted the calculation of an Error Rate (ER) for nucleic acid sequence determination. If one assumes that most entries (TB, Total Bases) were independently verified, or that those without reported discrepancies were correct, the ER is 0.368 errors per 1000 bases. However, if one assumes that only those sequences with reported discrepancies (TBIQ, Total Bases from entries In Question) were verified and are thus correct, the ER is 2.887 errors per 1000 bases. Together, these values establish the first set of limiting bounds on the ER for sequence interpretation and sequence errors within the GenBank database, and provide the foundation for future assessments and the monitoring of sequence data accumulation. In addition, the ER measure provides a basis to evaluate the efficiency and merit of present and future automated nucleic acid sequencing technologies, which will have a direct impact upon the final outcome of the "Human Genome Initiative".
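
As an illustration of how the two ER values relate, the calculation can be sketched as the number of reported errors divided by a base count, scaled to 1000 bases. The abstract does not give the underlying error or base counts, so the numbers below are hypothetical placeholders chosen only to show the arithmetic, and the function name `error_rate_per_kb` is likewise an assumption, not part of the original work.

```python
def error_rate_per_kb(errors: int, bases: int) -> float:
    """Return the number of errors per 1000 bases."""
    if bases <= 0:
        raise ValueError("bases must be positive")
    return 1000.0 * errors / bases


# Hypothetical counts, NOT taken from the article.
reported_errors = 50          # discrepancies reported in the database
total_bases = 100_000         # TB: bases in all entries
bases_in_question = 20_000    # TBIQ: bases in entries with reported discrepancies

# Lower bound: every entry is assumed verified, so the denominator is TB.
er_lower = error_rate_per_kb(reported_errors, total_bases)

# Upper bound: only entries with reported discrepancies are assumed verified,
# so the denominator shrinks to TBIQ.
er_upper = error_rate_per_kb(reported_errors, bases_in_question)

print(f"ER (TB):   {er_lower:.3f} errors per 1000 bases")
print(f"ER (TBIQ): {er_upper:.3f} errors per 1000 bases")
```

If the same error count underlies both reported figures, their ratio (2.887 / 0.368, roughly 7.8) would equal TB / TBIQ, which would mean that entries with reported discrepancies account for roughly one eighth of the total bases considered. This is an inference from the abstract's two numbers, not a figure stated in it.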






