Nucleic Acids Research, 1992, Vol. 20, No. 11 2741-2747
© 1992


MOLECULAR BIOLOGY

Corruption of genomic databases with anomalous sequence

Edward D. Lamperti, J. Matthew Kittelberger, Temple F. Smith1,+ and Lydia Villa-Komaroff*

Department of Neurology, Children's Hospital, Harvard Medical School, Boston, MA 02115, USA and 1Molecular Biology Computer Research Resource (MBCRR), Dana-Farber Cancer Institute, Harvard School of Public Health, Boston, MA 02115, USA

*To whom correspondence should be addressed at: Department of Neurology, Enders 250, Children's Hospital, 300 Longwood Avenue, Boston, MA 02115, USA

Received March 3, 1992. Revised May 8, 1992. Accepted May 8, 1992.

We describe evidence that DNA sequences from vectors used for cloning and sequencing have been incorporated accidentally into eukaryotic entries in the GenBank database. These incorporations were not restricted to one type of vector or to a single mechanism. Many minor instances may have been the result of simple editing errors, but some entries contained large blocks of vector sequence that had been incorporated by contamination or other accidents during cloning. Some cases involved unusual rearrangements and areas of vector distant from the normal insertion sites. Matches to vector were found in 0.23% of 20,000 sequences analyzed in GenBank Release 63. Although the possibility of anomalous sequence incorporation has been recognized since the inception of GenBank and should be easy to avoid, recent evidence suggests that this problem is increasing more quickly than the database itself. The presence of anomalous sequence may have serious consequences for the interpretation and use of database entries, and will have an impact on issues of database management. The incorporated vector fragments described here may also be useful for a crude estimate of the fidelity of sequence information in the database. In alignments with well-defined ends, the matching sequences showed 96.8% identity to vector; when poorer matches with arbitrary limits were included, the aggregate identity to vector sequence was 94.8%.
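The two quantities reported above, matches to vector and percent identity within those matches, can be illustrated with a minimal Python sketch. This is not the search procedure used in the study, which the abstract does not describe. It assumes a simple exact k-mer seed screen and a column-wise identity count over pre-aligned, gapped strings. The function names, the seed length `k=12`, and the toy sequences are all hypothetical.

```python
def shared_kmer_positions(entry, vector, k=12):
    """Return start positions in `entry` whose k-mer also occurs in `vector`.

    A crude seed screen (an assumption for illustration): runs of
    consecutive hits suggest a vector-derived segment in the entry.
    """
    vector_kmers = {vector[i:i + k] for i in range(len(vector) - k + 1)}
    return [i for i in range(len(entry) - k + 1) if entry[i:i + k] in vector_kmers]


def count_matches(aligned_a, aligned_b):
    """Count identical, non-gap columns in two aligned strings ('-' is a gap)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    return sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")


def percent_identity(aligned_a, aligned_b):
    """Percent identity of one alignment, over all aligned columns."""
    if not aligned_a:
        return 0.0
    return 100.0 * count_matches(aligned_a, aligned_b) / len(aligned_a)


def aggregate_identity(alignments):
    """Pooled percent identity: total matching columns over total columns."""
    total_columns = sum(len(a) for a, _ in alignments)
    total_matches = sum(count_matches(a, b) for a, b in alignments)
    return 100.0 * total_matches / total_columns


# Toy data: a short made-up "vector" and an entry carrying 20 bases of it.
toy_vector = "GAATTCGAGCTCGGTACCCGGGGATCCTCTAGAGTCGACCTGCAGGCATGCAAGCTT"
toy_entry = "ATGGCA" + toy_vector[10:30] + "TTTAAA"

hits = shared_kmer_positions(toy_entry, toy_vector, k=12)
single = percent_identity("ACGT", "ACGA")
pooled = aggregate_identity([("ACGT", "ACGA"), ("AAAA", "AAAA")])
```

Pooling columns across alignments, rather than averaging per-alignment percentages, weights long matches more heavily. Whether the aggregate figure in the abstract was computed this way is not stated. For scale, 0.23% of 20,000 sequences is roughly 46 entries.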


+Present address: MBCRR, Molecular Engineering Research Center, Boston University, 36 Cummington St., Boston, MA 02215, USA




Disclaimer:
Please note that abstracts for content published before 1996 were created through digital scanning and may therefore not exactly replicate the text of the original print issues. All efforts have been made to ensure accuracy, but the Publisher will not be held responsible for any remaining inaccuracies. If you require any further clarification, please contact our Customer Services Department.