


Nucleic Acids Research Advance Access published online on February 20, 2008

Nucleic Acids Research, doi:10.1093/nar/gkn064
All versions of this article: 36/7/2284 (most recent); gkn064v1

© 2008 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.


Computational Biology

Empirical comparison of ab initio repeat finding programs

Surya Saha (1,2,3), Susan Bridges (1,3), Zenaida V. Magbanua (2,3,4) and Daniel G. Peterson (2,3,4,*)

(1) Department of Computer Science and Engineering, (2) Mississippi Genome Exploration Laboratory, (3) Institute for Digital Biology and (4) Department of Plant & Soil Sciences, Mississippi State University, Mississippi State, MS 39762, USA

*To whom correspondence should be addressed. Tel: +1 662 325 2747; Fax: +1 662 325 8742; Email: dpeterson{at}pss.msstate.edu

Received November 5, 2007. Revised January 30, 2008. Accepted January 31, 2008.

Identification of dispersed repetitive elements can be difficult, especially when elements share little or no homology with previously described repeats. Consequently, a growing number of computational tools have been designed to identify repetitive elements in an ab initio manner, i.e. without using prior sequence data. Here we present the results of side-by-side evaluations of six of the most widely used ab initio repeat finding programs. Using sequence from rice chromosome 12, we compared the tools with regard to time requirements, ability to find known repeats, utility in identifying potential novel repeats, number and types of repeat elements recognized, and compactness of family descriptions. The study reveals profound differences in the utility of the tools: some identified virtually their entire substrate as repetitive, others made reasonable estimates of repetition, and some missed almost all repeats. Notably, even when tools recognized similar numbers of repeats, they often showed marked differences in the nature and number of repeat families identified. Within the context of this comparative study, ReAS and RepeatScout showed the most promise in analysis of sequence reads and assembled genomic regions, respectively. Our results should help biologists identify the program or programs, if any, best suited to their needs.




This article has been cited by other articles:


C. Feschotte, U. Keswani, N. Ranganathan, M. L. Guibotsy, and D. Levine
Exploring Repetitive DNA Landscapes Using REPCLASS, a Tool That Automates the Classification of Transposable Elements in Eukaryotic Genomes
Gen Biol Evol, March 1, 2010; 2009(0): 205 - 220.


V. Becher, A. Deymonnaz, and P. Heiber
Efficient computation of all perfect repeats in genomic sequences of up to half a gigabyte, with a case study on the human genome
Bioinformatics, July 15, 2009; 25(14): 1746 - 1753.


