Nucleic Acids Research Advance Access published online on October 9, 2008
Nucleic Acids Research, doi:10.1093/nar/gkn684
Database Issue
ATGC: a database of orthologous genes from closely related prokaryotic genomes and a research platform for microevolution of prokaryotes
1Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, CA 94720, 2Department of Energy Joint Genome Institute, 2800 Mitchell Drive, Walnut Creek, CA 94598 and 3National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Building 38A, 8600 Rockville Pike, Bethesda, MD 20894, USA
*To whom correspondence should be addressed. Email: psnovichkov@lbl.gov
Received August 18, 2008. Revised September 22, 2008. Accepted September 23, 2008.
The database of Alignable Tight Genomic Clusters (ATGCs) consists of closely related genomes of archaea and bacteria and is a resource for research into prokaryotic microevolution. Constructing a data set with appropriate characteristics is a major hurdle for this type of study. At the current rate of genome sequencing, it is difficult to follow the progress of the field and to determine which of the available genome sets meet the requirements of a given research project, in particular with respect to the minimum and maximum levels of similarity between the included genomes. In addition, extracting specific content, such as genomic alignments or families of orthologs, from a selected set of genomes is a complicated and time-consuming process. The database addresses these problems by providing an intuitive and efficient web interface for browsing precomputed ATGCs, selecting appropriate ones and accessing ATGC-derived data, such as multiple alignments of orthologous proteins and matrices of pairwise intergenomic distances based on genome-wide analysis of synonymous and nonsynonymous substitution rates. The ATGC database will be regularly updated following new releases of NCBI RefSeq. The database is hosted by the Genomics Division at Lawrence Berkeley National Laboratory and is publicly available at http://atgc.lbl.gov.
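The selection criterion mentioned above, a minimum and a maximum level of similarity between the included genomes, can be illustrated with a minimal sketch. It assumes a matrix of pairwise intergenomic distances is already available locally. The function names, genome labels and distance values are hypothetical, chosen only for illustration; they are not part of the ATGC interface or data.

```python
from itertools import combinations


def pairwise_range(genomes, dist):
    """Return the (smallest, largest) pairwise distance among the genomes."""
    values = [dist[frozenset(pair)] for pair in combinations(genomes, 2)]
    return min(values), max(values)


def meets_requirements(genomes, dist, min_dist, max_dist):
    """True if every pairwise distance lies within [min_dist, max_dist]."""
    lo, hi = pairwise_range(genomes, dist)
    return lo >= min_dist and hi <= max_dist


# Hypothetical distances between three made-up genomes (illustrative only).
dist = {
    frozenset(("A", "B")): 0.05,
    frozenset(("A", "C")): 0.30,
    frozenset(("B", "C")): 0.28,
}

if __name__ == "__main__":
    print(meets_requirements(["A", "B", "C"], dist, 0.01, 0.5))
    print(meets_requirements(["A", "B", "C"], dist, 0.10, 0.5))
```

Keying the matrix by unordered pairs keeps the lookup symmetric, so each distance is stored only once.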