
Nucleic Acids Research, 2003, Vol. 31, No. 23 7041-7055
© 2003 Oxford University Press


Article

Improving gene annotation of complete viral genomes

Ryan Mills1, Michael Rozanov3, Alexandre Lomsadze1, Tatiana Tatusova3 and Mark Borodovsky*,1,2

1 School of Biology and 2 School of Biomedical Engineering, Georgia Institute of Technology, Atlanta, GA 30332-0230, USA and 3 National Center for Biotechnology Information, National Institutes of Health, Bethesda, MD 20894, USA

*To whom correspondence should be addressed. Tel: +1 404 894 8432; Fax: +1 404 894 0519; Email: mark.borodovsky@biology.gatech.edu

Gene annotation in viruses often relies upon similarity search methods. These methods possess high specificity but some genes may be missed, either those unique to a particular genome or those highly divergent from known homologs. To identify potentially missing viral genes we have analyzed all complete viral genomes currently available in GenBank with a specialized and augmented version of the gene finding program GeneMarkS. In particular, by implementing genome-specific self-training protocols we have better adjusted the GeneMarkS statistical models to sequences of viral genomes. Hundreds of new genes were identified, some in well studied viral genomes. For example, a new gene predicted in the genome of the Epstein–Barr virus was shown to encode a protein similar to α-herpesvirus minor tegument protein UL14 with heat shock functions. Convincing evidence of this similarity was obtained after only 12 PSI-BLAST iterations. In another example, several iterations of PSI-BLAST were required to demonstrate that a gene predicted in the genome of Alcelaphine herpesvirus 1 encodes a BALF1-like protein which is thought to be involved in apoptosis regulation and, potentially, carcinogenesis. New predictions were used to refine annotations of viral genomes in the RefSeq collection curated by the National Center for Biotechnology Information. Importantly, even in those cases where no sequence similarities were detected, GeneMarkS significantly reduced the number of primary targets for experimental characterization by identifying the most probable candidate genes. The new genome annotations were stored in VIOLIN, an interactive database which provides access to similarity search tools for up-to-date analysis of predicted viral proteins.






