Nucleic Acids Research Advance Access published online on June 27, 2007

Nucleic Acids Research, doi:10.1093/nar/gkm414
© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.


Computational Biology

The identification of complete domains within protein sequences using accurate E-values for semi-global alignment

Maricel G. Kann, Sergey L. Sheetlin, Yonil Park, Stephen H. Bryant and John L. Spouge*

National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Department of Health and Human Services, Bethesda, MD 20894, USA

*To whom correspondence should be addressed. Tel: 301 402 9310; Fax: 301 480 2484; Email: spouge{at}ncbi.nlm.nih.gov

Received December 14, 2006. Revised May 3, 2007. Accepted May 4, 2007.

The sequencing of complete genomes has created a pressing need for automated annotation of gene function. Because domains are the basic units of protein function and evolution, a gene can be annotated from a domain database by aligning domains to the corresponding protein sequence. Ideally, complete domains are aligned to protein subsequences, in a ‘semi-global alignment’. In high-throughput annotation applications, however, local alignment, which aligns pieces of domains to subsequences, is common. It is a mature technique, with the heuristics and accurate E-values required for screening large databases and evaluating the screening results. Hidden Markov models (HMMs) provide an alternative theoretical framework for semi-global alignment, but their use is limited because they lack heuristic acceleration and accurate E-values. Our new tool, GLOBAL, overcomes some limitations of previous semi-global HMMs: it has accurate E-values and the possibility of the heuristic acceleration required for high-throughput applications. Moreover, according to a standard of truth based on protein structure, two semi-global HMM alignment tools (GLOBAL and HMMer) had comparable performance in identifying complete domains, but distinctly outperformed two tools based on local alignment. When searching for complete protein domains, therefore, GLOBAL avoids disadvantages commonly associated with HMMs, yet maintains their superior retrieval performance.
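
To make the distinction between semi-global and local alignment concrete, the following is a minimal illustrative sketch of semi-global alignment by plain dynamic programming. It is not the GLOBAL tool or its HMM formulation: the function name `semi_global_score`, the match/mismatch scores and the linear gap penalty are all assumptions chosen for illustration. The whole domain must be aligned (unaligned domain residues are penalized), while the alignment may start and end anywhere in the protein at no cost.

```python
def semi_global_score(domain, protein, match=2, mismatch=-1, gap=-2):
    """Score the best alignment of the COMPLETE domain to any protein subsequence.

    Hypothetical scoring scheme (match/mismatch/linear gap) for illustration only.
    Returns (best score, protein position where the alignment ends).
    """
    m, n = len(domain), len(protein)
    # Row 0 is all zeros: the alignment may start anywhere in the protein for free.
    prev = [0] * (n + 1)
    for i in range(1, m + 1):
        # Column 0: the first i domain residues aligned to nothing are penalized,
        # because the domain must be covered end to end.
        curr = [i * gap] + [0] * n
        for j in range(1, n + 1):
            s = match if domain[i - 1] == protein[j - 1] else mismatch
            curr[j] = max(
                prev[j - 1] + s,    # align domain[i-1] with protein[j-1]
                prev[j] + gap,      # domain residue opposite a gap
                curr[j - 1] + gap,  # protein residue opposite a gap
            )
        prev = curr
    # Last row: the domain is fully consumed; the end point in the protein is free.
    best_end = max(range(n + 1), key=lambda j: prev[j])
    return prev[best_end], best_end


if __name__ == "__main__":
    print(semi_global_score("ACD", "XXACDXX"))  # complete domain present
    print(semi_global_score("ACD", "XXAC"))     # domain truncated in the protein
```

Under these assumed scores, a protein containing only a piece of the domain scores lower than one containing the complete domain, because the missing residues incur gap penalties. A local alignment would instead report the matching piece without any such penalty, which is why it cannot by itself distinguish complete domains from fragments.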

