Nucleic Acids Research, 2002, Vol. 30, No. 19 4321-4328
© 2002 Oxford University Press

A comparison of profile hidden Markov model procedures for remote homology detection

Martin Madera* and Julian Gough

MRC Laboratory of Molecular Biology, Hills Road, Cambridge CB2 2QH, UK

*To whom correspondence should be addressed. Tel: +44 1223 402479; Fax: +44 1223 213556; Email: mm238{at}mrc-lmb.cam.ac.uk
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.

Profile hidden Markov models (HMMs) are amongst the most successful procedures for detecting remote homology between proteins. There are two popular profile HMM programs, HMMER and SAM. Little is known about their performance relative to each other and to the recently improved version of PSI-BLAST. Here we compare the two programs to each other and to non-HMM methods, to determine their relative performance and the features that are important for their success. The quality of the multiple sequence alignments used to build models was the most important factor affecting the overall performance of profile HMMs. The SAM T99 procedure is needed to produce high-quality alignments automatically, and the lack of an equivalent component in HMMER makes it less complete as a package. Using the default options and parameters, as would be expected of an inexpert user, it was found that from identical alignments SAM consistently produces better models than HMMER and that the relative performance of the model-scoring components varies. On average, HMMER was found to be between one and three times faster than SAM when searching databases larger than 2000 sequences, SAM being faster on smaller ones. Both methods were shown to have effective low-complexity and repeat-sequence masking using their null models, and the accuracy of their E-values was comparable. It was found that the SAM T99 iterative database search procedure performs better than the most recent version of PSI-BLAST, but that scoring of PSI-BLAST profiles is more than 30 times faster than scoring of SAM models.





