Nucleic Acids Research Advance Access originally published online on October 26, 2006
Nucleic Acids Research 2006 34(20):5966-5973; doi:10.1093/nar/gkl731
Nucleic Acids Research, 2006, Vol. 34, No. 20 5966-5973
Published by Oxford University Press 2006
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Computational Biology
Retrieval accuracy, statistical significance and compositional similarity in protein sequence database searches
National Center for Biotechnology Information, National Library of Medicine NIH, DHHS, Bethesda, MD 20894, USA
*To whom correspondence should be addressed. Tel: +301 435 7803; Fax: +301 480 2288; Email: altschul{at}ncbi.nlm.nih.gov
Received July 27, 2006. Revised September 15, 2006. Accepted September 21, 2006.
Protein sequence database search programs may be evaluated both for their retrieval accuracy (the ability to separate meaningful from chance similarities) and for the accuracy of their statistical assessments of reported alignments. However, methods for improving statistical accuracy can degrade retrieval accuracy by discarding compositional evidence of sequence relatedness. This evidence may be preserved by combining essentially independent measures of alignment and compositional similarity into a unified measure of sequence similarity. A version of the BLAST protein database search program, modified to employ this new measure, outperforms the baseline program in both retrieval and statistical accuracy on ASTRAL, a SCOP-based test set.
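The abstract does not state how the two measures are combined. As a purely illustrative sketch, and assuming each measure is expressed as a P-value and that the two are independent, one standard way to merge them is to take the P-value of their product: for independent uniform variables, P(XY <= t) = t(1 - ln t). The function name and the inputs below are hypothetical and are not taken from the article or from BLAST.

```python
import math


def combine_independent_pvalues(p_alignment: float, p_composition: float) -> float:
    """Return the P-value of the product of two independent P-values.

    Assumption (not from the article): both inputs are valid P-values that are
    uniform on [0, 1] under the null model and independent of each other.
    For such variables, P(X * Y <= t) = t * (1 - ln t).
    """
    for p in (p_alignment, p_composition):
        if not 0.0 <= p <= 1.0:
            raise ValueError("P-values must lie in [0, 1]")
    product = p_alignment * p_composition
    if product == 0.0:
        # The limit of t * (1 - ln t) as t approaches 0 is 0.
        return 0.0
    return product * (1.0 - math.log(product))


if __name__ == "__main__":
    # Hypothetical inputs: a modest alignment P-value and a modest
    # compositional-similarity P-value.
    print(combine_independent_pvalues(0.01, 0.1))
```

Under this rule, a composition P-value near 1 weakens the combined significance relative to the alignment P-value alone, while a small one strengthens it, which is one way compositional evidence could contribute to a unified score.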

