Nucleic Acids Research, 2001, Vol. 29, No. 2 351-361
© 2001 Oxford University Press
The estimation of statistical parameters for local alignment score distributions
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA and 1Department of Physics, University of California at San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0319, USA
The distribution of optimal local alignment scores of random sequences plays a vital role in evaluating the statistical significance of sequence alignments. These scores can be well described by an extreme-value distribution. The distribution's parameters depend upon the scoring system employed and the random letter frequencies; in general they cannot be derived analytically, but must be estimated by curve fitting. For obtaining accurate parameter estimates, a form of the recently described island method has several advantages. We describe this method in detail, and use it to investigate the functional dependence of these parameters on finite-length edge effects.
* To whom correspondence should be addressed. Tel: +1 301 496 2475; Fax: +1 301 480 9241; Email: altschul{at}ncbi.nlm.nih.gov
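
As a rough illustration of how the extreme-value parameters mentioned in the abstract are used once they have been estimated, the sketch below evaluates the standard extreme-value form for local alignment scores, E = K·m·n·exp(−λS) and P = 1 − exp(−E). The abstract itself does not state this formula; the function names, the argument names `lam` and `k`, and the numeric values in the usage lines are illustrative assumptions, and no finite-length edge-effect correction is applied.

```python
import math


def expected_count(score, m, n, lam, k):
    """Expected number of distinct local alignments scoring >= score
    when comparing random sequences of lengths m and n.

    lam and k are the extreme-value parameters (assumed already estimated,
    e.g. by curve fitting); no edge-effect correction is applied here.
    """
    return k * m * n * math.exp(-lam * score)


def p_value(score, m, n, lam, k):
    """Probability of at least one alignment scoring >= score by chance.

    Uses P = 1 - exp(-E); expm1 keeps precision when E is very small.
    """
    return -math.expm1(-expected_count(score, m, n, lam, k))


if __name__ == "__main__":
    # Placeholder parameter values chosen only to exercise the functions.
    lam, k = 0.267, 0.041
    for s in (40, 50, 60):
        print(s, expected_count(s, 300, 300, lam, k), p_value(s, 300, 300, lam, k))
```

For small E the two quantities nearly coincide, which is why either is commonly reported as the significance of a score.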