Nucleic Acids Research, 1984, Vol. 12, No. 13 5529-5543
© 1984
MOLECULAR BIOLOGY
On the statistical assessment of similarities in DNA sequences
Central Institute of Molecular Biology, Academy of Sciences of the GDR, Robert-Rössle-Str 10, DDR-1115 Berlin-Buch, GDR
+To whom correspondence should be addressed
Received January 27, 1984. Revised May 31, 1984. Accepted June 11, 1984.
The statistical behavior of the similarity score for unrelated DNA sequences, calculated either by letter-by-letter comparison or from various forms of optimal alignment, was studied. Natural DNA sequences from a database and truly random sequences were found to show the same statistical behavior in terms of such scores. This makes it possible to adopt a simple criterion for rejecting fortuitous similarity, based on the mean and standard deviation of chance scores. Their expected values depend on chain length, gap penalty and probability of letter coincidence, and may be calculated from formulae given in the paper.
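As an illustration of the kind of criterion described, the following is a minimal sketch for the simplest case only: an ungapped letter-by-letter comparison of two equal-length sequences. It assumes that positions are independent, so the number of chance coincidences is binomial with mean np and standard deviation sqrt(np(1-p)), where n is the chain length and p the probability of letter coincidence. This is a standard textbook model and not necessarily the formulae given in the paper; the optimal-alignment case with gap penalties is not covered. All function names are hypothetical.

```python
import math


def match_score(a: str, b: str) -> int:
    """Letter-by-letter score: number of positions with identical letters."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x == y for x, y in zip(a, b))


def coincidence_probability(freqs_a: dict, freqs_b: dict) -> float:
    """Probability that independently drawn letters from the two
    compositions coincide: sum over letters of f_a(letter) * f_b(letter)."""
    return sum(f * freqs_b.get(letter, 0.0) for letter, f in freqs_a.items())


def chance_mean_sd(n: int, p: float) -> tuple:
    """Mean and standard deviation of the chance score under the assumed
    binomial model (independent positions, coincidence probability p)."""
    return n * p, math.sqrt(n * p * (1.0 - p))


def z_score(score: float, n: int, p: float) -> float:
    """Distance of an observed score from the chance mean, in units of
    the chance standard deviation."""
    mean, sd = chance_mean_sd(n, p)
    return (score - mean) / sd


if __name__ == "__main__":
    # Hypothetical example sequences, equal base frequencies assumed.
    uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
    p = coincidence_probability(uniform, uniform)
    a = "ACGTACGTACGTACGTACGT"
    b = "ACGTTCGAACGTACCTACGA"
    print(z_score(match_score(a, b), len(a), p))
```

An observed score would then be judged by how many standard deviations it lies above the chance mean; the cutoff to apply is left open here, since the abstract does not state one.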