Nucleic Acids Research Advance Access originally published online on September 18, 2006
Nucleic Acids Research 2006 34(17):e116; doi:10.1093/nar/gkl601
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Nucleic Acids Research, 2006, Vol. 34, No. 17 e116
© 2006 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Methods Online |
AccuTyping: new algorithms for automated analysis of data from high-throughput genotyping with oligonucleotide microarrays
Department of Molecular Genetics, Microbiology and Immunology, University of Medicine and Dentistry of New Jersey-Robert Wood Johnson Medical School SRB 110, 661 Hoes Lane, Piscataway, NJ 08854, USA
*To whom correspondence should be addressed. Tel: +1 732 235 7330; Fax: +1 732 235 5223; Email: holi{at}umdnj.edu
Received March 9, 2006. Revised July 11, 2006. Accepted July 31, 2006.
Microarray-based analysis of single nucleotide polymorphisms (SNPs) has many applications in large-scale genetic studies. To minimize the influence of experimental variation, microarray data usually need to be processed in different aspects including background subtraction, normalization and low-signal filtering before genotype determination. Although many algorithms are sophisticated for these purposes, biases are still present. In the present paper, new algorithms for SNP microarray data analysis and the software, AccuTyping, developed based on these algorithms are described. The algorithms take advantage of a large number of SNPs included in each assay, and the fact that the top and bottom 20% of SNPs can be safely treated as homozygous after sorting based on their ratios between the signal intensities. These SNPs are then used as controls for color channel normalization and background subtraction. Genotype calls are made based on the logarithms of signal intensity ratios using two cutoff values, which were determined after training the program with a dataset of
160 000 genotypes and validated by non-microarray methods. AccuTyping was used to determine >300 000 genotypes of DNA and sperm samples. The accuracy was shown to be >99%. AccuTyping can be downloaded from http://www2.umdnj.edu/lilabweb/publications/AccuTyping.html.