Nucleic Acids Research Advance Access published online on May 18, 2009
Nucleic Acids Research, doi:10.1093/nar/gkp381
Methods Online
Identification of candidate regulatory SNPs by combination of transcription-factor-binding site prediction, SNP genotyping and haploChIP
1The Linnaeus Centre for Bioinformatics, 2Department of Genetics and Pathology, Rudbeck Laboratory, Uppsala University, Sweden and 3Interdisciplinary Centre for Mathematical and Computer Modelling, Warsaw University, Poland
*To whom correspondence should be addressed. Tel: +0046739246433; Fax: +0046184716698; Email: alvaro.rada{at}lcb.uu.se
Correspondence may also be addressed to Claes Wadelius. Tel: +0046184714076; Fax: +0046184714808; Email: claes.wadelius{at}genpat.uu.se
Present address: Alvaro Rada-Iglesias, The Linnaeus Centre for Bioinformatics, Uppsala University, Sweden.
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.
Received December 4, 2008. Revised April 24, 2009. Accepted April 27, 2009.
Disease-associated SNPs detected in large-scale association studies are frequently located in non-coding genomic regions, suggesting that they may be involved in transcriptional regulation. Here we describe a new strategy for detecting regulatory SNPs (rSNPs) by combining computational and experimental approaches. Whole-genome ChIP-chip data for USF1 were analyzed using a novel motif-finding algorithm called BCRANK. In total, 1754 binding sites were identified and 140 candidate rSNPs were found in the predicted sites. To validate their regulatory function, seven SNPs found to be heterozygous in at least one of four human cell samples were investigated by ChIP and sequence analysis (haploChIP). In four of five cases where the SNP was predicted to affect binding, USF1 was preferentially bound to the allele containing the consensus motif. Allelic differences in binding for other proteins and histone marks further reinforced the SNPs' regulatory potential. Moreover, for one of these SNPs, H3K36me3 and POLR2A levels at neighboring heterozygous SNPs indicated effects on transcription. Our strategy, which is entirely based on in vivo data for both the prediction and validation steps, can identify individual binding sites at base-pair resolution and predict rSNPs. Overall, this approach can help to pinpoint the causative SNPs in complex disorders where the associated haplotypes are located in regulatory regions. Availability: BCRANK is available from Bioconductor (http://www.bioconductor.org/).
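The step of asking whether a SNP inside a predicted site is likely to affect binding can be illustrated with a minimal sketch. It is not the BCRANK algorithm and not the pipeline used in this work: it simply checks each allele of a SNP against an IUPAC consensus string. The function names (`matches`, `allele_effect`), the use of the canonical E-box `CACGTG` as a stand-in consensus, and the example SNP are all assumptions made for illustration.

```python
# Hypothetical illustration only: NOT the BCRANK algorithm or the authors' pipeline.
# It tests which allele of a SNP keeps a predicted site compatible with a consensus.

# IUPAC nucleotide codes mapped to the bases they allow.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC", "B": "CGT", "D": "AGT",
    "H": "ACT", "V": "ACG", "N": "ACGT",
}


def matches(consensus, site):
    """Return True if `site` is compatible with `consensus` at every position."""
    return len(consensus) == len(site) and all(
        base in IUPAC[code] for code, base in zip(consensus, site)
    )


def allele_effect(consensus, site, snp_offset, ref, alt):
    """Report which allele(s) of a SNP inside a predicted site fit the consensus.

    `site` is the predicted site carrying the reference allele and
    `snp_offset` is the 0-based position of the SNP within that site.
    """
    if site[snp_offset] != ref:
        raise ValueError("reference allele does not match the site sequence")
    alt_site = site[:snp_offset] + alt + site[snp_offset + 1:]
    return {
        "ref_matches": matches(consensus, site),
        "alt_matches": matches(consensus, alt_site),
    }


# Assumed example: E-box consensus as a stand-in motif; the G>A SNP is invented.
result = allele_effect("CACGTG", "CACGTG", 3, "G", "A")
print(result)
```

A SNP for which only one allele matches the consensus would be a candidate for the haploChIP validation described above, where the prediction is that the factor is preferentially bound to the matching allele. A realistic analysis would score alleles against a position weight matrix rather than a hard consensus string.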