Nucleic Acids Research Advance Access published online on September 18, 2009
Nucleic Acids Research, doi:10.1093/nar/gkp747
Gene Regulation, Chromatin and Epigenetics
Primary sequence and epigenetic determinants of in vivo occupancy of genomic DNA by GATA1
1Center for Comparative Genomics and Bioinformatics, Huck Institutes of Life Sciences, 2Graduate Programs in Genetics, 3Graduate Programs in Cell and Developmental Biology, 4Department of Biochemistry and Molecular Biology, 5Graduate Programs in Bioinformatics and Genomics, The Pennsylvania State University, University Park, Pennsylvania, PA 16802, 6Department of Biology, Emory University, Atlanta, GA 30333 and 7Department of Statistics, The Pennsylvania State University, University Park, Pennsylvania, PA 16802, USA
*To whom correspondence should be addressed. Tel: +1 814 863 0113; Fax: +1 814 863 7024; Email: rch8{at}psu.edu
Received July 11, 2009. Revised August 22, 2009. Accepted August 24, 2009.
DNA sequence motifs and epigenetic modifications contribute to specific binding by a transcription factor, but the extent to which each feature determines occupancy in vivo is poorly understood. We addressed this question in erythroid cells by identifying DNA segments occupied by GATA1 and measuring the level of trimethylation of histone H3 lysine 27 (H3K27me3) and monomethylation of H3 lysine 4 (H3K4me1) along a 66 Mb region of mouse chromosome 7. While 91% of the GATA1-occupied segments contain the consensus binding-site motif WGATAR, only 0.7% of DNA segments with such a motif are occupied. Using a discriminative motif enumeration method, we identified additional motifs predictive of occupancy given the presence of WGATAR. The specific motif variant AGATAA and occurrence of multiple WGATAR motifs are both strong discriminators. Combining motifs to pair a WGATAR motif with a binding site motif for GATA1, EKLF or SP1 improves discriminative power. Epigenetic modifications are also strong determinants, with the factor-bound segments highly enriched for H3K4me1 and depleted of H3K27me3. Combining primary sequence and epigenetic determinants captures 52% of the GATA1-occupied DNA segments and substantially increases the specificity, to one out of seven segments with the required motif combination and epigenetic signals being bound.
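To make the sequence features named in the abstract concrete, the following is a minimal sketch of how a DNA segment could be scanned for the WGATAR consensus on both strands, and how the two discriminators mentioned above (the AGATAA variant and multiple WGATAR occurrences) could be tallied. It is an illustration only, not the discriminative motif enumeration method used in the study. The function names `find_wgatar` and `motif_features` and the feature keys are hypothetical, and the threshold of two or more hits for "multiple" is an assumption.

```python
import re

# IUPAC codes: W = A or T, R = A or G.
# Lookahead patterns are used so that overlapping matches are all reported.
WGATAR_FWD = re.compile(r"(?=([AT]GATA[AG]))")
# The reverse complement of WGATAR is YTATCW (Y = C or T).
WGATAR_REV = re.compile(r"(?=([CT]TATC[AT]))")


def find_wgatar(seq):
    """Return sorted (position, strand, matched 6-mer) tuples for WGATAR hits."""
    seq = seq.upper()
    hits = [(m.start(), "+", m.group(1)) for m in WGATAR_FWD.finditer(seq)]
    hits += [(m.start(), "-", m.group(1)) for m in WGATAR_REV.finditer(seq)]
    return sorted(hits)


def motif_features(seq):
    """Summarize the motif-based discriminators for one DNA segment.

    Assumption: "multiple" means two or more WGATAR hits on either strand.
    """
    hits = find_wgatar(seq)
    # AGATAA on the forward strand appears as TTATCT on the reverse strand.
    has_agataa = any(kmer in ("AGATAA", "TTATCT") for _, _, kmer in hits)
    return {
        "n_wgatar": len(hits),
        "multiple": len(hits) >= 2,
        "has_agataa": has_agataa,
    }
```

For example, `motif_features("CCAGATAAGGTTATCTCC")` reports two hits, one on each strand, both instances of the AGATAA variant. In practice such features would be combined with the H3K4me1 and H3K27me3 signals before any prediction of occupancy is attempted.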