ABSTRACT
A new method is described for cloning DNA sequences occupied by a specific protein on chromatin in vivo. The approach uses UV cross-linking to couple proteins covalently to DNA and the resulting complexes are then purified under stringent conditions. Particular adducts are immunoprecipitated with antibody to the protein of interest. The resulting DNA (iDNA) is amplified by PCR, cloned and characterized. The model system used was RNA polymerase II (Pol II), whose density on particular DNAs under various conditions is well documented. Pol II can exist in several states on DNA. While Pol II can simply be bound to DNA, the bulk of DNA-associated Pol II is transcriptionally engaged in either the transcribing or paused states. Paused Pol IIs that have previously been characterized are found at promoters and have the distinctive property that their transcription in isolated nuclei is stimulated by sarkosyl or high salt. Here we isolate and sequence DNAs that cross-link to Pol II molecules. We identify by nuclear run-on assays those DNAs that have Pol II engaged in transcription. Twenty-one percent of the iDNA clones that have detectable transcriptionally engaged Pol II appear to be paused, in that they display sarkosyl-stimulated transcription in a nuclear run-on transcription assay. At least some of these map to the 5'-ends of genes. These results suggest that transcriptional pausing of Pol II is a general phenomenon in vivo.
Chromosomal DNA is covered with thousands of proteins that execute a broad spectrum of nuclear functions. Some of these proteins bind DNA with strong sequence specificity, others bind DNA non-specifically, still others bind DNA not on their own but by cooperating with other proteins of a complex and, finally, others bind and translocate along the DNA. Identifying where and at what level a protein interacts with DNA is critical in assessing its function. While binding assays in vitro permit examination of protein-DNA contacts in biochemical detail, methods that quantitatively examine specific protein-DNA interactions in vivo are critical in assessing their biological significance.
Well-documented strategies exist for analytically measuring the location and density of a protein on specific segments of DNA in vivo (1-8). Typically a protein can be cross-linked to DNA in cells and antibody to the protein of interest used to immunoprecipitate protein-DNA adducts. The abundance of co-precipitated specific DNAs is quantified by hybridization or PCR methods to provide a measure of the relative density of the protein on particular DNA fragments. Early versions of these protocols used UV light as the cross-linking agent (2,3) and more recent versions have made use of formaldehyde as the cross-linker (4,6,8). Each approach has its advantages, e.g. flashes of UV light rapidly penetrate cells and allow kinetic analysis of proteins that are in direct contact with DNA (9), while formaldehyde is a more general cross-linker and works with a broad array of chromosomal proteins (4,5). Although both approaches have been used analytically to measure a protein on specific regions of DNA, the DNA sequences examined are limited to suspected target DNAs. In principle, the cross-linked DNAs that co-purify with a particular protein can be used to identify new DNA targets of the protein in vivo. Generating cross-links of protein to DNA in vivo ensures that proteins are binding true DNA targets and not artifacts of rearrangements that can occur on chromosomes during cell lysis or further handling of chromatin.
Here we describe an approach for cloning DNA sequences with which a specific protein associates in vivo. We developed this approach with RNA polymerase II (Pol II), which we have shown previously can cross-link efficiently to DNA when cells are irradiated with UV light (10). The distribution of Pol II on many genes is well defined by quantitative studies of transcription (11), providing a strong test of whether the new approach is working in a predictable manner. Also, non-transcribing Pol II has been found on the promoters of some genes, which can be detected by nuclear run-on assays done in the presence of sarkosyl or high salt (12,13). These Pol IIs are also cross-linked to DNA by UV light (2) and the DNAs associated with polymerase should therefore be clonable by this new approach.
One of the best characterized paused Pol II complexes is at the start of the Drosophila hsp70 gene. In the absence of heat shock a transcriptionally engaged Pol II molecule pauses 21-35 bp from the transcription start site of hsp70 (14,15). A survey of Drosophila genes showed evidence of paused Pol II on several but not all genes (13). Pol II pausing has also been reported in other eukaryotic organisms. Examples include the well-characterized pauses in human c-myc (16,17) and human hsp70 (18), as well as other genes (19). The pause site in each of these cases occurs within 20-50 bases of the transcriptional start site. The accumulation of a promoter paused Pol II (in the case of hsp70 genes the level is one Pol II per gene) indicates that simple Pol II recruitment or initiation of transcription is not always rate limiting and that pausing may represent an important regulated step in expression of a variety of genes in diverse organisms.
To identify sites of Pol II pausing in an unbiased manner we used this UV cross-linking method to select and clone a collection of DNA sequences binding Pol II molecules in vivo. Transcribed genes are enriched in this collection of cloned DNAs. Analysis of these clones by nuclear run-on assay shows that many contain detectable, transcriptionally engaged Pol II. Moreover, the results indicate that a significant fraction of these engaged RNA polymerases are paused.
The probes used in Table 1 are the [beta]1-tubulin 1.4 and 2.7 kb AvaII fragments from pTu56-94 (20); the Hsp83 3.3 kb BamHI-SalI fragment from aDm4.46 (21); the ribosome 1.1 kb HindIII fragment from DmRyr22 (22). The control DNAs probed in nuclear run-on assays are histone H4 [a 896 bp StuI-AvaI fragment from the 4.8 kb histone repeat unit cloned into the SmaI site of pUC19; (23,23)]; hsp70 [p70X2.6 digested with AvaI and PstI (12) or with BanI and ScaI (23)]; the HSC5-ph6-8 genomic clone (24).
Polyclonal goat anti-Drosophila melanogaster RNA Pol II large subunit was a kind gift from A.Greenleaf (25).
We followed the protocol previously described (26), with the following modifications. Drosophila Schneider line 2 (S2) cells were grown in spinner flasks containing 200 ml medium until they reached a density of 10^7/ml. Aliquots of 2 × 10^9 cells in mid-logarithmic growth were placed in a pre-chilled Pyrex lasagna dish inside an ice bath. These cells were irradiated for 10 min using an inverted shortwave transilluminator at 256 nm (UV Products) with the solarizer removed. These cells were aliquoted into four 50 ml polypropylene centrifuge tubes on ice and spun for 5 min in a clinical centrifuge. Cell lysis, nuclear preparations and cesium chloride gradients were performed as described (26). The gradient was fractionated using an 18 gauge needle into 1 ml aliquots. The aliquots were checked for maximum genomic DNA concentration by agarose gel electrophoresis and these genomic DNA enriched fractions were pooled. The cesium chloride fraction was dialysed against four changes of 2 l buffer (100 mM Tris-HCl, pH 8, 2 mM EDTA and 0.2% sarkosyl).
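As a quick consistency check on the cell numbers quoted above, the culture volume and density can be multiplied out. This is a minimal sketch that reads the stated density as 10^7 cells/ml; the variable names are illustrative and not part of the original protocol.

```python
# Values as stated in the protocol (density read as 10^7 cells/ml).
culture_volume_ml = 200
cells_per_ml = 1e7

# Total cells available from one spinner flask.
total_cells = culture_volume_ml * cells_per_ml

# The irradiated cells are later split across four 50 ml tubes.
tubes = 4
tube_volume_ml = 50

print(f"{total_cells:.1e} cells per {culture_volume_ml} ml culture")
print(f"tube capacity: {tubes * tube_volume_ml} ml")
```

One full flask therefore corresponds to a single 2 × 10^9 cell aliquot, and the four tubes together hold the full 200 ml.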
The UV cross-linked DNA was digested with HaeIII in a 2 ml reaction mix using 1× React 2 buffer (Gibco BRL), 0.2% sarkosyl, 0.1% Triton X-100 and 20 U HaeIII and incubated overnight at 37°C. CATCH linkers (27) were ligated to the blunt-ended genomic DNA fragments using the following reaction conditions: 2 ml HaeIII digestion mixture and the following (final concentrations), 1× ligation buffer (Gibco BRL), 1 mM ATP, 5 µg/ml CATCH A and CATCH B linkers, 0.1% Sarkosyl in a final volume of 3 ml containing 100 U T4 DNA ligase (Gibco BRL). These ligation reactions were incubated overnight at 14°C. The CATCH linker concatemers were digested with XhoI, which cuts sites at the 3'-end of the CATCH linker sequence. The 3 ml of the above ligation reaction was brought to 4 ml with the following (final concentrations), 1× React 2 buffer (Gibco BRL), 0.2% sarkosyl, 0.1% Triton and 1000 U XhoI. The XhoI digest was incubated at 37°C for 8 h and a further 1000 U XhoI were added at 4 h.
Goat anti-Pol II affinity purified antibody (5 µl) was added to the experimental sample (+ immunoprecipitation) and incubated overnight at 4°C. A control sample was prepared by adding no antibody (- immunoprecipitation), which was taken through all the subsequent stages in parallel with the + immunoprecipitation experiment. A slurry of 400 µl protein G-Sepharose (10% in ethanol) was washed with buffer 1 (0.2% sarkosyl, 100 mM Tris-HCl, pH 8, 2 mM EDTA) and resuspended in 500 µl buffer 1. An aliquot of 200 µl of this washed protein G-Sepharose suspension was added to the immunoprecipitation experiments. These reactions were incubated on a rotating platform for at least 4 h at 4°C. The immunoprecipitated Pol II-DNA adducts were spun for 1 min in a clinical centrifuge and the supernatant preserved, which we used later as a control called `total' genomic DNA (this is effectively total DNA, since only a small fraction of the DNA cross-links to Pol II). The protein G-Sepharose precipitate was transferred to 1.6 ml microcentrifuge tubes. The first four washes were in buffer 1 and the next eight washes in buffer 2 (100 mM Tris, pH 9, 0.5 M LiCl, 1% NP40, 1% sodium deoxycholate).
The Pol II-DNA adducts were eluted from the protein G-Sepharose beads by resuspension in 400 µl 0.5% SDS, 50 mM Tris-HCl, pH 8.5, and shaken for 30 min at room temperature. The protein G-Sepharose was separated by centrifugation and elution was repeated three times. The Pol II-DNA adducts were ethanol precipitated using 1/10 vol 3 M NaOAc and 2.5 vol 100% ethanol and the visible pellets were washed with 70% ethanol. The ethanol precipitates were resuspended in 400 µl 1× proteinase K buffer (0.5% SDS, 10 mM Tris-HCl, pH 8.0, 10 mM EDTA), 4 µl 10 mg/ml proteinase K were added and the reaction mixture digested overnight at 65°C. The proteinase K was removed by phenol/ether extraction and the DNA fragments were ethanol precipitated and resuspended in 20 µl ddH2O. Samples of 5 µl of this product were used in subsequent PCR reactions.
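The final proteinase K concentration is not stated explicitly but follows from the volumes given above. The sketch below works it out, along with the fraction of the recovered DNA used per PCR; the helper name `final_concentration` is hypothetical and simple volume additivity is assumed.

```python
def final_concentration(stock_conc, stock_vol, base_vol):
    """Concentration after adding stock_vol of stock to base_vol of diluent.

    Assumes volumes are additive; units of the result match stock_conc.
    """
    return stock_conc * stock_vol / (base_vol + stock_vol)

# 4 µl of 10 mg/ml proteinase K added to 400 µl of 1x proteinase K buffer.
prot_k_mg_per_ml = final_concentration(10.0, 4.0, 400.0)
print(f"proteinase K: ~{prot_k_mg_per_ml * 1000:.0f} µg/ml")

# 5 µl taken from the 20 µl final resuspension for each PCR.
fraction_per_pcr = 5 / 20
print(f"fraction of recovered DNA per PCR: {fraction_per_pcr:.0%}")
```

Under these assumptions the digestion runs at roughly 0.1 mg/ml proteinase K, and each PCR consumes a quarter of the immunoselected DNA.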
We followed the amplification protocol for CATCH primers described by Kinzler and Vogelstein for their whole genomic PCR procedure (27) with the following modifications. We found increased yields of PCR products when we used only one CATCH primer in the PCR reaction. Optimal yields were obtained by a modified `hot start' procedure. The PCR reactions were assembled at 70°C and started by mixing in the Taq DNA polymerase and overlaying the mineral oil. An Erikcomp Thermocycler was programed as follows: 95°C 0.5 min, 50°C 2 min, 70°C 1.5 min repeated 25 times and 5% of this sample was reamplified. This amplified DNA was purified by PEG precipitation and two additional cycles of PCR were performed (93°C 2 min, 55°C 10 min, 70°C 10 min with a PEG purification of DNA between cycles) to fill in DNA ends more completely. The resulting amplified DNA was digested with EcoRI prior to the cloning step.
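The two cycling programs described above can be written down as a small data structure, which makes the step temperatures and hold times easy to compare. This is an illustrative sketch only: the class names `Step` and `Stage` are hypothetical, ramp times and the PEG purification between fill-in cycles are ignored, and the reamplification of the 5% sample is not modeled because its program is not spelled out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Step:
    temp_c: float   # block temperature in degrees Celsius
    minutes: float  # hold time at that temperature

@dataclass(frozen=True)
class Stage:
    name: str
    steps: tuple    # ordered Step objects making up one cycle
    cycles: int

    def hold_minutes(self):
        """Total programmed hold time, excluding ramping between steps."""
        return self.cycles * sum(step.minutes for step in self.steps)

# Main amplification: 95°C 0.5 min, 50°C 2 min, 70°C 1.5 min, 25 cycles.
amplification = Stage(
    "amplification",
    (Step(95, 0.5), Step(50, 2), Step(70, 1.5)),
    cycles=25,
)

# End fill-in: 93°C 2 min, 55°C 10 min, 70°C 10 min, 2 cycles.
fill_in = Stage(
    "fill-in",
    (Step(93, 2), Step(55, 10), Step(70, 10)),
    cycles=2,
)

for stage in (amplification, fill_in):
    print(f"{stage.name}: {stage.hold_minutes():.0f} min of programmed holds")
```

The long annealing and extension holds in the fill-in cycles are consistent with their stated purpose of completing DNA ends rather than increasing yield.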
Three libraries were constructed: (i) the iDNA library, which was Pol II antibody selected; (ii) the control minus antibody library, from a mock selection with no antibody; (iii) a control `total' genomic library from the supernatant of the Pol II immunoprecipitation. The libraries were constructed using [lambda]ZapII vector according to the manufacturer's (Stratagene) instructions. pBluescript containing the library inserts was excised following the Stratagene protocol. Lambda filter lifts and hybridization, plasmid DNA preparation, dideoxy DNA sequencing, restriction endonuclease digestion, DNA agarose electrophoresis and Southern blotting were all performed using standard methods.
Approximately 4 µg of large scale plasmid DNA preparations of iDNA clones were digested with EcoRI, which cuts at sites within the CATCH linkers. These digested DNA samples were electrophoresed on duplicate 1% agarose gels and transferred to nylon membranes for Southern blots. Nuclear run-on transcription assays were carried out using Schneider line 2 cells (under non-heat shock conditions) according to our standard protocol (12). The reactions were labeled with [[alpha]-32P]UTP; the other nucleotides were cold. The nuclear run-on reaction time was 5 min for Figure 2 and 2 min for Figure 3A. Southern blots were hybridized with the nuclear run-on probes for 24 h in roller bottles containing 6× SSC, 5% SDS, 50% formamide, 10% dextran sulfate and 100 µg/ml sonicated salmon sperm DNA at 42°C. The filters were washed three times in 2× SSC, 0.1% SDS at 65°C, twice in 0.2× SSC, 0.1% SDS at 65°C and once in 0.1× SSC, 0.1% SDS at 65°C. Autoradiography was carried out using both X-ray film and a Molecular Dynamics PhosphorImager.
We developed a scheme to clone DNA sequences that are associated with a specific protein in vivo, as outlined in Figure 1A. This approach is an integration of our protein-DNA cross-linking methods (26) and other techniques employed to clone the target sequences that bind transcription factors in vitro (27). The UV light cross-linked DNA-protein adducts can be subjected to harsh purification procedures, including cell and nuclear lysis with detergents, CsCl ultracentrifugation, and immunoprecipitation. We have used a goat anti-Pol II polyclonal antibody (25) to immunoprecipitate UV-induced Pol II-DNA adducts (26) and have adapted the whole genome PCR procedure (26) to provide sufficient mass of immunoprecipitated DNA for cloning into a phagemid vector. We have termed the resulting library a Pol II immunoselected genomic DNA or iDNA library. Two control libraries were made in parallel with the Pol II iDNA library. The first, a `total' genomic library, was constructed by removing 0.5% of the total genomic DNA after the first immunoprecipitation step and subsequent amplification and cloning. The second, a `minus antibody' library, was constructed in an identical fashion to the iDNA library except that the goat anti-Pol II antibody was omitted.
In previous studies we and others have demonstrated that in vivo cross-linking of proteins to DNA can be used to analytically measure the relative concentration of proteins on specific chromosomal DNA segments (1-8). The specificity of these methods is provided by stringent purification and washing of protein-DNA adducts and by immunoprecipitation with antibodies raised against particular chromosomal proteins. While the immunoprecipitation can be done without prior cross-linking (30,31), the cross-linking step with intact cells is particularly critical to ensure that the protein-DNA complexes are a result of true in vivo interactions and not an artifact of rearrangement during cell and nuclear lysis and subsequent purification. Here we present an extension of these methods that allows preparation of libraries of DNA sequences that are enriched for sequences that interact directly with a specific protein in vivo. This provides a means of isolating new DNA targets of nuclear proteins.
The methodology we have developed for making iDNA libraries for Pol II should be applicable to other DNA binding proteins that cross-link to DNA in vivo. Indeed, UV light treatment of cells and embryos has been shown to cross-link several proteins to specific DNAs. These proteins include RNA Pol II (10), topoisomerase I (3), B52 (32), GAGA factor (33), eve (8), ftz (8) and zeste (8). UV irradiation creates stable cross-links between protein and DNA at an efficiency of ~1 cross-link/60 kb (10). While UV light is known to cross-link a variety of proteins to DNA, the approach could in principle be used with a variety of cross-linking reagents. Of particular interest is the use of formaldehyde, which has proven very effective at efficiently coupling a variety of proteins to DNA in vivo (4,6,8).
One concern in the cloning of DNA from cross-linked protein-DNA adducts is the damage to the DNA caused by the cross-linking agent, both damage to the DNA in general and, in particular, at the site of the protein cross-link. This can be minimized in several ways. First, the dosage of UV light (or chemical cross-linking agent) should be kept as low as possible. Second, the target DNA segment used for cloning should be as short as possible (here we use HaeIII fragments that are on average ~300 bp). Third, the protein-DNA cross-link should be to one strand of the DNA duplex, so that the complementary strand is available for amplification by the approach described here. Finally, while UV light cross-links are not readily reversible, some cross-linking agents, like formaldehyde, produce reversible cross-links.
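Two of the figures quoted in this Discussion, the ~300 bp average HaeIII fragment and the efficiency of ~1 cross-link/60 kb, can be put side by side with a back-of-the-envelope calculation. The sketch below is not from the original work. It assumes a random sequence with a tunable GC fraction (0.5 by default, not a measured Drosophila value), uses the known HaeIII recognition site GGCC, and treats cross-links as independent Poisson events along the DNA; the function names are hypothetical.

```python
import math

def expected_site_spacing_bp(site, gc_fraction=0.5):
    """Mean distance between occurrences of `site` in random sequence.

    Assumes independent bases with G = C = gc_fraction / 2 and
    A = T = (1 - gc_fraction) / 2.
    """
    probability = 1.0
    for base in site.upper():
        if base in "GC":
            probability *= gc_fraction / 2
        else:
            probability *= (1 - gc_fraction) / 2
    return 1 / probability

def fraction_crosslinked(fragment_bp, kb_per_crosslink):
    """Poisson estimate of the fraction of fragments with >= 1 cross-link."""
    mean_per_fragment = fragment_bp / (kb_per_crosslink * 1000)
    return 1 - math.exp(-mean_per_fragment)

spacing = expected_site_spacing_bp("GGCC")
fraction = fraction_crosslinked(fragment_bp=300, kb_per_crosslink=60)

print(f"expected HaeIII spacing in random sequence: {spacing:.0f} bp")
print(f"fraction of 300 bp fragments cross-linked: {fraction:.2%}")
```

Under these simplifying assumptions, a 4 bp cutter gives fragments of the same order as the ~300 bp quoted, and only about one fragment in 200 would carry a cross-link of the kind the ~1/60 kb figure refers to. This is in line with the statement that only a small fraction of the DNA cross-links, and with the need for stringent immunoselection.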
The Drosophila genome is 1/20 the size of a mammalian genome. To apply this approach to more complex genomes may require additional steps to reduce background. This could include more washes of immunoprecipitated adducts or resuspension and re-immunoprecipitation with the same or a second antibody to a different epitope of the protein of interest.
We appreciate the limits to the conclusions about promoter pausing that can be drawn from our analysis of the Pol II iDNA library. First, the run-on assay used to analyze iDNA clones is of limited sensitivity and only allows examination of abundantly transcribed genes. Using the hsp70 gene as a standard [it is present at six copies/genome (34) and has one Pol II/gene in non-heat shocked cells (23)] we estimate we can only reproducibly detect >0.1 Pol II molecules on a single copy DNA sequence. Therefore, the many genes expressed at lower levels are undetectable and may in part account for the failure to detect signals in nuclear run-on assays in 83% of the cloned inserts in the iDNA library. Alternatively, some of these negative clones could represent chromatin-bound but non-engaged Pol II or background in the purification and cloning. Second, nuclear run-on assays are only one way of examining pausing. Several techniques [cross-linking (2), permanganate mapping (35) and transcript analysis (14,15)] in addition to run-on assays have been previously used to rigorously demonstrate and characterize promoter paused Pol II on several genes. While these other techniques do not lend themselves to large scale screening of the iDNA library, sarkosyl-stimulated run-on does and serves as a good first indicator of this class of Pol II. Third, there is no guarantee that the sarkosyl-stimulated Pol II complexes (suggestive of paused Pol II) are all situated in a promoter-proximal position. Indeed, pausing at other positions in transcription units will also be of interest. One of the iDNA cloned segments did match a known single copy gene, HSC5, and fine structure mapping located the Pol II pause site to a similar position relative to the promoter as the pause site for the hsp70 gene. Interestingly, sequence comparison of the hsp70 and HSC5 promoters reveals only the TATA box in common, i.e. there are no classical heat shock elements nor GAGA factor binding sites (which are critical for pausing on hsp70) in the HSC5 promoter (24). Although paused Pol II appears to be relatively common in Drosophila and mammals, the rules that specify pausing are not simply identified by sequence comparisons.
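The sensitivity estimate above can be restated relative to the hsp70 standard. The following sketch is a rough illustration rather than the calculation actually performed: it assumes the run-on signal scales linearly with the number of engaged Pol II molecules per genome hybridizing to a probe, and it ignores probe length and nucleotide composition. All variable names are illustrative.

```python
# Figures quoted in the text for the hsp70 standard.
hsp70_copies_per_genome = 6
pol2_per_hsp70_gene = 1

# Stated detection limit on a single copy sequence.
detection_limit_pol2 = 0.1

# Pol II equivalents contributing to the hsp70 standard signal.
standard_pol2 = hsp70_copies_per_genome * pol2_per_hsp70_gene

# Detection limit as a fraction of the standard signal.
relative_limit = detection_limit_pol2 / standard_pol2

print(f"hsp70 standard: {standard_pol2} Pol II per genome")
print(f"limit: ~1/{1 / relative_limit:.0f} of the hsp70 standard signal")
```

On this reading, a single copy clone must give at least about one sixtieth of the hsp70 signal to be scored, which helps explain why clones from weakly transcribed regions go undetected.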
In this paper an iDNA library was generated to map the distribution of RNA Pol II on Drosophila chromatin. The use of Pol II provided a predictable test of the method. The constitutively active Hsp83 and [beta]1-tubulin genes were found to be greatly enriched in our library, whereas rDNA, which is transcribed by Pol I, was depleted. In addition to this test of the method, we have analyzed a random sample of the resulting clones by nuclear run-on assay to determine the relative distribution of transcribing and paused Pol II. While pausing was discovered to occur on over half of the small collection of previously cloned abundantly transcribed genes that were tested (13), we did not have an unbiased estimate of the frequency of paused Pol II. Our result that 21% of the detectable transcribing Pol II complexes are paused indicates that a high fraction of chromosome-associated Pol II is in this configuration. This is surprising, but not inconsistent with other results. Greenleaf and colleagues examined the distribution of hyper-phosphorylated and hypo-phosphorylated Pol II on Drosophila polytene chromosomes (36). While large developmental puffs contain transcribing hyper-phosphorylated Pol II, many interbands had predominantly hypo-phosphorylated Pol II. This hypo-phosphorylated Pol II has been demonstrated to be the form of Pol II at the promoter paused sites of a variety of genes (37). Therefore, these immunofluorescence studies would be consistent with the broad distribution of pausing we have seen in this iDNA library screen.
We thank Janis Werner and Jill Sangree for excellent technical assistance, Karen Palter for the HSC5 genomic clone and Arno Greenleaf for Pol II antibody. This work was supported by an NIH grant (GM25232) and an NIH postdoctoral fellowship to A.L.
Nucleic Acids Research
Introduction
Materials And Methods
Cloned DNAs
Antibody
UV cross-linking and protein-DNA adduct purification
HaeIII digestion, CATCH linker ligation and XhoI digestion of concatemers
Immunoprecipitation
PCR
Construction of libraries
Nuclear run-on transcription assays
Results
Discussion
Acknowledgements
References
Last modification: 6 Feb 1998
Copyright© Oxford University Press, 1998.