ABSTRACT
We have previously shown that DNA demethylation by chick embryo 5-methylcytosine (5-MeC)-DNA glycosylase needs both protein and RNA. RNA from enzyme purified by SDS-PAGE was isolated and cloned. The clones had inserts ranging from 240 to 670 bp and contained on average one CpG per 14 bases. All six clones tested had different sequences and showed no sequence homology with any other known RNA. RNase-inactivated 5-MeC-DNA glycosylase regained enzyme activity when incubated with recombinant RNA. However, when recombinant RNA was incubated with the DNA substrate alone there was no demethylation activity. Short sequences complementary to the labeled DNA substrate are present in the recombinant RNA. Small synthetic oligoribonucleotides (11 bases long) complementary to the region of methylated CpGs of the hemimethylated double-stranded DNA substrate restore the activity of the RNase-inactivated 5-MeC-DNA glycosylase. The corresponding oligodeoxyribonucleotide and the oligoribonucleotide complementary to the non-methylated strand of the same DNA substrate are inactive when incubated in the complementation test. A minimum of 4 bases complementary to the CpG target sequence is necessary for reactivation of RNase-treated 5-MeC-DNA glycosylase. Complementation with double-stranded oligoribonucleotides does not restore 5-MeC-DNA glycosylase activity. An excess of targeting oligoribonucleotides cannot change the preferential substrate specificity of the enzyme for hemimethylated double-stranded DNA.
Recent published work suggests that the active DNA demethylation reaction may involve RNA alone (1 ) or RNA in combination with protein(s) (2 ). In the first case it is thought that the catalytic function of the RNA is to remove methylcytidine and to transfer it to the RNA. The reaction is apparently resistant to proteinase K and only sensitive to RNases (1 ). In the second case the reaction is carried out by a 5-methylcytosine (5-MeC)-DNA glycosylase which is associated with a RNA (2 ). Both protein and RNA are necessary for the reaction since the activity of the enzyme is abolished by proteinase K or RNase treatment (2 ). The involvement of both protein and RNA in the catalysis of enzymic reactions is well documented. Such an association is exemplified by the ribosome particles and by the aminoacyl-tRNAs with the translation initiation factor(s) (3 ). Group II intron RNAs associate with a DNA endonuclease and cleave one strand of the DNA duplex while the protein associated with the DNA cleaves the other strand in a site-specific manner (4 ). RNase P is also a complex between an RNA and a protein. In this particular case the RNA alone is capable of processing precursor tRNAs in the presence of high concentrations of Mg2+, whereas at low Mg2+ concentrations it requires the presence of the protein (5 ). In the case of telomerase the RNA does not have any catalytic function, but rather serves as a primer for the telomerase reaction (6 ,7 ). More recently it has been shown that methylation of the ribose moiety of rRNA also needs short sequences of antisense snRNAs for targeting of the methylation reaction. In this case 10-20 bases complementary to the rRNA associate with the enzyme protein (8 -10 ). Circumstantial evidence suggests that DNA methyltransferases may also require RNA for de novo DNA methylation (11 ,12 ). Similarly, we show here that the role of cloned RNA prepared from purified 5-MeC-DNA glycosylase is to guide the enzyme to the demethylation site.
Chick embryo 5-MeC-DNA glycosylase was purified as previously published (2,13). The purified enzyme fraction was loaded onto a 10% SDS-polyacrylamide gel. The band showing 5-MeC-DNA glycosylase activity, which migrated at 52 kDa, was cut out and the enzyme extracted from the gel by incubation with 0.15 M NaCl, 20 mM HEPES, 1 mM EDTA, 0.1% SDS, 5 mM DTT for 1 h at room temperature. The sample was filtered through a Millipore Ultrafree-MC filter and precipitated with 4 vol acetone and 100 µg Dextran-T 70 at -80°C. After centrifugation and three washes with acetone (90% v/v in water) the samples were resuspended in 100 µl 20 mM HEPES, 1 mM EDTA, 100 mM NaCl and digested for 30 min at 37°C with 50 µg proteinase K. The RNA was extracted with phenol:chloroform and precipitated with ethanol. An aliquot of the acetone-precipitated enzyme was denatured in guanidinium hydrochloride and slowly renatured by dialysis (13). The renatured enzyme was tested as previously described (13).
Two different methods were used to obtain cDNA from the purified RNA.
The first method involved ligation of a phosphorylated 24mer oligoribonucleotide (R1, 5'-GGUACCCUCGAGGAAUUCGCGACG-3') to the 3'-end of the RNA using T4 RNA ligase (Biolabs). The ligation was performed for 30 min at 37°C in the presence of 40 U RNase inhibitor (Boehringer) and 2.5% PEG 8000. The ligation product was used as template for cDNA synthesis with an oligodeoxyribonucleotide complementary to R1 (P1, 5'-CGCGAATTCCTCGAGGGTACC-3'). Reverse transcription was performed for 15 min at 70°C using 5 U rTth DNA polymerase (Perkin Elmer) in the presence of 40 U RNase inhibitor. The RNA was removed by incubation for 30 min at 60°C in 0.2 M NaOH followed by ethanol precipitation. A phosphorylated oligodeoxyribonucleotide (P2, 5'-GGTACCCTCGAGGAATTCGCGACG-3') was then ligated at the 3'-end of the cDNA using T4 RNA ligase. The ligation product was used as template for PCR amplification and the PCR product cloned using the TA Cloning Kit (Invitrogen). Individual clones were sequenced using the Sequenase kit (Amersham).
A second method involved direct cDNA synthesis with rTth DNA polymerase and random hexamers. The reaction was first incubated at 37°C for 20 min with 5 U rTth DNA polymerase, 50 ng random hexamers and 40 U RNase inhibitor and then transferred to 70°C to resolve any RNA secondary structures. After 5 min 2.5 U rTth polymerase were added and the reaction was incubated at 70°C for a further 15 min. RNA was removed as described above, then the deoxyoligonucleotides P2 and P3 (5'-CGTAGGATCCGCGGCCGCGAG-3') were ligated onto the 3'- and the 5'-ends of the cDNA respectively. The ligation product served as template for PCR and the PCR product was analyzed as above. The cloned cDNA was then sequenced using either the Sequenase kit (Amersham) or an automated sequencer (Perkin Elmer 377 DNA sequencer). Oligonucleotides were synthesized by the phosphoramidite approach using an Applied Biosystems 392 DNA/RNA synthesizer.
Total RNA from 12 day old chicken embryos was isolated by the guanidinium thiocyanate procedure of Chomczynski and Sacchi (14). Poly(A)-containing mRNA was isolated from total RNA by chromatography on oligo(dT)-cellulose type 7 (Pharmacia). Nuclei were purified through sucrose gradients according to Sierra (15). Nuclear RNA was isolated by digesting the nuclei in 10 mM Tris, pH 7.5, 10 mM EDTA, 10 mM NaCl, 0.5% SDS (w/v), 2% (v/v) 2-mercaptoethanol containing 1.5 mg/ml proteinase K. After 3 h incubation at 50°C DNA was sheared by several passages through a 22 gauge needle. RNA was sedimented through a cushion of 5.7 M CsCl (16,17). Dot blot hybridization of purified RNA was carried out on nylon membranes (Micron Separations Inc.). Hybridization was at 37°C for 24 h in 0.25 M sodium phosphate, pH 7.2, 1 mM EDTA and 7% (w/v) SDS. Filters were washed in 0.2× SSC, 0.02% SDS (w/v) at 37°C. The labeled probes (concentration 4 × 10^6 c.p.m./ml) were 5'-GCTTATTTTTCATTTTGGCGACTATGTGTAAAGTCGTC-3' for clone 1 (probe 1) and 5'-GGGTTGTCAGTATCTCGTTCGGTCACCGTGATTGCC-3' for clone 4 (probe 4).
The standard assay for 5-MeC-DNA glycosylase was carried out as previously described (13). The labeled DNA substrate was a double-stranded hemimethylated oligonucleotide with a methylated lower strand (5'-TCACGGGATCAATGTGTTCTTTCAGCTCmCGGTCACGCTGACCAGGAATACC-3'). All reaction products were analyzed on 20% polyacrylamide-urea sequencing gels. The gels were exposed for 30-60 min to X-ray films at -80°C.
Two different versions of the complementation test were used.
First version. The following were consecutively assembled in a total volume of 50 µl: 20 mM EDTA, 20 mM EGTA, 20 mM HEPES, pH 7.5, 50 mM NaCl, 4 mM Pefabloc (Boehringer), 50 µg enzyme grade bovine serum albumin (BSA) and 30 µg post-heparin-Sepharose 5-MeC-DNA glycosylase fraction. Where indicated the enzyme was inactivated by incubating for 20 min at 37°C in the presence of 0.5 µg heat-treated pancreatic RNase A. At the end of the preincubation all tubes were put on ice and each sample treated with RNase A received 1/10 vol 50 mM dithiothreitol (DTT), 1000 U RNasin (100 U/µl), 5-10 µg appropriate recombinant RNA or oligoribonucleotides and 10 ng 32P-labeled, double-stranded hemimethylated oligonucleotide. Before onset of preincubation the positive controls received DTT to a final concentration of 5 mM and 1000 U RNasin. After incubation at 37°C for 45 min all samples were diluted with 150 µl H2O and extracted with phenol:chloroform. The supernatant fractions were then ethanol precipitated, dissolved in 95% formamide-dye, denatured for 5 min at 95°C and separated on a 20% polyacrylamide-urea sequencing gel.
Second version. The following were consecutively added in a total volume of 50 µl: 20 mM HEPES, pH 7.5, 50 mM NaCl, 5 mM CaCl2, 4 mM Pefabloc, 50 µg enzyme grade BSA and 30 µg post-heparin-Sepharose 5-MeC-DNA glycosylase fraction. Where indicated the enzyme was inactivated by preincubation at 37°C for 7 min in the presence of 50 U micrococcal nuclease. At the end of preincubation all of the tubes were chilled on ice and each sample received EDTA and EGTA at final concentrations of 20 and 25 mM respectively. In addition, each tube received 1/10 vol 50 mM DTT, 200 U RNasin, 5-10 µg appropriate oligoribonucleotide and 10 ng 32P-labeled double-stranded hemimethylated substrate. The positive controls received EDTA, EGTA, DTT and RNasin prior to preincubation.
The experiment was continued as for the first version of the complementation test. Recombinant porcine RNase inhibitor was produced in large scale as described by Neumann et al. (18 ).
Benzamidine was purchased from Fluka AG (Buchs/SG, Switzerland). Phenylmethylsulfonylfluoride, Pefabloc, proteinase K and RNase A (DNase-free) were obtained from Boehringer Mannheim. Micrococcal nuclease was from Promega. Polynucleotide kinase and restriction enzymes were purchased from Biofinex (Praroman, Switzerland). [α-32P]dATP and [γ-32P]ATP triethylammonium (3000 Ci/mmol) were purchased from Amersham. Some oligonucleotides, DNA and RNA were synthesized by Microsynth (Balgach, Switzerland).
We have recently shown that RNA isolated from gel purified 5-MeC-DNA glycosylase could restore the activity of the enzyme after it had been inactivated with RNase A (2 ). It was therefore of interest to clone and characterize the RNA present in the purified enzyme. RNA was cloned as described in Materials and Methods. The clones have inserts ranging from 200 to 600 bp. The sequencing of these clones shows that they are all different, with no significant homology. Accession numbers to the EMBL nucleotide sequences library are: clone 1, Y14827; clone 2, Y14828; clone 3, Y14829; clone 4, Y14830; clone 5, Y14831; clone 8, Y14832. A computer analysis of the sequences showed no significant homology with any other known RNA. However, as shown in Figure 1 , all the clones have a high density of CpGs. On average they have one CpG per 14 bases and the average ratio of CpG/GpC is 1.1. The dot blot hybridization shown in Figure 2 indicates that total RNA, nuclear RNA or poly(A)-containing mRNA prepared from 12 day old chicken embryos hybridized with probes 1 and 4 (derived from clones 1 and 4). Northern blot hybridization of probe 4 with total RNA and poly(A)-containing mRNA shows a specific signal just below the 18 S rRNA band (preliminary results, not shown).
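The two sequence statistics quoted above (bases per CpG and the CpG/GpC ratio) can be computed directly from a clone sequence. The following is a minimal sketch only: the function names and the short test sequence are hypothetical illustrations and are not taken from the deposited clones, and the original analysis software is not specified in the text.

```python
def count_dinucleotide(seq, dinuc):
    """Count occurrences of a dinucleotide at every position (overlaps allowed)."""
    seq = seq.upper()
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)


def cpg_stats(seq):
    """Return CpG and GpC counts, bases per CpG and the CpG/GpC ratio.

    Hypothetical helper: "CG" is read 5'->3' as CpG and "GC" as GpC.
    Ratios are None when the denominator would be zero.
    """
    cpg = count_dinucleotide(seq, "CG")
    gpc = count_dinucleotide(seq, "GC")
    return {
        "length": len(seq),
        "cpg": cpg,
        "gpc": gpc,
        "bases_per_cpg": len(seq) / cpg if cpg else None,
        "cpg_gpc_ratio": cpg / gpc if gpc else None,
    }


if __name__ == "__main__":
    # Toy sequence for illustration only; not one of the cloned RNAs.
    print(cpg_stats("ACGCGTGCAA"))
```

Applied to the sequences deposited under the accession numbers listed above, such a calculation would be expected to give values close to those reported here, provided the same counting convention is used.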
Sense or antisense RNA encoded by clone 1 of Figure 1 was produced by using either T7 or Sp6 RNA polymerase respectively. Purified RNA was then tested with the complementation assay as described in Materials and Methods. Figure 3 (upper) shows that treatment of 5-MeC-DNA glycosylase with pancreatic RNase A (lane 2) completely abolished enzyme activity. However, if an excess of RNase inhibitor is added following preincubation of the enzyme with pancreatic RNase together with the recombinant RNA in the sense (lane 4) or antisense (lane 3) orientation, part of the original activity can be restored. This shows that the recombinant RNA from one single clone has similar properties to the total RNA purified from the enzyme (2). The recombinant RNA obtained from clone 4 of Figure 1 gives similar results in the complementation test (data not shown). A look at the sequences of the sense and antisense RNAs of clone 1 of Figure 1 reveals that the sense RNA contains the six bases CTCCGG complementary to the target methylated site of the labeled DNA substrate, whereas the antisense RNA has only the four bases CCGG complementary to the same demethylation site and the CCGG sequence is situated in a double-stranded structure of the RNA. Figure 3 (lower) shows that recombinant RNA, as previously shown for total RNA (2) isolated from the purified enzyme, is totally inactive when incubated with labeled DNA substrate in the absence of protein. Identical results were obtained with recombinant RNA obtained from clone 4 of Figure 1 (data not shown).
It was shown above that short stretches of recombinant RNA were complementary to the target of DNA demethylation. In order to test the possibility that one of the functions of the RNA was targeting of the demethylation reaction, a series of oligoribonucleotides was tested in the complementation assay. The oligoribonucleotide GCUCCGGUCAC is complementary to the non-methylated CpG present in the hemimethylated DNA duplex, whereas GUGACCGGAGC is complementary to the methylated site of the same substrate. The oligoribonucleotide CUCUCUCUCUU is not complementary to the labeled DNA substrate. The results presented in Figure 4 (upper, lane 4) show clearly that only the oligoribonucleotide GUGACCGGAGC, complementary to the methylated site in the DNA duplex, is able to restore enzyme activity in the complementation test. The other oligoribonucleotides had no effect (lanes 3 and 5). Additional controls show that the oligoribonucleotide GUGACCGGAGC in double-stranded form is incapable of restoring 5-MeC-DNA glycosylase activity (data not shown). Since the present results rely heavily on the reliability of the complementation assay, the possibility that the results obtained were due to competition between the added oligoribonucleotides and endogenous RNA (from 5-MeC-DNA glycosylase) in the presence of an incompletely inactivated RNase had to be ruled out. Figure 4 (upper, lane 5) shows clearly that 10 µg CUCUCUCUCCU, which is not complementary to the DNA substrate, did not restore 5-MeC-DNA glycosylase activity, thus ruling out a non-specific effect of the oligoribonucleotides. In addition, labeled oligoribonucleotides incubated in the presence of 50 U micrococcal nuclease, 5 mM CaCl2 and 25 mM EGTA showed very little degradation of labeled RNA (data not shown).
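The strand assignments of the three oligoribonucleotides can be checked against the substrate sequence given in Materials and Methods. The sketch below is an illustration under stated assumptions: the helper names are hypothetical, the "m" marking 5-methylcytosine is simply dropped for the comparison, the upper strand is assumed to be the exact complement of the published lower strand, and only perfect antiparallel Watson-Crick pairing is considered.

```python
# Lower (methylated) strand of the labeled substrate, 5'->3', as given in
# Materials and Methods; the "m" (5-methylcytosine marker) is removed.
LOWER_STRAND = "TCACGGGATCAATGTGTTCTTTCAGCTCmCGGTCACGCTGACCAGGAATACC".replace("m", "")

DNA_COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement_dna(seq):
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    return "".join(DNA_COMPLEMENT[base] for base in reversed(seq))


# Assumption: the non-methylated upper strand is fully complementary.
UPPER_STRAND = reverse_complement_dna(LOWER_STRAND)


def pairs_with(oligo_rna, strand_dna):
    """True if the RNA oligo can pair antiparallel with a stretch of strand_dna."""
    oligo_as_dna = oligo_rna.upper().replace("U", "T")
    return reverse_complement_dna(oligo_as_dna) in strand_dna


if __name__ == "__main__":
    for oligo in ("GUGACCGGAGC", "GCUCCGGUCAC", "CUCUCUCUCUU"):
        print(oligo,
              "methylated strand:", pairs_with(oligo, LOWER_STRAND),
              "non-methylated strand:", pairs_with(oligo, UPPER_STRAND))
```

Under these assumptions GUGACCGGAGC pairs only with the methylated strand, GCUCCGGUCAC only with the non-methylated strand, and the control oligoribonucleotide with neither, in agreement with the assignments used in the text.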
A series of oligoribonucleotides (11 bases long) containing different numbers of bases complementary to the target sequence of demethylation was synthesized. They were (number of complementary bases in parentheses): GUGACCGGAGC (11 bases), ACGACCGGAGU (8 bases), UCUACCGGAUA (6 bases), UCAUCCGGUAU (4 bases) and ACAUCCGUCUA (3 bases). When tested in the complementation assay the oligoribonucleotides containing between 4 and 11 bases complementary to the target sequence of the DNA substrate give a similar level of reactivation of the micrococcal nuclease-inactivated 5-MeC-DNA glycosylase (care has to be taken that the other bases of the 11 base long oligoribonucleotides are not complementary to either DNA strand of the labeled substrate) (Fig. 5, lanes 3-6). However, the presence of only 3 bases complementary to the demethylation site (lane 7) is insufficient to restore enzyme activity. This means that at least 4 bases, one mCpG and two adjacent bases, are required for efficient targeting of the demethylation reaction. The total number of possible combinations of CpG flanked with A, T, G or C is 16. Figure 1 (last column) shows that the cloned RNAs have between 75 and 100% of all of the possible 16 combinations.
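The base counts quoted for the five oligoribonucleotides, and the figure of 16 flanked CpG combinations, can be reproduced with a short calculation. This is a sketch under assumptions: the function names are hypothetical, complementarity is scored as the longest run of positions identical to the fully complementary 11mer GUGACCGGAGC, and the way the last column of Figure 1 was actually computed is not described in the text.

```python
from itertools import product

# Fully complementary 11mer used as the reference targeting sequence.
TARGET = "GUGACCGGAGC"


def longest_matching_run(oligo, target=TARGET):
    """Length of the longest contiguous run of positions identical to target."""
    best = run = 0
    for a, b in zip(oligo, target):
        run = run + 1 if a == b else 0
        best = max(best, run)
    return best


# All tetranucleotides of the form N-C-G-N (a CpG with one flanking base each side).
FLANKED_CPGS = ["".join((a, "CG", b)) for a, b in product("ACGU", repeat=2)]


def flanked_cpg_coverage(rna):
    """Fraction of the 16 flanked CpG tetranucleotides present in a sequence."""
    rna = rna.upper().replace("T", "U")  # accept DNA-style clone sequences
    present = {tetra for tetra in FLANKED_CPGS if tetra in rna}
    return len(present) / len(FLANKED_CPGS)


if __name__ == "__main__":
    for oligo in ("GUGACCGGAGC", "ACGACCGGAGU", "UCUACCGGAUA",
                  "UCAUCCGGUAU", "ACAUCCGUCUA"):
        print(oligo, longest_matching_run(oligo))
    print(len(FLANKED_CPGS))
```

With this scoring the five oligoribonucleotides give runs of 11, 8, 6, 4 and 3 bases, matching the values listed above, and the enumeration confirms that 4 × 4 = 16 flanked CpG combinations exist.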
To address the question of whether or not the oligoribonucleotides could be replaced by the same sequence of oligodeoxyribonucleotides the complementation assay was carried out using the two oligodeoxyribonucleotides GCTCCGGTCAC (complementary to the non-methylated strand) and GTGACCGGAGC (complementary to the methylated strand of the hemimethylated DNA duplex). Figure 6 (lanes 3 and 4) shows clearly that neither of the oligodeoxyribonucleotides was able to complement RNase-inactivated 5-MeC-DNA glycosylase, thus demonstrating the requirement for RNA in restoring enzyme activity.
We have shown previously that 5-MeC-DNA glycosylase cleaved hemimethylated DNA preferentially and that non-methylated or symmetrically methylated DNA sequences were very poor substrates (13). Is it possible to modify the substrate specificity by saturating the enzyme with RNA complementary to the methylated target sequence? Figure 7 (left, lane 2) shows that an excess (10 µg) of the targeting oligoribonucleotide GUGACCGGAGC does not modify activity of the intact enzyme towards hemimethylated DNA. The same experiment carried out with symmetrically methylated or unmethylated DNA (lane 2 of the middle and right hand panels) indicates that an excess of the targeting RNA cannot modify specificity of the enzyme towards symmetrically methylated or unmethylated DNA substrates. Attempts to target a site other than CpG were unsuccessful (data not shown), thus confirming our previously published results (19).
As shown by sequencing, the RNA present in the purified 5-MeC-DNA glycosylase is highly heterogenous and all clones tested so far have different sequences (altogether 14 different clones have been characterized). However, these clones have a common feature: they are all very rich in CpGs (Fig. 1 ). On average they have one CpG per 14 bases. In addition, the ratio CpG/GpC is on average 1.1 for the six clones shown in Figure 1 (bulk DNA has a ratio of 0.2). This is a strong indication that the RNA linked to 5-MeC-DNA glycosylase may be transcribed from CpG islands (20 ). Dot blot hybridization carried out with total RNA, nuclear RNA and poly(A)-containing mRNA from 12 day old chicken embryos shows that RNA associated with purified 5-MeC-DNA glycosylase is indeed present in the mRNA fraction. We are presently testing whether the presence of one of these CpG-rich RNA associated with 5-MeC-DNA glycosylase influences in vivo demethylation of its coding DNA. As we have seen in Figure 5 , we need a minimum of 4 bases, including the CpG, for recognition of the demethylation site (this in the absence of any additional base complementary to the opposite DNA strand). Therefore, due to their heterogeneity, it appears that these RNAs do not represent a universal targeting sequence for DNA demethylation. Clone 4 (Fig. 1 ), which is 618 bases long, contained all 16 of the possible combinations of CpG flanked by A, T, C or G. This RNA by itself should be sufficient to serve as a universal targeting sequence. So why are so many different, unrelated RNA sequences required? These different RNAs tightly linked to 5-MeC-DNA glycosylase possibly represent transcripts from CpG islands which should remain unmethylated. Should one of the CpGs in a CpG island become methylated during DNA replication it could form a hemimethylated substrate. 
Since both DNA methyltransferase and 5-MeC-DNA glycosylase prefer hemimethylated DNA as substrate, a specific mechanism should exist to determine whether or not the hemimethylated site becomes fully methylated or demethylated. Different strategies can be envisaged. For example, additional regulatory protein(s) or RNA(s) could favor one or the other reaction and/or the molar ratio between the two enzymes at a precise time point of replication could decide whether a given site is methylated or demethylated. It is conceivable that the RNA tightly bound to 5-MeC-DNA glycosylase may have a dual function: targeting of demethylation and inhibition of DNA methyltransferase, thus favoring the demethylation reaction. Preliminary experiments have shown that DNA methyltransferase purified from HeLa cells is strongly inhibited by recombinant RNA (Thiry, Frémont and Jost, unpublished results). The use of 4 bases as a recognition sequence could possibly also serve as a very efficient way to target demethylation sites in the non-coding region of a gene and the high density of CpGs in the RNA could increase the probability of reaction. Some observations made in different cell systems show that there is a positive correlation between presence of unmethylated CpG islands and presence of active 5-MeC-DNA glycosylase. Conversely, cells with heavily methylated CpG islands have no trace of 5-MeC-DNA glycosylase activity (Jost, unpublished results). However, the causality between the presence of unmethylated CpG islands and activity of 5-MeC-DNA glycosylase remains to be demonstrated. Keeping the CpG islands free of methylation may require both cis- and trans-acting elements. For example, it is known that Sp1 binding sites flanking CpG islands are essential to keep them methylation free (21,22). Moreover, there is evidence that proteins like NF-κB may be involved in active demethylation of specific genes in B cells (23).
One limitation of RNA as an efficient targeting molecule for in vitro DNA demethylation is its secondary structure. The sequences of the cloned RNAs as shown by computer analysis all showed complex secondary structures. In addition, we know from our in vitro experiments that double-stranded oligoribonucleotides cannot target demethylation under our experimental conditions (data not shown). This could explain why the CCGG present in a double-stranded RNA region (antisense RNA) of clone 1 was inefficient in complementing the nuclease-inactivated 5-MeC-DNA glycosylase (see Fig. 3, upper, lane 3). Therefore, one could speculate that in vivo such secondary structures may be selectively destabilized by proteins.
As we have shown previously, the population of RNA tightly associated with the active enzyme has no trace of catalytic activity when incubated under various conditions in the presence of the DNA substrate but in the absence of the protein moiety (2 ). Similarly, pure recombinant RNA in the sense orientation can only complement 5-MeC-DNA glycosylase that has been inactivated by RNase (Fig. 3 ) but is without catalytic activity when incubated alone with the labeled DNA substrate in the absence of protein (Fig. 3 , lower). However, at this stage one cannot completely rule out that full-size RNA may have some catalytic activity. The nature of the tight association of the RNAs with 5-MeC-DNA glycosylase and how the enzyme performs its reactions are still unknown. These questions will be addressed in the near future, when we have cloned the protein moiety of this enzyme.
We would like to thank Mrs Yan-Chim Jost for her technical assistance and for typing the manuscript. We are also grateful to Drs E.J.Oakeley and J.Paszkowski for their critical reading of this manuscript. We would also like to thank Novartis for financial support of this project.
REFERENCES

