ABSTRACT
BssHII restriction endonuclease cleaves 5'-GCGCGC-3' on double-stranded DNA between the first and second bases to generate a four-base 5' overhang. BssHII restriction endonuclease was purified from the native Bacillus stearothermophilus H3 cells and its N-terminal amino acid sequence was determined. Degenerate PCR primers were used to amplify the first 20 codons of the BssHII restriction endonuclease gene. The BssHII restriction endonuclease gene (bssHIIR) and the cognate BssHII methyltransferase gene (bssHIIM) were cloned in Escherichia coli by amplification of Bacillus stearothermophilus genomic DNA using PCR and inverse PCR. BssHII methyltransferase (M.BssHII) contains all 10 conserved cytosine-5 methyltransferase motifs, but motifs IX and X precede motifs I-VIII. Thus, the conserved motifs of M.BssHII are circularly permuted relative to the motif organizations of other cytosine-5 methyltransferases. M.BssHII and the non-cognate multi-specific ΦBssHII methyltransferase, M.ΦBssHII [Schumann,J. et al. (1995) Gene, 157, 103-104] share 34% identity in amino acid sequences from motifs I-VIII, and 40% identity in motifs IX-X. A conserved arginine is located upstream of a TV dipeptide in the N-terminus of M.BssHII that may be responsible for the recognition of the guanine 5' of the target cytosine. The BssHII restriction endonuclease gene was expressed in E.coli via a T7 expression vector.
DNA methylation in bacteria plays an important role in chromosome repair and replication, and in protecting the bacterial chromosome from the action of its own restriction enzymes (1,2). Methylated DNA can also mark DNA for degradation by the methylation-dependent restriction systems such as McrBC, McrA and Mrr (3-5). In recent years, a large number of restriction-modification (R-M) systems have been cloned by using the phage challenge method (6), the methyltransferase (MTase) selection method (7), or the endo-blue method (8,9). Among the cloned type II R-M systems, two major classes of MTases have been found so far, namely, cytosine-5 methyltransferases (C5-MTases) and those that modify exocyclic nitrogens (N6-adenine MTases and N4-cytosine MTases) (10,11). Ten conserved amino acid (aa) sequence motifs are found among most of the C5-MTases (10-12). These 10 conserved motifs are usually organized in a linear fashion at the aa level. Six of the 10 motifs (I, IV, VI, VIII, IX and X) are highly conserved among all reported C5-MTases. The variable region (target-recognizing domain, TRD) is usually located between motifs VIII and IX.
The organization of conserved motifs in a few C5-MTases differs from that of the majority of C5-MTases. For example, in M.EcoHK311 (5'-YGGCCR-3') motif IX is located in a separate small subunit (β) (13); in M.AquI (5'-CYCGRG-3'), part of the variable region plus motifs IX and X are also found in a small subunit (14). In both cases MTase activity can be reconstituted by combining the large and small subunits. Neither the large nor the small subunit alone shows any MTase activity (13,14). Variations in the C5-MTase motif order have also been observed in M.Alw26I (5'-GTm5CTC-3', complementary strand 5'-GAGm6AC-3') (15, and Bitinaite,J. cited in ref. 16).
The crystal structures of two C5-MTases, M.HhaI and M.HaeIII, have been solved (17,18). In the co-crystal structure of the M.HhaI-DNA complex, the enzyme forms a large domain consisting of motifs I-VIII and most of motif X, and a small domain containing the variable region and motif IX. A DNA binding cleft is formed between the large and small domains: backbone phosphate interactions are made to the target DNA via an active-site loop, motif VIII, and the TRD of M.HhaI (17). A conserved segment, R-X8-9-T-I/L/V, was identified within the TRD of the C5-MTases; it makes a phosphate contact and is responsible for the recognition of the guanine 5' to the target cytosine (19,20).
The recognition sequence of BssHII is 5'-GCGCGC-3' on double-stranded DNA, and cleavage occurs between the first and second bases on both strands (21). While trying to clone the BssHII R-M system from Bacillus stearothermophilus H3, we encountered another unusual cytosine MTase. Here we report the cloning and expression of the BssHII R-M system in Escherichia coli and the unique organization of its conserved C5-MTase motifs.
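The cleavage geometry described above can be illustrated with a minimal Python sketch. It assumes only what the text states (recognition sequence 5'-GCGCGC-3', cleavage after the first base on both strands); the function names and the toy DNA string are hypothetical and are not part of the original work.

```python
SITE = "GCGCGC"   # BssHII recognition sequence (palindromic)
CUT_OFFSET = 1    # cleavage between the first and second bases: G^CGCGC


def reverse_complement(seq):
    """Return the reverse complement of an unambiguous DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def top_strand_fragments(seq):
    """Split the top strand at every BssHII site (overlapping sites included)."""
    fragments, start = [], 0
    pos = seq.find(SITE)
    while pos != -1:
        cut = pos + CUT_OFFSET
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(SITE, pos + 1)
    fragments.append(seq[start:])
    return fragments


def overhang():
    """Single-stranded 5' extension left on each end after cleavage.

    Because the site is palindromic, the bottom strand is cut at the
    mirror-image position, so the bases between the two cuts stay unpaired.
    """
    bottom_cut = len(SITE) - CUT_OFFSET
    return SITE[CUT_OFFSET:bottom_cut]


toy_dna = "AATTGCGCGCTTAA"  # hypothetical sequence with one site
print(top_strand_fragments(toy_dna))
print(overhang())
```

The overhang is derived from the symmetry of the site rather than hard-coded, which makes the four-base length follow directly from the cut position.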
LB broth and LB agar were prepared as previously described (22). Where applicable, media were supplemented with 100 µg/ml ampicillin (Ap), 50 µg/ml kanamycin (Km) or 33 µg/ml tetracycline (Tc). Restriction enzymes, DNA modifying enzymes, vectors and DNA size markers were from New England Biolabs, Inc.
The BssHII-producing strain B.stearothermophilus H3 was obtained from N.Welker (21). Vector pLG339 (ATCC 37131) was from ATCC. The T7 expression vector, pET21at, is a pET21a derivative that contains four E.coli transcription terminators upstream of the T7 promoter (23).
The native BssHII restriction endonuclease was purified to near homogeneity by chromatography through heparin-Sepharose®, DEAE-Sepharose®, Q-Sepharose®, hydroxylapatite and Mono Q® FPLC (Pharmacia) (a detailed purification procedure for BssHII restriction endonuclease will be provided upon request). The purified protein was subjected to electrophoresis and electroblotted as described previously (24,25). The membrane was stained with Coomassie blue R-250, and the estimated 46 kDa protein band was excised and subjected to sequential degradation in an automated sequencer (ABI model 470A).
The procedure of inverse PCR was followed as previously described (26). Recombinant AmpliTaq® polymerase was from Perkin Elmer. Inverse PCR template was prepared by ligation of digested genomic DNA at a low concentration (2 µg/ml) at 16°C overnight. Inverse PCR reactions were performed at 95°C for 1 min, 60°C for 1 min and 72°C for 2 min for 30 cycles with AmpliTaq® DNA polymerase. PCR reactions were run for 15-20 cycles at 95°C for 1 min, 60°C for 1 min and 72°C for 1-2 min with Vent® or Deep Vent™ DNA polymerase.
A forward primer d(ATG GGN GAR AAY CAR GA) and a reverse primer d(AC YAA YTG NGC YTT RTC) were synthesized based on the N-terminal aa sequence of BssHII endonuclease protein and used to amplify the first 19 codons of the bssHIIR gene. Two sets of primers were used in inverse PCR to clone the rest of the bssHIIR gene:
d(GAT ATT GGA CAA GGC CCA ACT GGT) and d(TAT TGA TTC TTG GTT TTC TCC CAT)
d(TCC GCT AAT TAC CTT ACC ATT ATT GGT) and d(CTT TAA CTT CAG CCA ATA GCA TTA TGT)
Two sets of inverse PCR primers were used to clone the bssHIIM gene:
d(TCT TTC GTC GCT CAG GTT CTG AAG TAC) and d(TGA TTA AAA ACA AAG TCG AAA GAT TCG).
d(GGA AAA TGT AGC GAA CTT GAA AGG TGT) and d(ACA AAG AAT GGA GGG TTG ATT TTC TCA).
The entire bssHIIM gene was amplified by PCR using two primers:
d(CAA GGA TCC GGA GGT TAA TTA AAT GAA TGG ATT AGA GAA AAC TTC CAA T) and d(TTC GGA TCC TTA AAC AAG TTT AGG TAA ACC TTT GAA GGC). The PCR product was cleaved with BamHI and inserted into BamHI cut and CIP treated plasmid vector pLG339.
Genomic DNA was prepared from B.stearothermophilus H3 cells as previously described (27). Plasmid DNA was prepared by the standard method (28,29) or by Qiagen columns. Plasmid DNA was sequenced using the AmpliTaq® DNA polymerase sequencing kit and an ABI Model 373A automated DNA sequencer.
The MTase selection method (7) was first used to clone a MTase that could modify BssHII sites in vivo. A Sau3AI partial B.stearothermophilus H3 genomic DNA library was constructed using the pLITMUS28' vector (pUC19 origin with two BssHII sites). After BssHII digestion of the Sau3AI partial library, and re-transformation of surviving plasmids into new RR1 competent cells, individual plasmids were isolated and digested with BssHII to test for resistance. After screening >100 transformants, six BssHII-resistant plasmids were isolated. In addition to resistance to BssHII digestion, these plasmids were also resistant to BsrFI (5'-RCCGGY-3') and HaeII (5'-RGCGCY-3') digestion (data not shown). It was concluded that we had cloned the multi-specific ΦBssHII MTase (M.ΦBssHII). (In agreement with T.A.Trautner and B.Slatko, the multi-specific BssHII MTase is renamed as M.ΦBssHII. The MTase in the BssHII R-M system is named M.BssHII.) The multi-specific ΦbssHIIM gene was cloned previously [(30); D.O.Nwankwo et al., GenBank accession no. U51733]. Cloning and sequencing 3301 bp of surrounding DNA did not reveal any coding frames that match the N-terminal aa sequence of the BssHII restriction endonuclease (D.O.Nwankwo, B.Slatko and G.G.Wilson, personal communication).
The native BssHII restriction endonuclease was purified to near homogeneity by chromatography and the following N-terminal aa sequence was obtained:
(M)GENQESIWANQILDKAQLVS(?)PETHXQN(?)XAD (where X = uncertain aa residue and ? = questionable aa residue). Degenerate primers were designed from the first five aa residues (MGENQ) and residues 15-20 (DKAQLV). Multiple PCR products were obtained in the PCR reaction by using the degenerate primers. The expected PCR product (59 bp) was gel-purified and cloned into HincII digested and CIP-treated pUC19. The insert was sequenced and the DNA sequence was entirely consistent with the aa sequence.
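The relationship between the degenerate primers, the N-terminal peptide and the 59 bp product can be checked with a short Python sketch. It uses the two primer sequences given in the Methods and the standard genetic code; the helper names (`expand`, `encoded_peptides`) are hypothetical, and only the IUPAC codes that occur in these primers (R, Y, N) are handled.

```python
from itertools import product

# IUPAC codes used in the two primers (R = A/G, Y = C/T, N = any base)
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "AG", "Y": "CT", "N": "ACGT"}
COMPLEMENT = str.maketrans("ACGTRYN", "TGCAYRN")

# Standard genetic code, codons enumerated in T, C, A, G order
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def expand(primer):
    """List every unambiguous sequence represented by a degenerate primer."""
    return ["".join(p) for p in product(*(IUPAC[b] for b in primer))]


def encoded_peptides(primer):
    """Translate the complete codons of every expansion; return distinct peptides."""
    peptides = set()
    for seq in expand(primer):
        whole = len(seq) - len(seq) % 3
        peptides.add("".join(CODON_TABLE[seq[i:i + 3]]
                             for i in range(0, whole, 3)))
    return peptides


FORWARD = "ATGGGNGARAAYCARGA"   # d(ATG GGN GAR AAY CAR GA)
REVERSE = "ACYAAYTGNGCYTTRTC"   # d(AC YAA YTG NGC YTT RTC)
reverse_rc = REVERSE.translate(COMPLEMENT)[::-1]  # coding-strand view

print(len(expand(FORWARD)), encoded_peptides(FORWARD))
print(len(expand(REVERSE)), reverse_rc, encoded_peptides(reverse_rc))

# The reverse primer begins (on the coding strand) at codon 15 (Asp),
# so the product spans the bases before codon 15 plus the primer length.
expected_product_bp = 3 * (15 - 1) + len(reverse_rc)
print(expected_product_bp)
```

In this sketch the reverse primer covers codons 15-19 completely and the first two bases of codon 20, which is why the product is 59 bp rather than 60 bp.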
In type II R-M systems, the endonuclease gene and its cognate MTase gene are usually arranged in close proximity to each other (9). The DNA upstream and downstream of the bssHIIR gene was amplified by inverse PCR, cloned and sequenced. The upstream sequence has extensive homology to known 16S rRNA sequence (data not shown).
Three downstream inverse PCR products were cloned and sequenced. A partial ORF was found in the newly derived 895 bp sequence. When the DNA sequence was compared to the known genes in GenBank using Blastx (31), it was found that this partial ORF has aa sequence similarity to the known C5-MTases.
A second inverse PCR was performed to amplify the remainder of the MTase gene. The inverse PCR product from StyI digested and self-ligated DNA was cloned and the insert was sequenced. A stop codon was found in the newly derived 250 bp DNA sequence. The entire bssHIIM gene is 1128 bp, encoding the 375 aa M.BssHII protein with a predicted molecular mass of 42.2 kDa. To test the function of this MTase, the entire bssHIIM gene was amplified by PCR and inserted into vector pLG339 (32). Ten clones with inserts were co-transformed with pLITMUS28 (with one BssHII site), and plasmid DNA was prepared from the co-transformants and digested with BssHII. Four out of 10 tested displayed full resistance to BssHII digestion (data not shown). The others showed partial resistance, probably as the result of incorrect insert orientation or PCR mutations. It was concluded that M.BssHII is the cognate functional MTase.
To test methylation specificity of M.BssHII in vivo, mixed plasmid DNA was isolated from co-transformants of pLG339-M.BssHII and the testing plasmids with BsrFI (Cfr10I isoschizomer), HaeII, MluI or SacII sites and the plasmid DNA was cleaved with the corresponding restriction enzymes. It was found that the testing plasmids were digested by BsrFI, HaeII, MluI or SacII (data not shown), indicating that M.BssHII does not modify BsrFI, HaeII, MluI or SacII sites in vivo.
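As an illustration of how such test plasmids can be checked for the relevant sites, the following Python sketch locates degenerate recognition sequences in a DNA string. The BssHII, BsrFI and HaeII sequences are those given in the text; the MluI (ACGCGT) and SacII (CCGCGG) sequences are not stated in this paper and are supplied here from the standard recognition sequences of those enzymes. The toy plasmid string and the function name are hypothetical.

```python
import re

# Regex character classes for the IUPAC codes used below (R = A/G, Y = C/T)
IUPAC_CLASS = {"A": "A", "C": "C", "G": "G", "T": "T",
               "R": "[AG]", "Y": "[CT]"}

SITES = {
    "BssHII": "GCGCGC",
    "BsrFI": "RCCGGY",
    "HaeII": "RGCGCY",
    "MluI": "ACGCGT",   # assumption: standard site, not given in the text
    "SacII": "CCGCGG",  # assumption: standard site, not given in the text
}


def find_sites(seq, site):
    """Return 0-based start positions of every (possibly overlapping) match."""
    # A lookahead keeps the regex from consuming the match, so overlapping
    # occurrences are reported as well.
    pattern = "(?=" + "".join(IUPAC_CLASS[b] for b in site) + ")"
    return [m.start() for m in re.finditer(pattern, seq)]


# Hypothetical test sequence carrying one copy of each site
toy_plasmid = "TTGCGCGCAAGCCGGCTTAGCGCTAAACGCGTTTCCGCGGAA"

for enzyme, site in SITES.items():
    print(enzyme, site, find_sites(toy_plasmid, site))
```

None of the four other recognition sequences can occur inside a single GCGCGC hexamer, which is consistent with using them as independent probes of methylation specificity.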
The majority of the reported C5-MTases contain 10 conserved aa motifs that are organized in a linear fashion with a variable region flanked by motifs VIII and IX (1,10,33). When M.BssHII was compared with other C5-MTases, it was found that motifs IX and X precede motif I in the aa sequence (Fig. 1). Thus, M.BssHII contains a circular permutation of the C5-MTase motifs. A schematic diagram of the conserved C5-MTase motifs is shown in Figure 2. Variation of conserved motif order has also been observed in N4-cytosine and N6-adenine MTases (16,34).
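What "circular permutation" means for the motif order can be made explicit with a small Python sketch. It encodes only the motif orders stated in the text (I-X for typical C5-MTases; IX, X, then I-VIII for M.BssHII) and ignores the variable region; the function name is hypothetical.

```python
CANONICAL = ["I", "II", "III", "IV", "V", "VI", "VII", "VIII", "IX", "X"]
M_BSSHII = ["IX", "X", "I", "II", "III", "IV", "V", "VI", "VII", "VIII"]


def rotation_offset(reference, observed):
    """Return k such that observed == reference rotated left by k, else None."""
    if len(reference) != len(observed):
        return None
    for k in range(len(reference)):
        if reference[k:] + reference[:k] == observed:
            return k
    return None


# M.BssHII starts at the ninth canonical motif and wraps around.
print(rotation_offset(CANONICAL, M_BSSHII))

# A swap of two neighbouring motifs is a permutation but not a circular one.
swapped = ["II", "I"] + CANONICAL[2:]
print(rotation_offset(CANONICAL, swapped))
```

A circular permutation preserves the relative order of all motifs and only moves the start point, which distinguishes it from an arbitrary reshuffling.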
To detect the aa sequence similarity between M.BssHII and the multi-specific M.ΦBssHII, the aa sequences were compared using the BESTFIT program (Genetics Computer Group, Inc.). The two MTases show 34% identity from motifs I-VIII (Fig. 3A), and 40% identity in the region of motifs IX and X (Fig. 3B).
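For readers unfamiliar with how a percent-identity figure is derived from a pairwise alignment, a minimal Python sketch follows. It is not the BESTFIT implementation: it takes an already aligned pair of gapped strings and counts identical residues over the columns in which neither sequence has a gap, which is one common convention (programs differ in the denominator they use). The two aligned strings are hypothetical.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identical residues over ungapped alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # Keep only the columns where both sequences contribute a residue.
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b)
               if a != gap and b != gap]
    if not columns:
        return 0.0
    identical = sum(1 for a, b in columns if a == b)
    return 100.0 * identical / len(columns)


# Hypothetical aligned fragments (not taken from M.BssHII or M.ΦBssHII)
seq_a = "MKT-LVGA"
seq_b = "MRTALV-A"
print(round(percent_identity(seq_a, seq_b), 1))
```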
The multi-specific M.ΦBssHII contains five repetitive segments (E1-E5) within the variable region, and the specificity determinants for modification of the 5'-GCGCGC-3' site have been localized in E5 (35). Figure 3C shows the aa alignment of the putative TRD of M.BssHII and the TRDs E4 and E5 of M.ΦBssHII. In M.BssHII, residues T83V84 are found upstream of motifs IX and X in the N-terminal region. This dipeptide motif, T-I/L/V/M, is conserved in all reported C5-MTase TRDs (11,12). Upstream of the T83V84 dipeptide, an R residue is located at the -9 position; this is a conserved residue for recognition of the guanine 5' to the modified cytosine among C5-MTases containing 5'-GC-3' as part of the recognition sequence. Two residues, H94 and P95, are located downstream of T83V84 in M.BssHII. The dipeptide HP (H448 and P449) in M.ΦBssHII is an important part of the 5'-GCGCGC-3' recognition determinant; insertion of a V residue between the HP residues selectively abolishes methylation of the 5'-GCGCGC-3' target (35). A second dipeptide, T363V364, is found in the C-terminus of M.BssHII (Fig. 1), but this segment lacks the conserved R residue at the -9 position and the HP dipeptide downstream. Therefore, based on the conserved residues of the variable region of C5-MTases, we propose that the variable region of M.BssHII is located, at least in part, in the N-terminal part of the protein (upstream of motifs IX and X). According to Malone et al., a MTase with a variable region located at the N-terminus followed by motifs X, I, II, III and IV would be in the ζ family (16).
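The reasoning used above to choose between the two TV dipeptides can be expressed as a simple pattern search. The Python sketch below scans a protein sequence for the conserved segment R-X8-9-T-I/L/V/M and then looks for an HP dipeptide downstream. The toy peptide is hypothetical and was constructed only to reproduce the spacing described in the text (R nine positions before T, HP eleven positions after T); it is not the M.BssHII sequence, and the function names are illustrative.

```python
import re

# R, then 8 or 9 arbitrary residues, then T followed by I, L, V or M.
# The lookahead allows overlapping candidates to be reported.
TRD_SEGMENT = re.compile(r"(?=(R.{8,9}T[ILVM]))")


def trd_candidates(protein):
    """Return (r_index, t_index) pairs, 0-based, for every matching segment."""
    hits = []
    for m in TRD_SEGMENT.finditer(protein):
        segment = m.group(1)
        r_index = m.start()
        t_index = r_index + len(segment) - 2  # T is the second-to-last residue
        hits.append((r_index, t_index))
    return hits


def downstream_hp(protein, t_index):
    """Index of the first HP dipeptide after the T residue, or -1 if absent."""
    return protein.find("HP", t_index)


# Hypothetical peptide: R at index 3, TV at 12-13, HP at 23-24
toy_protein = "MSE" + "R" + "AGDKLNES" + "TV" + "GGSAKEYLQ" + "HP" + "LL"

for r_index, t_index in trd_candidates(toy_protein):
    hp_index = downstream_hp(toy_protein, t_index)
    print(r_index - t_index, hp_index - t_index)
```

A TV dipeptide that lacks both the upstream R and the downstream HP, like the second one near the C-terminus of M.BssHII, would produce no hit in such a search.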
We thank Mahul Ganatra, Laurie Mazzola, Hong Ruan, Nancy Blease and Suzanne Krotee for technical assistance; Richard Roberts, Ira Schildkraut and Jay Wayne for critical comments. This work was supported by New England Biolabs, Inc. JP was supported by NIH grant #GM46127.
REFERENCES
