
© 1995 Oxford University Press 3982-3989


Protein footprinting approach to mapping DNA binding sites of two archaeal homing enzymes: evidence for a two-domain protein structure

Jens Lykke-Andersen , Roger A. Garrett and Jørgen Kjems 1, *

Institute of Molecular Biology, Copenhagen University, Sølvgade 83H, DK-1307 Copenhagen K, Denmark and 1 Department of Molecular and Structural Biology, Aarhus University, C. F. Møllers Allé, Building 130, DK-8000 Århus C, Denmark

Received July 6, 1996; Revised and Accepted September 5, 1996

ABSTRACT

The archaeal intron-encoded homing enzymes I- Por I and I- Dmo I belong to a family of endonucleases that contain two copies of a characteristic LAGLIDADG motif. These endonucleases cleave their intron⁻ or intein⁻ alleles site-specifically, and thereby facilitate homing of the introns or inteins which encode them. The protein structure and the mechanism of DNA recognition of these homing enzymes are largely unknown. Therefore, we examined these properties of I- Por I and I- Dmo I by protein footprinting. Both proteins were susceptible to proteolytic cleavage within regions that are equidistant from each of the two LAGLIDADG motifs. When the proteins were complexed with their DNA substrates, a characteristic subset of the exposed sites, located in regions immediately after and 40-60 amino acids after each of the LAGLIDADG motifs, was protected. Our data suggest that the enzymes are structured into two tandemly repeated domains, each containing both the LAGLIDADG motif and two putative DNA binding regions. The latter contains a potentially novel DNA binding motif conserved in archaeal homing enzymes. The results are consistent with a model where the LAGLIDADG endonucleases bind to their non-palindromic substrates as monomeric enzymes, with each of the two domains recognizing one half of the DNA substrate.

INTRODUCTION

Homing enzymes are site-specific DNases, encoded by introns or inteins. They specifically cleave intron⁻ or intein⁻ alleles of their genes and, thereby, facilitate homing of the introns or inteins that encode them (reviewed in refs 1 , 2 ). Most homing enzymes belong to one of four protein families named according to the presence of one or two, often highly degenerate, LAGLIDADG motifs, a GIY-YIG motif ( 2 ), a His-Cys motif ( 3 ) or an H-N-H motif ( 4 , 5 ). The LAGLIDADG-like motif was first recognized in yeast maturases ( 6 ), and later found in endonucleases encoded by group I introns, archaeal introns, inteins and by separate open reading frames in yeast (reviewed in ref. 2 ). Four archaeal introns have been found which encode putative LAGLIDADG-type proteins, one in Desulfurococcus mobilis 23S rDNA ( 7 , 8 ), two in Pyrobaculum organotrophum 23S rDNA ( 9 ) and one in Pyrobaculum aerophilum 16S rDNA ( 10 ).

The archaeal introns all generate 'bulge-helix-bulge' motifs at the exon-intron junctions of their RNA transcripts, which are cut by an archaeal-specific endonuclease ( 8 , 11 - 14 ). The larger RNA introns (600-700 nucleotides), which encode the LAGLIDADG-type proteins, circularize and generate stable structures ( 8 , 9 , 14 ). The rDNA intron of D.mobilis encodes I- Dmo I, which has been shown to cleave intron⁻ DNA in vitro ( 15 ). Furthermore, it has been demonstrated that the intron can move inter-cellularly, and home, in a Sulfolobus acidocaldarius culture, conferring a selective advantage on the intron⁺ cells ( 16 ). I- Por I from intron 1 of the rDNA of P.organotrophum is also a homing-type enzyme capable of cleaving intron⁻ DNA in vitro , whereas the LAGLIDADG-type protein encoded by intron 2 of the rDNA of the same organism exhibits no DNA cleavage activity in vitro ( 17 ). It remains to be investigated whether the LAGLIDADG-type protein encoded by the P.aerophilum 16S rDNA intron is an endonuclease. Three archaeal inteins which contain the LAGLIDADG-like motif have been described ( 18 , 19 ). At least one of these, PI- Tli I, shows site-specific endonuclease activity ( 18 ).

LAGLIDADG-type homing enzymes cleave intron⁻/intein⁻ alleles near the site of intron/intein insertion, generating 3'-overhangs of 4 nucleotides with 5'-phosphates. They are highly specific for their cleavage sites, recognizing generally non-palindromic DNA sequences of ~20 bp ( 2 ). However, mutational analyses have shown that limited sequence redundancy is tolerated ( 17 , 20 - 24 ). The functional domain structures of these proteins are less well understood. Mutagenesis of the LAGLIDADG motifs has shown that these are involved in catalysis but not in DNA binding ( 25 , 26 ). No other conserved sequences have been recognized in these endonucleases, and it is still not known which parts of the proteins participate in the binding of the DNA substrate. Therefore, we have employed a proteolytic protein footprinting approach on I- Por I and I- Dmo I to identify regions involved in DNA binding. This method has previously been used successfully to map amino acid sequences involved in protein-RNA and protein-protein interactions ( 27 , 28 ). The approach involves limited proteolysis of the endonucleases, specifically labeled at the N- or C-terminus, in the absence or presence of their DNA ligand. The data indicate that the homing enzymes consist of two structurally similar, and tandemly repeated, domains, each containing a LAGLIDADG motif and two potential DNA binding regions. Sequence alignment with other archaeal LAGLIDADG-type proteins revealed a conserved sequence, which may constitute a novel DNA binding motif.


Figure 1 . Protein constructs used in this study. Schematic representation of the GST fusion proteins HTG- Por I, GTH- Por I, HTG- Dmo I and GTH- Dmo I, expressed from vectors pET-HTG- Por I, pGEX-GTH- Por I, pET-HTG- Dmo I and pGEX-GTH- Dmo I, respectively (see Materials and Methods). The endoproteinase thrombin recognition sites (LVPRGS) are indicated by crosshatched boxes and the arrows denote sites of thrombin cleavage. Sites for phosphorylation with heart muscle kinase (RRASV) are indicated with a black box and a 32 P. The GST tags of the fusion proteins, which were removed in the footprinting analyses, are indicated with dashed lines.

MATERIALS AND METHODS

Construction of vectors for protein footprinting

In order to generate fusion protein expression vectors for the two LAGLIDADG-type proteins, the polymerase chain reaction (PCR) was performed on a plasmid containing intron 1 of P.organotrophum 23S rDNA ( 14 ), using oligodeoxynucleotide primers 5'-GAGGATCCATGGATATATTCCAGTATG and 5'-GAGAATTCCGAGGTCAAGATAATGGC, and on a plasmid containing the D.mobilis rDNA intron ( 8 ), using oligodeoxynucleotide primers 5'-GAGGATCCATGCATAATAATtoGAGAATG and 5'-GAGAATTCCCTCGGGGGGCAGGGGGTT. These primers generated PCR products with Bam HI and Eco RI restriction sites at the upstream and downstream ends, respectively. The PCR products were cleaved with Bam HI and Eco RI (intron 1 of P.organotrophum was cleaved partially with Eco RI to avoid an internal site), and the resulting DNA fragments of 522 bp (I- Por I) and 564 bp (I- Dmo I) were purified from a 1.5% agarose gel, and ligated with Bam HI/ Eco RI-cleaved pET-HTG and pGEX-GTH vectors, described in Jensen et al. ( 29 ), yielding pET-HTG- Por I, pGEX-GTH- Por I, pET-HTG- Dmo I and pGEX-GTH- Dmo I.
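The restriction sites carried by the primers can be checked with a few lines of code. The sketch below is an illustration only, not part of the original protocol: it searches the two I- Por I primers, copied as printed above, for the Bam HI (GGATCC) and Eco RI (GAATTC) recognition sequences. The dictionary keys and the function name `find_sites` are hypothetical labels chosen for this example.

```python
# Recognition sequences of the two restriction enzymes named in the text.
SITES = {"BamHI": "GGATCC", "EcoRI": "GAATTC"}

# The two I-PorI primers as printed in the text (5' to 3').
# The dictionary keys are hypothetical labels used only in this sketch.
PRIMERS = {
    "PorI_upstream": "GAGGATCCATGGATATATTCCAGTATG",
    "PorI_downstream": "GAGAATTCCGAGGTCAAGATAATGGC",
}


def find_sites(seq):
    """Return {enzyme: [0-based start positions]} for every site found in seq."""
    seq = seq.upper()
    hits = {}
    for enzyme, site in SITES.items():
        positions = []
        start = seq.find(site)
        while start != -1:
            positions.append(start)
            start = seq.find(site, start + 1)
        if positions:
            hits[enzyme] = positions
    return hits


for name, seq in PRIMERS.items():
    print(name, find_sites(seq))
```

Run on a full insert sequence, the same function would also report internal sites, such as the internal Eco RI site that made partial digestion necessary for the P.organotrophum fragment.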

Endonuclease substrates

Cleavage assays were performed using pUC-P.isl and pUC-D.muc as substrates for I- Por I and I- Dmo I, respectively. pUC-P.isl contains a 205 bp PCR fragment of P.islandicum 23S rDNA [positions 1955-2159, D.mobilis numbering ( 30 )], and pUC-D.muc carries a 245 bp PCR fragment of D.mucosus 23S rDNA [positions 1915-2159, D.mobilis numbering ( 30 )], both inserted into the Hin cII site of pUC19. The rDNA sequences of P.islandicum and D.mucosus do not contain introns, but exhibit the same rDNA sequences in the vicinity of the intron-insertion sites as P.organotrophum and D.mobilis , respectively. Synthetic 25 bp DNA substrates were used in the protein footprinting experiments. They were generated by mixing two complementary 25 nucleotide oligodeoxynucleotides at 20 µM in 10 mM Tris-HCl (pH 8.0), 100 mM KCl, heating at 95°C for 30 s, followed by incubation at 65°C for 5 min and slow cooling to room temperature. The synthetic oligodeoxynucleotides used were 5'-GCGAGCCCGTAAGGGTGTGTACGGG and 5'-GCCTTGCCGGGTAAGTTCCGGCGCG and their complementary sequences, for I- Por I and I- Dmo I respectively.
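Only one strand of each 25 bp substrate is printed above; the partner strand follows by reverse complementation. The following sketch, which is an illustration and not part of the original methods, derives the complementary oligodeoxynucleotides and checks that neither substrate equals its own reverse complement, that is, that neither is palindromic over its full length. The names `SUBSTRATES` and `reverse_complement` are hypothetical.

```python
# Top strands of the two synthetic 25 bp substrates, as printed in the text (5' to 3').
SUBSTRATES = {
    "I-PorI": "GCGAGCCCGTAAGGGTGTGTACGGG",
    "I-DmoI": "GCCTTGCCGGGTAAGTTCCGGCGCG",
}

# Watson-Crick pairing table for unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence, written 5' to 3'."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


for enzyme, top in SUBSTRATES.items():
    bottom = reverse_complement(top)
    palindromic = top == bottom
    print(enzyme, len(top), "nt; bottom strand 5'-" + bottom, "; palindromic:", palindromic)
```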


Figure 2 . Endonuclease cleavage assays. I- Por I and I- Dmo I, expressed from pET-HTG and pGEX-GTH derivatives (Fig. 1), were tested for endonuclease activity before and after removal of the GST tag by endoproteinase thrombin cleavage (see Materials and Methods). The endonucleases were incubated with DNA fragments containing the cleavage sites, and the products were separated on a 1.2% agarose gel and visualized with ethidium bromide. Lanes 1-5: Pvu II cleaved pUC-P.isl DNA incubated alone (lane 1) or with fusion proteins HTG- Por I (lane 2) and GTH- Por I (lane 3), or with HTG- Por I (lane 4) and GTH- Por I (lane 5) after removal of the GST tag. Lanes 6-10: Pvu II cleaved pUC-D.muc incubated alone (lane 6) or with fusion proteins HTG- Dmo I (lane 7) and GTH- Dmo I (lane 8), or with HTG- Dmo I (lane 9) and GTH- Dmo I (lane 10) after removal of the GST tag. The lower band generated in the I- Dmo I cleavages (lanes 9 and 10) is barely visible in the reproduction. Owing to the reduced thrombin cleavage efficiency of GTH- Por I, this protein was only partially cleaved in this experiment and, therefore, only partial DNA substrate cleavage was observed (lane 5). In the following experiments we invariably used fully cleaved GTH- Por I, obtained by increasing the thrombin concentration and the incubation time (see Materials and Methods).

Preparation of GST-endonuclease fusion proteins

Overnight cultures of E.coli BL21/DE3 containing plasmids pET-HTG- Por I, pGEX-GTH- Por I, pET-HTG- Dmo I or pGEX-GTH- Dmo I were diluted 1:50 in 500 ml LB medium containing ampicillin at 100 µg/ml, and incubated at 30°C to an A600 value of ~0.8. Isopropyl-1-thio-β-D-galactopyranoside was added to 0.5 mM, followed by a 2 h incubation. Cells were harvested and resuspended in 25 ml of PBS buffer [140 mM NaCl, 2.7 mM KCl, 10 mM Na2HPO4, 1.8 mM KH2PO4 (pH 7.3)] containing 0.2 mM phenylmethylsulfonyl fluoride, 0.5 µg/ml leupeptin and 2 µg/ml aprotinin. After sonication, 2.5 ml 10% Triton X-100 was added, and cell debris was removed by centrifuging at 11 000 r.p.m. for 15 min. A volume of 250 µl glutathione Sepharose (Pharmacia) was added to the supernatant, and it was incubated at room temperature for 1 h with gentle shaking. Sepharose beads were collected by centrifugation, washed five times with 4 ml PBS buffer, resuspended in 3.2 ml PBS buffer, 0.8 ml 80% glycerol and frozen at -80°C in aliquots of 100 µl.

Labeling and thrombin cleavage of GST-endonuclease fusion proteins

Ten microlitres of Sephadex G75 (Pharmacia, 75 mg/ml in H2O) was added as carrier to 100 µl fusion protein bound to glutathione Sepharose, prepared as described above. The mixture was washed three times with 1 ml of HMK buffer [20 mM Tris-HCl (pH 7.5), 100 mM NaCl, 12 mM MgCl2], and resuspended in 100 µl HMK buffer. Ten units bovine heart muscle kinase (Sigma) and 33 µCi [γ-32P]ATP (ICN, 7000 Ci/mmol) were added, and the mixture was incubated at room temperature for 1 h with gentle shaking. Sepharose beads were washed five times with 1 ml of 10 mM Tris-HCl (pH 8.0), 50 mM KCl, resuspended in 80 µl elution buffer [20 mM glutathione, 100 mM Tris-HCl (pH 8.0), 120 mM NaCl], and incubated at room temperature for 30 min with gentle shaking. Beads were removed by centrifugation, 0.2 U thrombin (Sigma) was added to the supernatant and the mixture was incubated at 4°C for 2 h. Fusion protein from the pGEX-GTH- Por I vector needed 2.0 U thrombin for 64 h to remove the GST-tag. Bovine serum albumin (BSA) and Triton X-100 were added to final concentrations of 0.5 mg/ml and 0.1%, respectively. The mixture was applied to a 1 ml Sephadex G75 gel filtration column pre-equilibrated with, and run in, TKT buffer [10 mM Tris-HCl (pH 8.0), 100 mM KCl, 0.1% Triton X-100] and 20 µl fractions were collected. Fractions containing full length endonuclease were pooled, adjusted to 35% glycerol, and stored at -20°C.


Figure 3 . Footprinting of I- Por I in the presence of the DNA substrate. I- Por I labeled at ( A ), the N-terminus (HTG- Por I) and ( B ), the C-terminus (GTH- Por I) was incubated with a control (non cognate) 25 bp DNA fragment (-) or with its cognate 25 bp DNA substrate (+), and probed with different proteinases. Protein fragments were then separated on SDS/polyacrylamide gels, which were subjected to autoradiography. Assignments of bands generated by the site specific proteinases are shown. Reduced proteinase cleavages in the presence of DNA substrate are indicated by closed arrows. Proteinase abbreviations are -, no proteinase; R, endoproteinase Arg-C; K, endoproteinase Lys-C; Tr, trypsin; D, endoproteinase Asp-N; E, endoproteinase Glu-C; Ch, chymotrypsin; Pn, pronase; PK, proteinase K; Th, thermolysin; B, bromelain. Major artefact bands are indicated by asterisks and are discussed in the text. Some breakdown products were generated in the absence of proteinases (see no proteinase lanes, -). However, an estimated fraction of >95% of labeled protein was intact before proteolysis in each experiment.

Endonuclease assays

I- Por I and I- Dmo I were mixed with Pvu II cleaved pUC-P.isl and Pvu II cleaved pUC-D.muc, respectively, in 10 µl binding buffer [50 mM HEPES-KOH (pH 8.0), 100 mM KCl, 1 mM DTT, 2% glycerol, 0.05% Triton X-100]. MgCl2 was added to 10 mM and reaction mixes were incubated at 80°C (I- Por I) or 65°C (I- Dmo I) for 15 min. DNA was precipitated with ethanol, redissolved in glycerol loading buffer [10 mM Tris-HCl (pH 8.0), 0.1 mM EDTA, 5% glycerol, 0.05% bromophenol blue] and run on an agarose gel.

Proteinase footprinting

About 20 000 d.p.m. 32P-labeled endonuclease (~1 pmol) was incubated in a siliconized micro-titer plate with 20 pmol of the 25 bp DNA substrate in 10 µl binding buffer containing 0.2 µg/µl BSA at 65°C for 5 min. Samples were transferred to a 50°C bath and incubated for 5 min, followed by addition of 10 µl of proteinase in binding buffer. After incubation at 50°C for 15 min, 4 µl of 100 mM Tris-HCl (pH 8.3), 4% sodium dodecylsulfate (SDS), 50 mM DTT, 0.2% bromophenol blue, 40% glycerol was added, and the samples were denatured at 95°C for 2 min, and run on a 20 × 40 × 0.04 cm 7% stacking/20% separation polyacrylamide/SDS-Tricine gel ( 31 ). Gels were subjected to autoradiography.

RESULTS

Generation of active end-labeled endonucleases


Figure 4 . Footprinting of I- Dmo I in the presence of the DNA substrate. I- Dmo I labeled at ( A ), the N-terminus (HTG- Dmo I) and ( B ), the C-terminus (GTH- Dmo I) was incubated with an unspecific 25 bp DNA fragment (-) or with its cognate 25 bp DNA substrate (+), and probed with different proteinases as described for I- Por I in Figure 3. Protein cleavages that were reduced or enhanced, in the presence of DNA substrate, are indicated by closed and open arrows, respectively. Major artefact bands are indicated by asterisks and are discussed in the text. Some breakdown products were generated in the absence of proteinases (see no proteinase lanes, -). However, an estimated fraction of >95% of labeled protein was intact before proteolysis in each experiment.


To facilitate identification of the proteolytic cleavages obtained in the protein footprinting approach, amino-terminal (N-terminal) and carboxy-terminal (C-terminal) 32P-labeled endonucleases were generated. Open reading frames from the P.organotrophum and D.mobilis introns, encoding I- Por I and I- Dmo I, respectively, were cloned into the vectors pGEX-GTH and pET-HTG designed for bacterial expression ( 29 ). Proteins expressed from these vectors (denoted HTG- Por I, GTH- Por I, HTG- Dmo I and GTH- Dmo I) contain a glutathione S-transferase (GST) tag of 220 amino acids at the C-terminus (HTG- Por I and HTG- Dmo I) or the N-terminus (GTH- Por I and GTH- Dmo I), which allows the fusion proteins to be purified on a glutathione Sepharose matrix in one step. At the opposite end of the protein they carry a heart muscle kinase site (RRASV), which allows that end to be labeled using heart muscle kinase and [γ-32P]ATP (Fig. 1 ). The rationale for placing the GST tag and the heart muscle kinase site at opposite ends is that 32P-labeled degradation products of the fusion proteins are not co-purified with the full length protein on the glutathione Sepharose matrix ( 29 ). After purification, the GST tag is removed by cleavage with the endoproteinase thrombin at a specific amino acid sequence (LVPRGS) inserted between the GST tag and the fused protein (Fig. 1 ).

The four GST-endonuclease fusion proteins were expressed in Escherichia coli , purified on glutathione Sepharose, and phosphorylated at the heart muscle kinase sites. The phosphorylated fusion proteins were checked for DNA cleavage activity on their cognate target sites in the presence and absence of the GST-tags (Fig. 2 ). For all four endonucleases, the GST fusion proteins showed very low cleavage activity (Fig. 2 , lanes 2, 3, 7 and 8), while removal of the GST-tag reactivated the enzymes (Fig. 2 , lanes 4, 5, 9 and 10). Therefore, we used only tag-less full length endonucleases in the following experiments.

Footprinting the DNA binding domains using proteinases

Protein domains involved in DNA binding are likely to become less susceptible to proteinase digestion upon DNA binding. In order to identify the DNA binding domains of I- Por I and I- Dmo I, N- and C-terminal 32P-labeled enzymes were incubated at 65°C, in the absence of Mg2+, with 25 bp DNA substrates containing the respective cleavage sites. Co-precipitation experiments, using biotinylated DNA substrate on streptavidin coated magnetic beads, showed that under these conditions the endonucleases bind to, but do not cleave, their substrates (data not shown). The temperature was reduced to 50°C and the accessibility of the polypeptide chain was assessed by probing with 10 proteinases of different specificities (Table 1 ). As a negative control, for each experiment, the substrate was replaced by a similar sized unrelated DNA fragment. Controls with no DNA gave proteinase cleavages identical to the controls containing unrelated DNA, showing that only the cognate DNA bound to the protein (data not shown). Protein fragments were separated on SDS/polyacrylamide gels, which were subjected to autoradiography (Figs 3 and 4 ). The bands were assigned to specific amino acids on the basis of band patterns generated with the site-specific proteinases. As a test for correct assignment, the mobilities of the protein fragments were plotted against the logarithm of their molecular weights, and they generally produced a straight line for masses above 5 kDa (Fig. 5 ), in good agreement with earlier reports ( 27 , 31 ).
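The calibration step described above amounts to a straight-line fit of log10(molecular weight) against migration distance, which can then be inverted to estimate the mass of an unassigned band. The sketch below illustrates the arithmetic only. The calibration points are synthetic values constructed to lie exactly on a line; they are not measurements from this study, and the function names are hypothetical.

```python
import math


def fit_calibration(distances_mm, masses_kda):
    """Least-squares fit of log10(mass) = slope * distance + intercept."""
    ys = [math.log10(m) for m in masses_kda]
    n = len(distances_mm)
    mean_x = sum(distances_mm) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in distances_mm)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(distances_mm, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_mass_kda(distance_mm, slope, intercept):
    """Invert the calibration line to estimate a fragment mass from its migration."""
    return 10 ** (slope * distance_mm + intercept)


# Synthetic calibration points (NOT data from the paper): they are generated
# from the made-up line log10(mass) = 1.5 - 0.005 * distance.
distances = [40.0, 80.0, 120.0, 160.0]
masses = [10 ** (1.5 - 0.005 * d) for d in distances]

slope, intercept = fit_calibration(distances, masses)
print("slope:", slope, "intercept:", intercept)
print("estimated mass at 100 mm (kDa):", estimate_mass_kda(100.0, slope, intercept))
```

With real gels the fit would be restricted to fragments above about 5 kDa, since the text reports linearity only in that range.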


Figure 5 . Calibration curves for the footprinting gels. Migration distances (from the bottom of the stacking gel) of the assigned fragments, generated by specific proteinases as shown in Figures 3 and 4, are plotted against the logarithm of their estimated molecular weights of N- and C-terminally labeled I- Por I [( A ) and ( B ), respectively] and N- and C-terminally labeled I- Dmo I [( C ) and ( D ), respectively].


Table 1 . Specificity and concentration range of each proteinase used

Proteinase                   Specificity (a)                  Final concentration (ng/µl)
Endoproteinase Arg-C         R-                               5.0
Endoproteinase Lys-C         K-                               5.0
Trypsin                      R-, K-                           0.1-0.5
Endoproteinase Asp-N         -D                               2.0
Endoproteinase Glu-C (V8)    E- > D-                          25
Chymotrypsin                 F-, Y-, W- > L-, M-, A-          0.25-0.5
Pronase                      Unspecific                       0.25
Proteinase K                 -Hydrophobic                     0.025-0.05
Thermolysin                  -L, -F, -I, -V, -M, -A           0.25
Bromelain                    Unspecific                       0.5-1

(a) A dash indicates the position of cleavage relative to the adjacent amino acid.
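The way Table 1 is used to assign bands can be illustrated in code. The sketch below is a simplified illustration, not the authors' procedure: it encodes the primary specificities of the five site-specific proteinases from Table 1 (a dash after the residue means cleavage on its C-terminal side, a dash before it means cleavage on its N-terminal side) and lists the lengths of the end-labeled fragments expected from single cuts. The weaker secondary preference of Glu-C for D is ignored, chymotrypsin is omitted, and the test peptide is hypothetical, not a sequence from this study.

```python
# Primary specificities from Table 1 as (residues, side of cleavage).
RULES = {
    "Arg-C": ("R", "after"),
    "Lys-C": ("K", "after"),
    "Trypsin": ("RK", "after"),
    "Asp-N": ("D", "before"),
    "Glu-C": ("E", "after"),  # weaker cleavage after D is ignored in this sketch
}


def cut_points(seq, proteinase):
    """Return sorted cut points c; a cut at c separates seq[:c] from seq[c:]."""
    residues, side = RULES[proteinase]
    cuts = set()
    for i, aa in enumerate(seq):
        if aa in residues:
            c = i + 1 if side == "after" else i
            if 0 < c < len(seq):  # a cut at either end yields no new fragment
                cuts.add(c)
    return sorted(cuts)


def labeled_fragment_lengths(seq, proteinase, label_end):
    """Lengths of the fragments that keep the end label after one single cut each."""
    cuts = cut_points(seq, proteinase)
    if label_end == "N":
        return cuts
    if label_end == "C":
        return sorted(len(seq) - c for c in cuts)
    raise ValueError("label_end must be 'N' or 'C'")


# Hypothetical 20-residue peptide used only for illustration.
TOY = "MARLDKEGYFRSADLKEVWT"

for name in RULES:
    print(name, "N-label:", labeled_fragment_lengths(TOY, name, "N"),
          "C-label:", labeled_fragment_lengths(TOY, name, "C"))
```

Because a primary cut at a given position yields an N-labeled fragment and a C-labeled fragment whose lengths sum to the full protein length, comparing the two labelings gives a consistency check on each assignment.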


Figure 6 . Summary of the footprinting data. Amino acid sequences of ( A ), I- Por I and ( B ), I- Dmo I. Protein footprinting data are superimposed on the sequences; results from N-terminally and C-terminally labeled proteins are given above and below the sequences, respectively. Filled and open circles indicate sites of strong and weak cleavages, respectively. Filled arrows denote sites of reduced cleavage in the presence of DNA substrate and open arrows show sites of enhanced cleavage. Letters beneath or above the circles and arrows designate the proteinases responsible for the cleavages as defined in the legend to Figure 3. A dashed line indicates that the exact identification of the cleavage site was not possible in this region. Cleavage sites for unspecific proteinases (Pn, PK, Th, and B) were estimated approximately from the calibration curves (Fig. 5). LAGLIDADG motifs are boxed. The data summarize at least five independent sets of experiments of which only one set is shown in Figures 3 and 4.


The assigned proteinase cleavage sites on I- Por I and I- Dmo I are summarized in Figure 6 . The cleavages generally cluster in proteinase sensitive regions interrupted by relatively resistant regions, probably reflecting exposed and buried peptide segments of the native protein, respectively. The main proteinase sensitive regions were located at ~0-20 and 40-60 amino acids C-terminal to each of the LAGLIDADG motifs of I- Por I (Figs 3 and 6 A). The same pattern was observed after the second motif of I- Dmo I, whereas the region following the first motif was generally more accessible (Figs 4 and 6 B). The advantage of using both N- and C-terminally labeled proteins is the ability to distinguish primary proteolytic cleavages (initial cleavages) from secondary ones (cleavages depending on an initial cleavage event). For I- Por I and I- Dmo I we generally observed similar cleavage patterns using both N- and C-terminally labeled proteins, implying that the proteolytic cleavages are primary events under the conditions used (Fig. 6 ). For both proteins, however, the criterion for primary cleavage was difficult to evaluate near the termini, owing to the lack of resolution (Figs 3 , 4 and 6 ).

A characteristic subset of the proteolytic cleavages became specifically inhibited when the target DNA substrate was added, suggesting that these sites are involved in DNA binding. I- Por I was protected against cleavage upon DNA binding at Y19 (chymotrypsin), near V21 (bromelain), at R57, R58, R60 (Arg-C, trypsin), R100 (Arg-C), F104 (chymotrypsin) and K135 (Lys-C, trypsin) (Figs 3 and 6 A). No enhanced proteinase cleavages were observed in I- Por I upon DNA binding. I- Dmo I was protected upon DNA binding at amino acids K26/28/30 (Lys-C, exact identification not possible), R157 (trypsin), Y161 (chymotrypsin), near Y161 (proteinase K) and L167 (pronase), and at D169 (Asp-N), K172 (Lys-C) and F173 (chymotrypsin). Enhanced cleavage occurred at E96 (Glu-C), D154/155 (Asp-N) and D169 (Glu-C), suggesting that I- Dmo I undergoes minor conformational changes upon DNA binding (Figs 4 and 6 B).


Figure 7 . Alignment of archaeal LAGLIDADG-type proteins. ( A ) The seven known archaeal LAGLIDADG-type proteins are aligned with respect to their LAGLIDADG-like motifs and their putative DNA binding regions. In the three intein encoded species, the most C-terminal putative DNA binding motif may be repeated two to three times as indicated in the alignment. Consensus sequences of the archaeal LAGLIDADG-type proteins and the eukaryotic LAGLIDADG-type endonucleases, given below the sequences, show amino acids occurring in at least 70% (bold face) and 50% (normal face) of the sequences, and residues in the individual sequences that match the motif are given in bold face. Reduced and enhanced proteinase cleavages in I- Por I and I- Dmo I, upon DNA binding, are given as filled and open triangles, respectively. To the left of the sequences, in parentheses, is indicated whether the proteins show (+), do not show (-) or have not yet been tested for (?) DNA endonuclease activity in vitro . N- and C-termini of the proteins are indicated by ( ) and ( ), respectively, and numbers before, between, and after, the sequences denote the number of amino acids omitted from the alignment. Proteins are: I- Dmo I-from the intron of D.mobilis 23S rDNA (7,8,15), I- Por I and Por-I2-from the first and the second intron of P.organotrophum 23S rDNA, respectively (9,17), Pae-I-from the intron of P.aerophilum 16S rDNA (10), Psp-PI-intein of Pyrococcus species strain GB-D DNA polymerase (19), and Tli-PI1 and PI- Tli I-first and second intein of Thermococcus litoralis DNA polymerase, respectively (18). ( B ) Model of archaeal homing enzymes, showing the proposed two domain structure and the positions of the LAGLIDADG like motifs (crosshatched boxes) and the putative DNA binding regions (open boxes). In brackets are shown the possibility of additional DNA binding motifs only found in the intein-encoded proteins, and the distances between the conserved motifs are indicated by arrows below.

Occasionally some diffuse bands were observed when C-terminally labeled I- Por I or I- Dmo I was cleaved with Arg-C or trypsin, and when N- and C-terminally labeled I- Dmo I was cleaved with Lys-C, pronase, proteinase K and thermolysin (Figs 3 B and 4 ). These bands varied dramatically in position relative to other bands in individual gel runs and were, therefore, easily identified and disregarded. This phenomenon has been described earlier ( 27 , 31 ), and the bands most likely contain small labeled peptides that migrate with the SDS front of the stacking gel, which can give rise to aberrant bands in the separation gel.

DISCUSSION

The proteinase footprinting approach used in this study has proven a simple and effective method for determining ligand binding sites on proteins. Previously it was applied successfully to studying the protein structure of the HIV-1 Rev protein and its interactions with the RNA substrate, cognate monoclonal antibodies and cellular proteins ( 27 , 28 ). Here we have invoked this approach to establish the DNA binding regions of two intron-encoded homing enzymes, I- Por I and I- Dmo I from hyperthermophilic archaea.

I- Por I and I- Dmo I were probed with 10 different proteinases, either alone or bound to their DNA substrates. The proteinase cleavages clustered, most clearly for I- Por I, within proteinase sensitive regions ~0-20 and 40-60 amino acids after each of the LAGLIDADG motifs, interrupted by a relatively proteinase resistant region of ~20 amino acids (Fig. 6 ). In the presence of the DNA substrate, a similar protection pattern occurred relative to the position of the two LAGLIDADG motifs in both proteins. For I- Por I protections were observed at a distance of 4-10 and 41-45 amino acids C-terminal to both of the LAGLIDADG motifs (Figs 6 A and 7 A). In I- Dmo I protections were observed six amino acids after the first LAGLIDADG motif and 39-55 amino acids after the second motif. The protected regions are likely to be involved directly in DNA binding, although some of the protections may be caused by conformational changes induced by the DNA. In addition, enhancements were observed 74 amino acids after the first, and 36 and 51 amino acids after the second LAGLIDADG motif (Figs 6 B and 7 A).

Almost all proteins belonging to the LAGLIDADG family contain two copies of the LAGLIDADG motif, and it has previously been noted that the distance between the two motifs is similar to the distance between the second motif and the C-terminus of the protein ( 32 ). An alignment of the regions C-terminal to the LAGLIDADG motifs revealed some sequence similarities, coinciding with the regions affected by DNA binding (Fig. 7 A). The significance of this conservation is strengthened by sequence comparisons with other known archaeal LAGLIDADG-type proteins (Fig. 7 A). Each protein contains a motif with the consensus sequence K/R,K/R-(3 aa)-Y/F-(6-7 aa)-K/R,E/D,K/R (based on at least 70 and 50% identity at the bold face and normal face positions, respectively) located 40-68 amino acids after the first LAGLIDADG motif, and repeated 37-60 amino acids after the second motif. The only exception, for the latter, is the protein encoded by intron 2 of P.organotrophum 23S rDNA (Por-I2, Fig. 7 A), which exhibits no DNA endonuclease activity either in total cell extracts or when expressed in vitro ( 17 ). Furthermore, the intein-encoded LAGLIDADG-type proteins, which have extended C-terminal sequences as compared with the intron-encoded species, contain two to three copies of the putative DNA binding motif in their C-terminus (Fig. 7 ). A sequence search within the eukaryotic LAGLIDADG-type homing enzymes revealed no such motif in the corresponding region. Based on these results, we propose a putative DNA binding motif, K/R,K/R-(3 aa)-Y/F-(6-7 aa)-K/R,E/D,K/R, which, together with the LAGLIDADG motif, defines the borders of a repetitive domain in archaeal homing enzymes (Fig. 7 B).
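The proposed consensus can be written as a regular expression for scanning protein sequences. The sketch below is an illustration under stated assumptions rather than the search procedure used in this study: it treats every consensus position as strictly required, whereas the consensus is defined by 70 and 50% identity thresholds, so degenerate copies of the motif would be missed. The test sequence is hypothetical and is not one of the aligned proteins.

```python
import re

# Consensus from the text: K/R,K/R-(3 aa)-Y/F-(6-7 aa)-K/R,E/D,K/R.
# The lookahead reports overlapping candidates as well.
MOTIF = re.compile(r"(?=([KR][KR].{3}[YF].{6,7}[KR][ED][KR]))")


def find_motifs(seq):
    """Return (1-based start, matched text) for every strict consensus match in seq."""
    return [(m.start() + 1, m.group(1)) for m in MOTIF.finditer(seq.upper())]


# Hypothetical sequence built piece by piece so the spacing is easy to verify:
# 3 leading residues, K/R pair, 3 spacer residues, Y, 6 spacer residues, KEK, 3 trailing.
TOY = "MSG" + "KR" + "AAA" + "Y" + "G" * 6 + "KEK" + "LLS"

print(find_motifs(TOY))
```

Relaxing individual positions, for example by allowing any residue at the normal-face positions, would bring the pattern closer to the percentage-based consensus at the cost of more false positives.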

The cleavage sites of >10 different LAGLIDADG homing enzymes have been characterized, and the general picture to emerge is that the recognition sequence spans ~20 bp of DNA, surrounding the centrally located 4 bp 3'-staggered cleavage site ( 17 , 20 , 23 , 24 , 33 , 34 ). Mutational studies have shown that the sequence recognition is flexible and that both sides of the cleavage sites are important ( 17 , 20 - 24 ). All the enzymes, except I- Cre I, have recognition sequences that are non-palindromic with respect to their cleavage sites (ref. 2 and references therein). It is therefore conceivable that the homing enzymes have evolved to recognize the non-palindromic DNA substrate as monomeric enzymes, allowing each of the two repetitive domains to interact with one of the two halves, symmetrically positioned around the DNA cleavage site. A similar mode of binding to non-palindromic DNA substrates has earlier been demonstrated for the specificity subunit of type I restriction enzymes ( 35 - 37 ). It is possible that I- Cre I, which is exceptionally short compared with other eukaryotic homing enzymes and contains only one recognizable LAGLIDADG motif ( 38 ), is a one-domain protein that binds as a homodimer to its palindromic substrate, as is the case for type II restriction enzymes. During the evolution of the homing enzymes, recombination of two different LAGLIDADG protein genes, or duplication of a single gene, could have given rise to monomeric enzymes. As a consequence, the requirement for symmetry at the intron homing site could be relieved, providing a biological advantage by expanding the possible homing site repertoire.

ACKNOWLEDGEMENTS

We thank Torben H. Jensen and Thomas Ø. Tange for technical advice on protein footprinting experiments. The work was supported in part by grants from Novo Nordic Fund and Danish Natural Science Research Council. J.L. was supported by Copenhagen University.

REFERENCES

1 Dujon, B. (1989) Gene, 82, 91-114.

2 Lambowitz, A. M. and Belfort, M. (1993) Annu. Rev. Biochem., 62, 587-622.

3 Johansen, S., Embley, T. M. and Willassen, N. P. (1993) Nucleic Acids Res., 21, 4405.

4 Shub, D. A., Goodrich-Blair, H. and Eddy, S. R. (1994) Trends Biochem. Sci., 19, 402-404.

5 Gorbalenya, A. E. (1994) Protein Sci., 3, 1117-1120.

6 Hensgens, L. A., Bonen, L., de Haan, M., van der Horst, G. and Grivell, L. A. (1983) Cell, 32, 379-389.

7 Kjems, J. and Garrett, R. A. (1985) Nature, 318, 675-677.

8 Kjems, J. and Garrett, R. A. (1988) Cell, 54, 693-703.

9 Dalgaard, J. Z. and Garrett, R. A. (1992) Gene, 121, 103-110.

10 Burggraf, S., Larsen, N., Woese, C. R. and Stetter, K. O. (1993) Proc. Natl Acad. Sci. USA, 90, 2547-2550.

11 Thompson, L. D. and Daniels, C. J. (1988) J. Biol. Chem., 263, 17951-17959.

12 Thompson, L. D. and Daniels, C. J. (1990) J. Biol. Chem., 265, 18104-18111.

13 Kjems, J. and Garrett, R. A. (1991) Proc. Natl Acad. Sci. USA, 88, 439-443.

14 Lykke-Andersen, J. and Garrett, R. A. (1994) J. Mol. Biol., 245, 846-855.

15 Dalgaard, J. Z., Garrett, R. A. and Belfort, M. (1993) Proc. Natl Acad. Sci. USA, 90, 5414-5417.

16 Aagaard, C., Dalgaard, J. Z. and Garrett, R. A. (1995) Proc. Natl Acad. Sci. USA, 92, 12285-12289.

17 Lykke-Andersen, J., Thi-Ngoc, H. P. and Garrett, R. A. (1994) Nucleic Acids Res., 22, 4583-4590.

18 Perler, F. B., Comb, D. G., Jack, W. E., Moran, L. S., Qiang, B., Kucera, R. B., Benner, J., Slatko, B. E., Nwankwo, D. O., Hempstead, S. K., Carlow, C. K. S. and Jannasch, H. (1992) Proc. Natl Acad. Sci. USA, 89, 5577-5581.

19 Xu, M.-Q., Southworth, M. W., Mersha, F. B., Hornstra, L. J. and Perler, F. B. (1993) Cell, 75, 1371-1377.

20 Colleaux, L., D'Auriol, L., Galibert, F. and Dujon, B. (1988) Proc. Natl Acad. Sci. USA, 85, 6022-6026.

21 Sargueil, B., Hatat, D., Delahodde, A. and Jacq, C. (1990) Nucleic Acids Res., 18, 5659-5665.

22 Wernette, C., Saldanha, R., Smith, D., Ming, D., Perlman, P. S. and Butow, R. A. (1992) Mol. Cell. Biol., 12, 716-723.

23 Marshall, P. and Lemieux, C. (1992) Nucleic Acids Res., 20, 6401-6407.

24 Schapira, M., Desdouets, C., Jacq, C. and Perea, J. (1993) Nucleic Acids Res., 21, 3683-3689.

25 Gimble, F. S. and Stephens, B. W. (1995) J. Biol. Chem., 270, 5849-5856.

26 Henke, R. M., Butow, R. A. and Perlman, P. S. (1995) EMBO J., 14, 5094-5099.

27 Jensen, T. H., Leffers, H. and Kjems, J. (1995) J. Biol. Chem., 270, 13777-13784.

28 Tange, T. Ø., Jensen, T. H. and Kjems, J. (1996) J. Biol. Chem., 271, 10066-10072.

29 Jensen, T. H., Jensen, A. and Kjems, J. (1995) Gene, 162, 235-237.

30 Leffers, H., Kjems, J., Østergaard, L., Larsen, N. and Garrett, R. A. (1987) J. Mol. Biol., 195, 43-61.

31 Schägger, H. and von Jagow, G. (1987) Anal. Biochem., 166, 368-379.

32 Michel, F. and Cummings, D. J. (1985) Curr. Genet., 10, 69-79.

33 Wenzlau, J. M., Saldanha, R. J., Butow, R. A. and Perlman, P. S. (1989) Cell, 56, 421-431.

34 Thompson, A. J., Yuan, X., Kudlicki, W. and Herrin, D. L. (1992) Gene, 119, 247-251.

35 Fuller-Pace, F. V. and Murray, N. E. (1986) Proc. Natl Acad. Sci. USA, 83, 9368-9372.

36 Gubler, M., Braguglia, D., Meyer, J., Piekarowitz, A. and Bickle, T. A. (1992) EMBO J., 11, 233-240.

37 Kneale, G. G. (1994) J. Mol. Biol., 243, 1-5.

38 Dürrenberger, F. and Rochaix, J.-D. (1991) EMBO J., 10, 3495-3501.



* To whom correspondence should be addressed



