ABSTRACT
The three-dimensional crystal structure of T5 5'-3' exonuclease was compared with those of two other members of the 5'-3' exonuclease family: T4 ribonuclease H and the N-terminal domain of Thermus aquaticus DNA polymerase I. Though these structures were largely similar, some regions of these enzymes show evidence of significant molecular flexibility. Previous sequence analysis had suggested the existence of a helix-hairpin-helix motif in T5 exonuclease, but a distinct, though related, structure is actually found to occur. The entire T5 exonuclease structure was then compared with all the structures in the complete Protein Data Bank and an unexpected similarity with gamma-delta ([gamma][delta]) resolvase was observed. 5'-3' exonucleases and [gamma][delta] resolvase are enzymes involved in carrying out quite different manipulations on nucleic acids. They appear to be unrelated at the primary sequence level, yet the fold of the entire catalytic domain of [gamma][delta] resolvase is contained within that of the 5'-3' exonuclease. Different large-scale helical structures are used by the two families to form DNA binding sites.
[gamma][delta] resolvases and T5 5'-3' exonuclease are enzymes involved in the processing of nucleic acids. No significant amino acid sequence similarity has been reported for these enzymes and they fulfil very different primary roles, in site-specific recombination and replication, respectively. Both types of enzyme bind to DNA and cut internucleotidic phosphodiester bonds in transesterification reactions (1,2). However, while T5 exonuclease simply hydrolyses the nucleic acid substrate, [gamma][delta] resolvase first forms a covalent protein-DNA adduct via a phosphoserine intermediate, which is then attacked by an incoming 3'-hydroxyl group, leading finally to strand exchange.
Resolvases bind a cointegrate, a compound plasmid carrying directly repeated copies of a transposon, and catalyse both cleavage and subsequent religation events. The process liberates the original transposon-carrying plasmid and the target plasmid complete with inserted daughter transposon (3). The 5'-3' exonucleases such as those encoded by bacteriophages T5 and T7, and the small fragment of eubacterial DNA polymerase I homologues are capable of structure-specific endonucleolytic cleavage (4,5). They are also able to release nucleotides exonucleolytically from DNA substrates containing free 5'-ends. Bifurcations, or flap structures, with displaced 5'-single stranded tails are cleaved close to the flap junction. Thus, the resolvases act in a sequence-specific manner, requiring correctly oriented copies of their cognate res sites in order to function, whereas T5 exonuclease requires only a free 5'-end and a divalent cofactor for catalysis (2,3,6).
The program PROTEP (7) was used to compare the structure of T5 exonuclease [PDB code 1EXN, (4)] with the other structures in the Protein Data Bank (8,9). The search revealed the expected similarities between T5 5'-3' exonuclease and the two other related 5'-3' exonucleases whose structures are known: Taq polymerase 5'-3' exonuclease domain [PDB code 1TAQ, (10)] and T4 RNase H [PDB code 1TFR, (11)] (Fig. 1). This search also detected a less extensive but very significant similarity between the catalytic domain of T5 exonuclease and that of [gamma][delta] resolvase [2RSL and 1GDT, (12,13)].
Primary sequence alignments were performed using the GAP program (14). Three-dimensional structure comparisons were performed using the PROTEP program (7), which uses a maximal common subgraph algorithm (15) to compare protein structures represented as linear helices and strands in three dimensions. The structure of T5 exonuclease was compared with all unique protein structures deposited in the May 1997 release of the Protein Data Bank (a subset of 4300 structures) (8,9). Regions of helix and strand in the compared structures were assigned using the algorithm of Kabsch and Sander (16). The position and direction of each helix or strand was represented by a vector in three dimensions (the nodes of the graph), and the distances and torsional angles between them were calculated (the edges of the graph) and stored in a database as a labelled graph.
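The graph representation and common-subgraph search described above can be illustrated with a minimal Python sketch. This is not the PROTEP implementation: the class `SSE`, the functions `edge` and `max_common_subgraph`, the tolerance values and the toy coordinates are all hypothetical, a simple inter-axis angle stands in for the torsional angles, and the common subgraph is found as a maximum clique in a correspondence graph using a basic Bron-Kerbosch search.

```python
import math
from itertools import combinations

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

class SSE:
    """A helix ('H') or strand ('E') reduced to a line segment (graph node)."""
    def __init__(self, kind, start, end):
        self.kind, self.start, self.end = kind, start, end

    @property
    def mid(self):
        return tuple((s + e) / 2.0 for s, e in zip(self.start, self.end))

    @property
    def axis(self):
        return sub(self.end, self.start)

def edge(a, b):
    """Edge label: (midpoint distance, inter-axis angle in degrees)."""
    diff = sub(a.mid, b.mid)
    dist = math.sqrt(dot(diff, diff))
    cosang = dot(a.axis, b.axis) / math.sqrt(dot(a.axis, a.axis) * dot(b.axis, b.axis))
    return dist, math.degrees(math.acos(max(-1.0, min(1.0, cosang))))

def max_common_subgraph(A, B, dist_tol=2.0, ang_tol=20.0):
    """Largest set of SSE pairings (i, j) whose pairwise geometry agrees."""
    # Correspondence graph: one node per same-type pairing of elements.
    nodes = [(i, j) for i, a in enumerate(A) for j, b in enumerate(B)
             if a.kind == b.kind]
    adj = {n: set() for n in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # an element may be matched only once
        d1, a1 = edge(A[i], A[k])
        d2, a2 = edge(B[j], B[l])
        if abs(d1 - d2) <= dist_tol and abs(a1 - a2) <= ang_tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))

    best = []

    def expand(r, p, x):
        # Bron-Kerbosch without pivoting; adequate for small graphs.
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = list(r)
            return
        for v in list(p):
            expand(r + [v], p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand([], set(nodes), set())
    return sorted(best)

# Toy data: B is A translated by 100 Å along x, plus one unrelated strand.
A = [SSE('H', (0, 0, 0), (10, 0, 0)),
     SSE('E', (0, 5, 0), (0, 5, 8)),
     SSE('H', (5, 10, 0), (5, 10, 12))]
B = [SSE(s.kind, tuple(c + d for c, d in zip(s.start, (100, 0, 0))),
         tuple(c + d for c, d in zip(s.end, (100, 0, 0)))) for s in A]
B.append(SSE('E', (130, 40, 0), (130, 40, 8)))

clique = max_common_subgraph(A, B)
print(len(clique), clique)
```

The size of the returned clique plays the role of the "maximum clique size" reported for each database hit; the exhaustive search shown here would need pivoting or other pruning to scale to a database of thousands of structures.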
The three-dimensional structures of the three 5'-3' exonucleases are shown in Figure 1. They show good agreement in their N-terminal catalytic domains (143 core C[alpha] atoms from T4 RNase H superimpose on T5 exonuclease with an RMSD of 1.64 Å; 110 core C[alpha] atoms from Taq superimpose with an RMSD of 1.55 Å). The only significant differences between these structures are in disordered regions not clearly determined in the X-ray structures and in a small region located toward the C-terminus. This region is poorly conserved at the sequence level (17). Sequence comparisons showed that the T4 and Taq enzymes were 20% identical and 48% similar, the T5 and Taq enzymes 26% identical and 54% similar, and the T5 and T4 enzymes 23% identical and 45% similar.
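For readers unfamiliar with the RMSD figures quoted above, the following Python sketch shows how the value is computed over paired C[alpha] positions. It assumes the two coordinate sets have already been optimally superimposed (the fitting step is not shown), and the function name `rmsd` and the toy coordinates are illustrative only, not taken from the structures discussed here.

```python
import math

def rmsd(coords_a, coords_b):
    """Root-mean-square deviation over paired atoms, in the input units (Å)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must pair up one-to-one")
    if not coords_a:
        raise ValueError("at least one atom pair is required")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))

# Two hypothetical atom pairs, each displaced by exactly 1 Å.
reference = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
shifted = [(0.0, 1.0, 0.0), (3.8, 0.0, 1.0)]
print(rmsd(reference, shifted))
```

Because the sum runs only over the paired "core" atoms, an RMSD is meaningful only alongside the number of atoms included, which is why both figures are quoted together in the text.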
The T5 exonuclease structure (4) contained two molecules in the asymmetric unit with near crystallographic symmetry. This deviation from perfect symmetry was first manifested by the observation that 2-fold averaged maps were substantially degraded when a single non-crystallographic symmetry operator was used in the averaging process. By allowing independent operators in the averaging process a substantial improvement in the maps was observed. The molecule divides into four regions that appear to be somewhat independent: region 1, residues 21-69, 92-198, 214-231 and 253-288; region 2, residues 71-90; region 3, residues 200-212; and region 4, residues 233-250. This is shown in Figure 1d as a colour-coded ribbon diagram for the four regions. The main part of the molecule, consisting of the [beta]-sheet and the surrounding helices, contains the region which is structurally similar to [gamma][delta] resolvase. The three other regions of T5 exonuclease, which do show conformational variability, are the helical arch, the 'finger' which touches the arch (residues 200-212) and the region at the base of the finger consisting of residues 233-250. These regions seem to move in tandem, and are dependent on the position of the helical arch. The arch shows the greatest deviation, followed by the finger region, and the 233-250 region shows the smallest deviation.
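The residue ranges listed above can be captured in a small lookup table, which makes it easy to check which region a given residue falls in. The Python sketch below is illustrative only: the names `REGIONS`, `REGION_NAMES` and `region_of` are hypothetical, and the descriptive labels are inferred from the paragraph above (the label for region 2 by elimination) rather than stated explicitly in it.

```python
# Residue ranges (inclusive) for the four regions, as listed in the text.
REGIONS = {
    1: [(21, 69), (92, 198), (214, 231), (253, 288)],
    2: [(71, 90)],
    3: [(200, 212)],
    4: [(233, 250)],
}

# Labels inferred from the surrounding description (an assumption).
REGION_NAMES = {
    1: "main part (beta-sheet and surrounding helices)",
    2: "helical arch (inferred by elimination)",
    3: "finger",
    4: "base of the finger",
}

def region_of(residue):
    """Return the region number containing a residue, or None if unassigned."""
    for region, spans in REGIONS.items():
        for first, last in spans:
            if first <= residue <= last:
                return region
    return None

for residue in (80, 100, 204, 240, 70):
    region = region_of(residue)
    print(residue, region, REGION_NAMES.get(region, "not assigned in the text"))
```

Residues that fall between the listed ranges (for example 70, 91 or 199) are returned as `None`, since the text does not assign them to any region.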
Table 1
Homologous regions of T5 exonuclease and Taq Pol located near the C-termini of the nucleases were predicted to adopt a helix-hairpin-helix (HhH) structure by Ponting and co-workers (18). However, as can be seen in Figure 2, this does not appear to be the case in the crystal structure reported for T5 5'-3' exonuclease. This structure does show some similarity with the HhH motif except that in place of a hairpin a small loop (residues 200-211) has been incorporated. The end of the loop contains one of the conserved aspartate residues (D204) known to be involved in cofactor binding. This region overlaps completely with the finger region referred to above. No clearly analogous structure could be identified in the Taq Pol structure.
The results of the structural similarity search are shown in Table 1. Though elements of similarity between the structure of T5 exonuclease and other (non-homologous) proteins were noted, it was found to resemble most closely the catalytic domain of [gamma][delta] resolvase. The arrangement of [beta]-strands in this domain is related to that of the Rossmann fold, and at a somewhat lower level of significance the search shows that this region of the structure also resembles certain dehydrogenases and the DNA-binding HhaI methyltransferase. It is therefore possible that this resemblance reflects a shared convergent elaboration of a common topology and is not indicative of a divergent evolutionary relationship. However, the [gamma][delta] resolvase similarity involves the entire catalytic domain of [gamma][delta] resolvase, whilst the other non-exonuclease hits are not only at a lower level of significance, as measured by the number of secondary structure elements superposed, but also involve only parts of those enzymes' respective domains.
The three representatives of the 5'-3' exonuclease family which have been structurally characterised share a significant degree of primary sequence similarity (20-26% identity) including all 10 absolutely conserved residues postulated as defining a prokaryotic 5'-3' exonuclease motif sequence (17). There are differences between these three structures, most notably at the C-terminus, where the sequence similarity is poorest. In the region described as a helical arch in the T5 exonuclease (4) the equivalent regions in the other two nucleases are disordered. One unresolved question is the reason for this disorder in two of the three members of this family and whether it is functionally significant.
In an analysis using pre-release structural data relating to Taq Pol, Ponting and co-workers (18) proposed that it and T5 exonuclease should contain the HhH motif described originally by Thayer et al. (22). This motif is implicated in DNA binding. However, the HhH domain was not present in the three crystal structures examined. T5 exonuclease does contain a related structure, similar to the HhH domain, but with a short loop of 10 amino acids replacing the proposed hairpin. As can be seen from Figure 2, the helices of the HhH domain do superimpose very well on the helix-loop-helix observed in T5 exonuclease.
The differences between Taq polymerase, T4 RNase H and T5 exonuclease, and the high temperature factors observed in the C-terminal region of the 5'-3' exonuclease domain of Taq polymerase may be viewed as corroborating evidence for innate structural flexibility in the 5'-3' exonuclease family. Alternatively, this may reflect genuine structural differences in regions of low sequence similarity within the 5'-3' exonuclease family.
The observation that the catalytic domains of 5'-3' exonucleases possess a high degree of structural similarity to that of [gamma][delta] resolvase was totally unexpected. The structurally related unit involves the catalytic, rather than the DNA-binding, domain of [gamma][delta] resolvase. The resolvases act in a sequence-specific manner while the 5'-3' exonucleases show no sequence specificity per se; thus their major biochemical similarity lies in the ability to bind DNA and in the transesterification reaction.
The 5'-3' exonucleases recognise certain DNA structures and require only a free 5'-end. The resolvases demonstrate a high degree of sequence specificity for their DNA-cleavage and rejoining reactions. This sequence specificity resides in domain three, totally separate from the catalytic domain. The specificity domain comprises a three helix bundle arranged in a helix-turn-helix motif (13). This region of [gamma][delta] resolvase does not show any significant structural similarity with the 5'-3' exonucleases. The structure of the [gamma][delta] resolvase recognition domain in complex with a 34 bp oligonucleotide shows that DNA contacts the catalytic domain of [gamma][delta] resolvase (13). The structural comparisons reported here raise the intriguing possibility that the catalytic domains of 5'-3' exonucleases and [gamma][delta] resolvase may employ related DNA-binding mechanisms. No structural information has yet been reported for a 5'-3' exonuclease-DNA complex but a threading model was proposed by the Suck laboratory (4) (Fig. 3b).
In an analysis of the structure of the [gamma][delta] resolvase at 2.3 Å, Rice and Steitz observed that there were three slightly different conformations present in the crystal form they were studying (23). The major differences they observed were (i) in the twist of the [beta]-sheet, (ii) in the conformation of the loop between [beta]-strand 2 and [alpha]-helix B and (iii) in the angle at which helix A packs against the rest of the protein. They concluded that the conformational flexibility of the molecule is likely to allow [gamma][delta] resolvase to form the synaptic complex (i.e., a multi-subunit complex) and to present DNA binding sites of different geometry for the catalysis of recombination.
The role of conformational flexibility in 5'-3' exonuclease function is still uncertain. There are two components to this flexibility. First, the helical arch structure observed in the T5 nuclease was found to be disordered in the two other 5'-nuclease structures; second, there was an inherent molecular flexibility observed in the T5 nuclease structure (Fig. 1d). This may allow processing of diverse flap structures produced during strand displacement synthesis. Since the duplex sequence at the point of bifurcation would vary, the local DNA structure would also vary. To allow binding and cleavage irrespective of nucleotide sequence, a degree of flexibility in the nuclease molecule may be required.
It is clear that in both nuclease and resolvase structures the catalytic sites are positioned in the structurally conserved core regions. The reaction mechanisms of the exonucleases and the resolvases are quite different. The resolvase mechanism requires a first step involving cleavage of DNA strands and replacement of the phosphodiester bond by a phosphoserine bond between the 5' phosphate at the cleavage site and the hydroxyl of Ser10 in the resolvase. Subsequently, strand exchange is effected by a second transesterification reaction leading to the reformation of the phosphodiester backbone. In contrast to the resolvase, the mechanism of the exonucleases requires the involvement of divalent metal ions (4,10,11), and while the active site residues occupy similar positions with respect to the common fold of the catalytic domain, there appears to be no immediately obvious chemical relationship between them. In particular, the initial cleavage reaction carried out by [gamma][delta] resolvase does not require a metal cofactor, though magnesium does stimulate the recombination step (19).
However, there are certain chemical similarities in that both enzymes cleave DNA and there is a similar transesterification step, to water or serine, involved in the mechanisms. The structural similarities may therefore indicate the existence of a remote evolutionary ancestor for the core domain of these two classes of enzymes. In this context it is most interesting to observe that the 5'-3' exonucleases exhibit ribonuclease H activity and this may constitute a direct link to their proposed common ancestral protein. One may speculate that this ancestral protein had a simple single-stranded nucleolytic function which has been retained, whilst the ability to bind double-stranded DNA has been added on in different ways in the two proteins: in [gamma][delta] resolvase by dimerisation to form a large [alpha]-helical structure and in the 5'-3' exonucleases by addition of a C-terminal region that binds the duplex part of the DNA. Thus, it seems possible that both molecules diverged from a common catalytic core domain, but that convergent evolution is responsible for the superficially similar flexible arch-like structures used to bind DNA.
This work benefited from the use of the SEQNET facility at Daresbury Laboratory, UK. We thank the MRC for support to PJA. The Krebs Institute is a BBSRC Biomolecular Science Centre (PJA and JRS).
Protein name                     Brookhaven ID codes  Maximum clique size  RMSD (Å)  Comments
-------------------------------  -------------------  -------------------  --------  ----------------------------------------------
T5 5'-3' exonuclease             1EXN                 25                   0.0       Self
T4 RNaseH                        1TFR                 18                   3.1       Exceedingly similar
Taq DNA polymerase               1TAQ, 1TAU           14                   4.5       Very similar
[gamma][delta] resolvase         2RSL, 1GDT           9                    3.3       Entire fold of catalytic domain of GDR present
Biotin carboxylase               1BNC                 9                    5.3       Poor helix overlap
Aconitase                        1ACOa                8                    3.2       Part of N terminal domain of aconitase
HhaI methyltransferase           5MHT                 8                    3.3       Good overlap, DNA binding protein
Aldehyde dehydrogenase           1AD3                 8                    3.8       Increasingly poor overlaps
Glucose-fructose oxidoreductase  1OFG                 8                    3.8
Glycogen phosphorylase           1ABBa                8                    4.0
Malate dehydrogenase             1BMD                 8                    4.1
UDP-galactose 4-epimerase        1UDP                 8                    4.3
Glycine N-methyltransferase      1XVA                 8                    4.5
REFERENCES

