ABSTRACT
The HMG box of human LEF-1 (hLEF-1, formerly TCF1α) has been expressed in four forms: a parent box of 81 amino acids and
constructs having either a 10 amino acid C-terminal extension, a 9 amino acid N-terminal extension, or both. These four species have been compared
for DNA binding and bending ability using a 28 bp recognition sequence from the TCRα-chain enhancer. In the bending assay, whereas the parent box and that with the N-terminal extension bent the DNA by 57/58°, the box extended at the C-terminus bent the DNA by 77/78°, irrespective of the presence or absence of the N-terminal extension. A 6-fold increase in DNA affinity also resulted from addition of both
terminal extensions. These observations redefine the functional boundaries of
the HMG box. The structure of a mouse LEF-1/DNA complex recently published [Love et al. (1995) Nature 376, 791-795] implies that the higher DNA affinity and in particular the increased
bend angle observed are consequences, at least in part, of the C-terminal extension spanning the major groove on the inside of the DNA
bend.
The HMG box is a short DNA binding motif found in proteins from a wide range of eukaryotes (for recent reviews see 2,3). HMG boxes fall into two categories: those that recognize a specific DNA sequence, e.g. the single boxes of LEF-1 (4,5) and SRY (6,7), and those that do not, e.g. the two boxes from HMG1 (8) and the six boxes of human upstream binding factor UBF (9). A common feature of all HMG boxes, however, is their ability to bend the DNA to which they bind. In the case of sequence-specific HMG boxes this has been assessed using the circular permutation gel retardation assay, in which the recognition sequence is incorporated at varying positions within DNA fragments of constant length (10-14). For non-sequence-specific boxes, bending ability has been assessed indirectly using the ligase-mediated circularization assay (15,16), by the ability to form DNA loops (17,18) and by exploiting their structure-specific binding to a fully defined cis-platinated DNA adduct (19). Some variation has been reported in the bend angle induced by a single HMG box: for the non-specific box 2 of HMG1 the induced bending on binding to a 1,2-intrastrand cis-platin DNA adduct is ~70° (19). For sequence-specific boxes, reported values range from 130° for mouse LEF-1 (mLEF-1; see 11) to 30° for human SRY (hSRY; 12). A considerable variation in dissociation constants has also been observed for HMG box binding to DNA: reported values range from 3 × 10⁻¹¹ M for mouse Sox-4 (20) to 1 × 10⁻⁹ M for mLEF-1 (21) and mouse Sox-5 (13), and 2 × 10⁻⁸ M for hSRY (22).
A feature of some importance is the precise length of the expressed HMG box
polypeptides in the DNA binding and bending assays. Several papers have shown
that extensions beyond the minimal HMG box lead to increased DNA affinity. In the case of hLEF-1 (23), it was shown that addition of only six basic amino acids C-terminal to the minimal HMG box resulted in a gel-retarded complex, whilst the minimal box alone did not bind to the DNA recognition sequence under the conditions used. Using the non-sequence-specific HMG box of Chironomus cHMG1 (24), it was demonstrated that inclusion of an additional 19 amino acids C-terminal to an 84-residue HMG box resulted in much enhanced binding to 4-way junction DNA. Teo et al. (25) compared a mammalian minimal HMG1 box 2 (designated B) with an extended form (designated B') having 20 additional C-terminal and four additional N-terminal residues, both extensions being strongly basic.
These authors found that B' had increased affinity for 4-way junction DNA and supercoiled DNA, and moreover was more
effective than B in a supercoiling assay. They concluded that not only the DNA
binding but also the DNA bending activity is augmented by the inclusion of the
basic arms. A functional relevance of basic extensions to an HMG box has been demonstrated in the case of the human mitochondrial transcription factor A (h-mtTFA; 26). By the use of mutants and yeast/human chimeric molecules, a critical role was
shown in DNA recognition and transcription factor activation for both the 25-residue arm C-terminal to the second box and the interbox spacer.
In none of the above cases, in which additional sequences were added N- or C-terminal to the minimal HMG box, was a change in induced DNA bend angle measured. The present work uses the circular permutation assay with several constructs of the sequence-specific HMG box of hLEF-1 in order to measure directly the consequences for bending of the N- and C-terminal extensions. In parallel, the DNA binding affinities
of the constructs have also been measured.
Constructs hLEF-1 (285-374), hLEF-1 (294-384) and hLEF-1 (285-384) were produced by an extension PCR
reaction with the clone of the parent HMG box (294-374) as template. The PCR product was purified by preparative agarose gel
electrophoresis, ligated into the pGEX-2T vector DNA (27) and transformed into Escherichia coli BL21 (DE3) pLysS. Dideoxynucleotide sequencing of both strands confirmed the correct inserted DNA sequence. Proteins were expressed and purified as described (14). Electrospray mass spectrometry confirmed that the correct proteins had been expressed and the following values were obtained: hLEF-1 (285-374) 10 997 ± 1 Da, expected 10 998 Da; hLEF-1 (294-384) 11 127 ± 2 Da, expected 11 128 Da; hLEF-1 (285-384) 12 321 ± 1 Da, expected 12 322 Da; hLEF-1 (294-374) 9806 ± 2 Da, expected 9805 Da.
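The agreement between observed and expected masses can be checked with a few lines of arithmetic. The sketch below is illustrative only: the numbers are copied from the text, while the dictionary layout and the helper name `within_uncertainty` are assumptions introduced here.

```python
# Observed mass, quoted uncertainty and expected mass (all in Da),
# copied from the electrospray values reported in the text.
MASSES = {
    "hLEF-1 (285-374)": (10997, 1, 10998),
    "hLEF-1 (294-384)": (11127, 2, 11128),
    "hLEF-1 (285-384)": (12321, 1, 12322),
    "hLEF-1 (294-374)": (9806, 2, 9805),
}


def within_uncertainty(observed, uncertainty, expected):
    """Return True if the expected mass lies within the quoted error of the observed mass."""
    return abs(observed - expected) <= uncertainty


for name, (observed, uncertainty, expected) in MASSES.items():
    deviation = observed - expected
    ok = within_uncertainty(observed, uncertainty, expected)
    print(f"{name}: deviation {deviation:+d} Da, within quoted uncertainty: {ok}")
```

Each construct is compared against its own quoted uncertainty rather than a single global tolerance, since the reported errors differ between measurements.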
The circular permutation assay was performed as described (14) using the pBend4 plasmid (28) with 100 nM of both DNA and protein. Gel retardation assays were also carried out essentially as described (14). Varying amounts of HMG box proteins, quantified by UV spectroscopy, were incubated with 1 µg of poly[d(I-C)] in a 15 µl reaction containing 12% glycerol, 10 mM HEPES (pH 7.9), 100 mM KCl, 1 mM EDTA, 1 mM DTT, 0.1 mM PMSF and 0.33 mg/ml bovine serum albumin, for 10 min on ice. Labelled duplex 28 bp DNA was then added and the incubation continued for a further 25 min at room temperature. Binding reactions were resolved on a non-denaturing 7% polyacrylamide gel in 0.25× TBE. The sequence of the oligonucleotide was 5'-GATCTAGGGCACCCTTTGAAGCTCTCCC-3', in which the hLEF-1 binding site is CTTTGAA.
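As a quick consistency check on the oligonucleotide, the following minimal sketch confirms its length, locates the binding site, and shows that the site CTTTGAA on one strand reads TTCAAAG on the complementary strand, which is the form in which the same site is quoted later in the text. The function name `reverse_complement` and the constant names are illustrative choices, not part of the original work.

```python
# Top-strand sequence of the 28 bp duplex and the hLEF-1 binding site, from the text.
OLIGO = "GATCTAGGGCACCCTTTGAAGCTCTCCC"
SITE = "CTTTGAA"

# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the complementary strand, read 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


site_start = OLIGO.find(SITE)  # 0-based index of the first base of the site
print("length:", len(OLIGO))
print("site starts at 0-based index:", site_start)
print("site on the complementary strand:", reverse_complement(SITE))
```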
The minimal HMG box can be regarded as beginning 10 amino acids before the start of helix 1 and finishing at the end of helix 3. This N-terminal end corresponds precisely to the natural N-terminus of Drosophila HMG-D (29) and the C-terminus to that of the second HMG box of yeast ABF2 (30). It represents 71 residues in the case of HMG1 box 2, hSRY and hLEF-1. In a previous study (14) we noted that an HMG box from hLEF-1 that started two residues N-terminal to and finished eight residues C-terminal to the above minimal box exhibited a DNA bend angle of only 52°, in contrast to the larger values measured for other sequence-specific HMG boxes, in particular mouse LEF-1 (11). We therefore expressed three variants of this previously studied 81 amino acid parent HMG box from hLEF-1: with a 9-residue N-terminal extension, with a 10-residue C-terminal extension and with both extensions. DNA bend angles were determined for the four peptides using a circular permutation assay with restriction fragments cleaved from pBend4 incorporating a 28 bp oligonucleotide duplex taken from the TCRα-chain enhancer (5) that includes the hLEF-1 recognition sequence 5'-CTTTGAA-3'. DNA binding affinities were compared in a gel retardation assay using the same 28 bp oligonucleotide duplex.
Measurements of the bend angle generated by the four constructs are shown in Figure 1. A bend angle of 57/58° is calculated for the parent box and the box with the N-terminal extension. However, when the C-terminal extension is present, either alone or in combination with the N-terminal extension, the bend angle increases to 77/78°. A 20° increase in the DNA bend angle (i.e. by one-third) is therefore induced by the presence of the C-terminal arm alone. No effect on bending was
observed by the addition of the N-terminal arm to the parent box, even in the presence of the C-terminal arm.
The solution structures of two complexes between HMG boxes and DNA have recently been published: that from hSRY bound to 8 bp of DNA (31) and that of mLEF-1 bound to 15 bp of DNA (1). In both complexes the fold of the protein component does not differ markedly from that previously determined for the non-specific HMG boxes as free protein (32-34). Human LEF-1 differs by only one amino acid from mouse LEF-1 in the HMG box, and the polypeptide used in the structure determination (1) is four residues shorter at the N-terminus than our parent box, but only two residues shorter at the C-terminus than the C-terminally extended box used here, i.e. it lacks the final LQ dipeptide. The structure of the mLEF-1/15 bp DNA complex (1) offers an explanation of the bend angle changes shown in Figure 1. In all sequence-specific HMG boxes, helix 3 is interrupted by a proline at position 69, with the result that the polypeptide chain turns through approximately a right angle. A tyrosine located eight residues C-terminal to this proline in mLEF-1 (residue 372 in Fig. 1) binds in the minor groove and serves, in part, to fix this change in direction of the polypeptide chain. In the mLEF-1/DNA structure the protein chain is observed to continue across the major groove on the inside of the DNA bend, so that an arginine residue (R378 in Fig. 1) makes contact with the phosphodiester chain 7 bp away at the further end of the duplex. This arginine is the 4th residue of the present C-terminal extension of hLEF-1. Thus the 10 additional C-terminal residues in this work correspond to those that span the major groove in the mLEF-1 structure. In their absence a reduced bend angle is to be expected.
There are some differences in the reported bend angles generated by the LEF-1 HMG box when bound to the sequence TTCAAAG. Two different algorithms have been widely used in calculating the DNA bend angles induced by HMG boxes. That of Ferrari et al. (10) is based on the Levene and Zimm model (35) for the reptation of curved rods (DNA) through a gel, simplified by assuming a single bend at a fixed point in the rod. This leads to a quadratic equation relating the mobility of the permuted DNA/protein complexes (with respect to the mobility of the free DNA) to the flexure displacement (the position of the centre of the protein binding site with respect to an end). All data points are used in fitting to the quadratic. The Thompson and Landy algorithm (36) relates the mobility of a complex having protein bound at the middle of the DNA ($\mu_M$), with respect to the mobility of the complex having the protein bound at the end ($\mu_E$), to the bend angle $\alpha$: $\mu_M/\mu_E = \cos(\alpha/2)$. This relationship was calibrated using phased A-tracts of known bend angle (18°). In the present work, the C-terminally extended hLEF-1 box induces a bend of 77/78°, calculated using the algorithm of Ferrari et al. (10), whilst the reported bend angle for mLEF-1 is ~130° (11), a value obtained using longer DNA fragments and the algorithm of Thompson and Landy (36) for calculating the bend angle. In the structure of the mLEF-1/DNA complex, the angle is stated as ~117° (1). We have analysed the data of Giese et al. (11) using the algorithm of Ferrari et al. (10) and derive a bend angle of ~100°. Using a shorter DNA fragment of 100 bp containing the consensus binding sequence AACAAAG, and calibrated using phased A-tracts, a bend angle of 102° was measured for the mLEF-1 HMG box (19). It was also noted that a second shifted complex of lower mobility (presumably containing additional protein) exhibited an increased bend of 125°. Although the discrepancy in the apparent bend angles for a LEF-1 HMG box between the ~80° observed here and ~100° found elsewhere is currently unresolved (and may in part depend on the length of the DNA fragments used), a structural basis for an increased bend angle as a consequence of adding the 10 highly basic C-terminal amino acids to the parent box of hLEF-1 can clearly be seen in the mLEF-1/DNA complex structure (1).
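The Thompson and Landy relationship quoted above, in which the ratio of the mobility of the middle-bound complex to that of the end-bound complex equals cos(α/2), can be inverted to give the bend angle directly. The sketch below does only that. It is not the Ferrari et al. quadratic fit used for the values reported in this work, and the function name `bend_angle_deg` and the mobility values in the example are hypothetical.

```python
import math


def bend_angle_deg(mu_middle, mu_end):
    """Bend angle (degrees) from relative mobilities via mu_M / mu_E = cos(alpha / 2).

    mu_middle: mobility of the complex with protein bound at the middle of the fragment.
    mu_end:    mobility of the complex with protein bound near the end of the fragment.
    """
    ratio = mu_middle / mu_end
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("mu_middle / mu_end must lie between 0 and 1")
    return math.degrees(2.0 * math.acos(ratio))


# Hypothetical mobilities, for illustration only (not data from this study).
example_angle = bend_angle_deg(0.60, 0.85)
print(f"apparent bend angle: {example_angle:.1f} degrees")
```

Because only the two extreme positions enter the calculation, this estimate is more sensitive to error in a single band position than a fit through all permuted fragments.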
The DNA bend angle generated by mSRY, determined using the same DNA fragments and algorithm as for mLEF-1 (11), is only 85°, whilst Chow et al. (19) obtained a value of 80° and Ferrari et al. (10) measured a bend angle of 73° for hSRY. For the closely related HMG box from mouse Sox5, a bend of 74° was estimated (13) using the Thompson and Landy algorithm [when re-calculated using the algorithm of Ferrari et al. (10) the value is 70°]. The bend angle generated by SRY and SRY-related HMG boxes (SOX) thus appears always to be somewhat less than that generated by the LEF-1 HMG boxes. Comparison of the SRY and LEF-1 sequences C-terminal to the minimal HMG box region suggests that the segment SARDNYG, which in LEF-1 comes before a run of highly basic amino acids, could be regarded as an insertion into the SRY and SOX protein sequences. The consequence could thus be that whilst the C-terminal extension of the LEF-1 HMG box spans the major groove, the 'corresponding' run of basic residues in SRY (RPRPK) cannot do so, which would explain why larger bend angles have consistently been reported for LEF-1 HMG boxes. This matter is unfortunately not resolved by the SRY/DNA structure (31), since the DNA segment contacted by the basic C-terminal extension of mLEF-1 is absent from the structure of the SRY/DNA complex.
The increased binding affinity resulting from addition of C-terminal residues is readily understood on the basis of a structure in which this element spans the major groove (1) and makes further contacts with the DNA. Cooperativity between the effects of
adding both extensions, i.e. the N-extension increases affinity only if the C-extension is present, could be simply interpreted as a consequence
of better folding of the minor wing of the protein when the domain is extended
in both directions. The proximity of the N- and C-terminal ends of the minimal box may thus mean that mutual interaction of
the extensions is important. Alternatively, the cooperativity between the
extensions may be indirect and mediated by their binding to the DNA. Thus only
when the DNA is appropriately distorted by the binding of the C-terminal extension across the major groove is an appropriate site created
for effective binding of the N-terminal extension. Such details may be resolved from the structure of a
complex containing both extensions of the LEF-1 HMG box and a DNA longer than 15 bp.
We acknowledge the financial support of the Wellcome Trust and the help of Dr P.
D. Cary in the purification of the recombinant proteins.
REFERENCES
