Solution structure of the ATF-2 recognition site and its interaction with the ATF-2 peptide
Maria Rosaria Conte*, Andrew N. Lane and Graham Bloomberg1
Division of Molecular Structure, National Institute for Medical Research, The Ridgeway, Mill Hill, London NW7 1AA, UK and 1Department of Biochemistry, School of Medicine, University Walk, University of Bristol, Bristol BS8 1TD, UK
Received July 7, 1997; Revised and Accepted August 12, 1997
PDB no. BNL-9505
ABSTRACT
The effect of leucine zipper proteins binding to the DNA recognition site is controversial. Results from crystallography, gel and solution methods have led to opposite conclusions about the conformation of the DNA in the complex. The role of the DNA binding site in the recognition process and in the gene induction mediated by transcription factors needs to be investigated further. In this article the self-complementary 16 bp oligodeoxynucleotide d(CATGTGACGTCACATG)2, which contains the cAMP response element recognised by numerous transcription factors of the leucine zipper family, has been examined free from proteins and in its interaction with the mammalian activating transcription factor 2. The recognition process has been investigated by circular dichroism analysis, which has revealed conformational changes in both DNA and protein upon binding. The solution structure of the 16mer, which is needed to define the effects induced by binding of leucine zipper proteins and the intrinsic bending properties of the DNA, has been determined from NMR data using direct refinement against NOE intensities, analysis of scalar coupling constants and restrained molecular dynamics calculations. Final structures starting from the A and B forms of DNA agreed to a pairwise root mean square deviation (r.m.s.d.) of 1.04 ± 0.3 Å (0.7 ± 0.2 Å to the average) for all atoms. The terminal base pairs were less well determined, and the pairwise deviation of the 12 core bp was 0.83 ± 0.27 Å (0.55 ± 0.19 Å to the average). The final structures are within the B-family with an average helical twist of 36 ± 2°. No significant intrinsic DNA bend is observed in the activating transcription factor regulatory site. However, there are substantial deviations from canonical B-DNA (r.m.s.d. = 3.6 Å) in the core of the molecule, associated with relatively large base inclinations.
INTRODUCTION
Gene induction by transcription factors in prokaryotes and eukaryotes is accomplished by the binding of specific transcription activators to a cis-regulatory element, an event that is communicated to the basal transcription machinery at the start site of transcription (1). In this picture the DNA is distorted via protein-induced bends, which bring together the distantly positioned regulatory element and the basal transcription machinery. Tests of this general hypothesis have in some instances demonstrated that the specific function of protein-induced bends in DNA can be subserved by heterologous protein-induced bends or by intrinsic (sequence-directed) elements of DNA curvature (2).
Protein-induced bending of DNA has been investigated for numerous transcription factors, including members of the leucine zipper protein family. Whilst it has been demonstrated that some transcriptional regulators, such as the Escherichia coli catabolite activator protein (CAP) (3,4) and high mobility group (HMG) box proteins (5,6), are able to bend their DNA targets, it remains uncertain whether the leucine zipper transcription factors induce different DNA bends and orientations (see below).
As far as leucine zipper/DNA complexes are concerned, X-ray structures of GCN4 bound to two different recognition sites [AP-1 and ATF (activating transcription factor)/CREB] (7,8) and of the Fos-Jun heterodimer bound to the AP-1 site (9) indicate that the protein conformation changes on binding whereas the DNA remains in an essentially unkinked B conformation. This agrees with data on the Fos-Jun heterodimer complexed with the AP-1 target site obtained by a combination of gel and solution methods (10), but appears to contradict some gel electrophoretic phasing analyses, which suggest that the DNA (AP-1 site) is bent in the complex with the Fos-Jun heterodimer and the Jun homodimer (11,12). On the other hand, the anomalous electrophoretic mobilities observed with GCN4 (13) and with Myc/Max (14) were interpreted as an effect of the shape of the complex, a consequence of the leucine zipper protein motif rather than of DNA bending, in agreement with the X-ray results. The disagreement appears to stem from inconsistencies among the different approaches used, which include circular permutation assays, phasing analysis and DNA cyclization assays (15).
In order to understand the mechanism of the leucine zipper/DNA recognition process, we have used different solution methods [gel electrophoresis, circular dichroism (CD), NMR and molecular modelling; see below] to analyse the interaction and the conformations of both the DNA and protein components, so that the effects of forming the specific complex can be properly compared.
The octanucleotide 5'-TGACGTCA-3', identified as a cAMP response element (CRE), is an enhancer that responds to increases in the intracellular cAMP concentration (16,17). The highly conserved CRE element has been found in the transcriptional regulatory regions of a large number of eukaryotic genes, such as those of somatostatin, fibronectin, human vasoactive intestinal polypeptide (VIP), human chorionic gonadotropin α-subunit (α-hCG), tyrosine hydroxylase, human proenkephalin, P-enolpyruvate carboxykinase and human growth hormone (hGH) (18) (for reviews see 19,20). It has also been shown to confer responsiveness to E1a in several adenovirus genes by interaction with ATF-2 (21,22).
In this article the hexadecamer d(CATGTGACGTCACATG)2, containing the consensus CRE core sequence and flanking regions for ATF-2 (23), has been investigated in its interaction with the basic region-leucine zipper ATF-2 homodimer by CD. Preliminary results (see below) showed that in addition to the expected increase in helical content of the peptide, there is also a change in the near UV CD of the DNA on forming a specific complex, as has been observed for the TPA Response Element (TRE) and CRE sequences interacting with the Jun homodimer and the Fos/Jun heterodimer (24). Since the DNA appears to undergo conformational changes upon the binding of ATF-2, it is essential to determine the structure and the conformation of the DNA in solution.
In the present article the solution structure of the ATF binding site has been extensively investigated by NMR and molecular modelling. Notably, no high resolution structure of any leucine zipper protein DNA binding site has been reported in the literature. There are two reasons for examining the free DNA in detail. The first is to ascertain whether the conformation changes significantly on forming the complex, and which distortions occur: for a complete analysis, the conformational changes of both protein and DNA are needed to assess the nature of the DNA/protein contacts and to provide an estimate of the free energy of the binding process. The second is the possibility that the free DNA is intrinsically bent, as suggested by DNA cyclization analysis of the ATF binding site (10). In the latter case the leucine zipper transcription factors might require this structural specificity for DNA recognition and contact.
On the other hand, if the DNA binding site is bent neither in solution nor in the complex with leucine zipper proteins, then the leucine zipper transcription factors do not induce different DNA bends and orientations; instead the intrinsic DNA properties, together with the contributions of other factors involved, may direct the assembly of initiation complexes with distinct structural and functional properties.
MATERIALS AND METHODS
Materials
The HPLC-purified 16mer d(CATGTGACGTCACATG)2 was purchased from Oswel Research Products Ltd (Southampton, UK), and used without further purification. The DNA was dissolved in aqueous buffer (5 mM sodium phosphate, 100 mM KCl, 0.01 mM EDTA, pH 7) and annealed by slow cooling from 80°C. The DNA was dialysed against the same buffer for 2 days and lyophilised. The sample was redissolved in 0.6 ml of 2H2O or 0.6 ml 1H2O containing 10% 2H2O. The final concentration of DNA was 2.1 mM in strands.
The basic region-leucine zipper ATF-2 (55 amino acid peptide in the monomeric form) was synthesised by standard solid-phase methods and purified by HPLC on a reverse phase (C18) column, using a water/acetonitrile gradient (0.1% TFA). The peptide was then dialysed against buffer (200 mM KCl, 0.05 mM EDTA, pH 7) and lyophilised. The concentration of the peptide was determined from the UV absorbance at 279.6 nm, using the appropriate molar extinction coefficient for a single tryptophan residue per monomer.
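The concentration determination described above rests on the Beer-Lambert law. The following is a minimal Python sketch of that arithmetic only. The extinction coefficient (about 5600 M^-1 cm^-1 for one tryptophan near 280 nm) and the absorbance reading are illustrative assumptions; the value actually used in this work is not stated in the text.

```python
def concentration_from_absorbance(absorbance, epsilon_m1_cm1, path_cm=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), returned in mol/l."""
    if epsilon_m1_cm1 <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    return absorbance / (epsilon_m1_cm1 * path_cm)


# ASSUMPTION: a textbook-style coefficient for a single tryptophan near 280 nm.
EPSILON_TRP = 5600.0          # M^-1 cm^-1 (hypothetical, not from the paper)
A_MEASURED = 0.045            # hypothetical absorbance reading

monomer_molar = concentration_from_absorbance(A_MEASURED, EPSILON_TRP)
# The peptide binds as a homodimer, so the dimer concentration is half of this.
homodimer_molar = monomer_molar / 2.0
print(f"monomer {monomer_molar:.2e} M, homodimer {homodimer_molar:.2e} M")
```

Because one tryptophan is present per monomer, the absorbance reports the monomer concentration directly; halving it gives the homodimer concentration used in the titrations.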
Circular dichroism
The CD spectra were recorded on a JASCO Model J-600 spectropolarimeter at 25°C. All spectra were the average of five scans and were corrected for the baseline and normalised with respect to concentration (that is, for both peptide and DNA, values are reported as molar ellipticities per residue). The buffer used was 5 mM sodium phosphate, 100 mM KCl, pH 7, and the cell path lengths were 1 and 10 mm.
The 16mer (2 × 10⁻⁶ M duplex) and the ATF-2 peptide (4 × 10⁻⁶ M homodimer) were scanned over the wavelength ranges of 220-330 nm (near UV) and 190-260 nm (far UV). The titration of the DNA with the peptide and that of the peptide with DNA were followed by reading the spectra in the near and in the far UV, respectively.
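The normalisation to molar ellipticity per residue mentioned above can be sketched as follows. This uses the standard conversion from an observed ellipticity in millidegrees; the function name and the example reading are hypothetical, and the residue counts (110 for the peptide homodimer, 32 nucleotides for the duplex) are inferred from the sample descriptions rather than stated by the authors.

```python
def molar_ellipticity_per_residue(theta_mdeg, conc_molar, path_cm, n_residues):
    """Return [theta] in deg cm^2 dmol^-1 per residue.

    [theta] = theta_obs(mdeg) / (10 * c(mol/l) * l(cm) * n_residues)
    """
    if min(conc_molar, path_cm, n_residues) <= 0:
        raise ValueError("concentration, path length and residue count must be positive")
    return theta_mdeg / (10.0 * conc_molar * path_cm * n_residues)


# Hypothetical far-UV reading for the peptide homodimer (2 x 55 residues)
# at 4e-6 M in a 1 mm (0.1 cm) cell.
peptide_value = molar_ellipticity_per_residue(-10.0, 4e-6, 0.1, 110)
print(round(peptide_value))
```

Dividing by the number of residues is what allows spectra of the peptide and the DNA, recorded at different concentrations and path lengths, to be compared on a common scale.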
Electrophoresis
Electrophoresis experiments were carried out using 20% polyacrylamide mini gels (Atto Corp) in Tris-Borate-EDTA pH 8 at room temperature (23°C). Gels were pre-electrophoresed for 20 min before loading. Duplex DNA (3.2 µM) was dissolved in the run buffer supplemented with 25% glycerol, mixed with increasing concentrations of peptide and incubated for 30 min before loading on to the gel. Tracking dye (bromophenol blue) was not mixed with the samples, but run separately in the outermost lanes. Gels were run at 120-200 V until the dye approached the end of the gel. Samples were stained with ethidium bromide, and visualised by fluorescence of the intercalated dye.
NMR spectroscopy
1H NMR spectra were recorded at 11.75 and 14.1 T on Varian UnityPlus and Varian Unity spectrometers, respectively. 2,2-Dimethyl-2-silapentane-5-sulphonate (DSS) was used for internal chemical shift referencing.
NOESY spectra in 2H2O were recorded at 25, 40 and 50°C using the method of States et al. (25), with mixing times of 30, 50, 100, 250 and 300 ms, and acquisition times of 0.06 and 0.7 s in t1 and t2, respectively. Prior to Fourier transformation, the free induction decays were zero-filled to 8192 or 16 384 points in t2 and 2048 points in t1, and multiplied by an unshifted Gaussian weighting function in both dimensions.
Driven truncated NOE experiments (26) in 2H2O were recorded with eight irradiation times from 30 to 500 ms at 25°C and from 50 to 600 ms at 40°C. The effective rotational correlation time of the duplex was determined from the cross-relaxation constant of each C(H6-H5) vector, at 25 and 40°C, as previously described (27-29).
Spectra in 1H2O were recorded at 4 and 10°C using a Watergate gradient pulse sequence (30) to suppress the solvent signal. NOESY spectra were acquired at 14.1 T with mixing times of 50 and 250 ms at 4°C and 75 and 300 ms at 10°C. Typical acquisition times were 0.05 and 0.4 s in t1 and t2, respectively. The data matrices were apodized using an unshifted Gaussian function and zero-filled to obtain a matrix of 16 384 or 8192 by 2048 complex points.
31P NMR spectra were recorded at 9.4 T on a Bruker AM400 spectrometer, using methylene diphosphonate as an external chemical shift reference. A proton-detected heteronuclear shift correlation spectrum was recorded at 40°C using the method described by Sklenar et al. (31). Limits on 1H-31P coupling constants were estimated using simulations and the observed antiphase splittings in F2.
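How a correlation time can follow from a cytosine H6-H5 cross-relaxation rate is sketched below in Python. This is not the procedure of refs 27-29, which is not reproduced in the text; it assumes the textbook isolated two-spin expression for a rigid, isotropically tumbling molecule, a fixed H5-H6 distance of 2.46 Å, and hypothetical function names.

```python
import math

GAMMA_H = 2.6752e8   # proton gyromagnetic ratio, rad s^-1 T^-1
K_HH = 5.6965e10     # 0.1 * gamma_H^4 * hbar^2 * (mu0/4pi)^2, in A^6 s^-2
R_H5_H6 = 2.46       # ASSUMPTION: fixed cytosine H5-H6 distance, in A


def cross_relaxation_rate(tau_c, field_tesla, r=R_H5_H6):
    """sigma = (K / r^6) * [6 J(2w) - J(0)], with J(w) = tau_c / (1 + w^2 tau_c^2)."""
    omega = GAMMA_H * field_tesla
    j0 = tau_c
    j2 = tau_c / (1.0 + 4.0 * omega ** 2 * tau_c ** 2)
    return K_HH / r ** 6 * (6.0 * j2 - j0)


def tau_c_from_sigma(sigma_obs, field_tesla, r=R_H5_H6, tau_max=1e-6):
    """Invert sigma(tau_c) by bisection in the slow-tumbling regime (sigma < 0)."""
    if sigma_obs >= 0.0:
        raise ValueError("slow-tumbling inversion needs a negative cross-relaxation rate")
    omega = GAMMA_H * field_tesla
    lo = math.sqrt(5.0) / (2.0 * omega)   # sigma crosses zero at this tau_c
    hi = tau_max
    if cross_relaxation_rate(hi, field_tesla, r) > sigma_obs:
        raise ValueError("sigma_obs is outside the searched tau_c range")
    # sigma decreases monotonically with tau_c beyond the zero crossing.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if cross_relaxation_rate(mid, field_tesla, r) > sigma_obs:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Round trip at 11.75 T with a hypothetical correlation time of 4 ns.
sigma_4ns = cross_relaxation_rate(4e-9, 11.75)
print(sigma_4ns, tau_c_from_sigma(sigma_4ns, 11.75))
```

In practice the cross-relaxation rate would be taken from the initial slope of the NOE build-up over the irradiation times listed above; internal motion of the H6-H5 vector would make the value obtained an effective correlation time.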
Molecular modelling and structure refinement
Restraints. NMR cross-peak volumes were determined as previously described (32 ) and normalised to the C(H6-H5) cross peak volume. Volumes were independently confirmed using the integration routines within FELIX 2.30 (Biosym, San Diego).
Nucleotide conformations were analysed according to a two-state model, i.e. P(S), χ(S) and P(N), χ(N), using the program NUCFIT (33). The intranucleotide distances were then calculated for the major conformation obtained from the best fit solutions. The dihedral angles χ best defined by the data were restrained to ±10°, whereas those determined from a smaller number of intranucleotide NOEs were restrained to ±20°. Sugar conformations were restrained by H1'-H4' and H2'-H4' distances, determined from the NOE intensities or calculated from the values of P(S) obtained from the coupling constants as previously reported (34). In some calculations, the sugar conformation was also restrained by the backbone angle δ (C5'-C4'-C3'-O3'), which was calculated from P (pseudorotation phase angle) and Φm (maximum amplitude) (35). The backbone angle γ was determined from measurements of Σ4' in DQF-COSY spectra and the width at half-height of H4' resonances taken from NOESY cross sections (from H1' and H2''), and by comparing the NOESY intensities for the base to H5'/H5'' interactions. For all residues except the terminal ones, γ was restrained to 0-100° as determined from Σ4' (and see below).
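The consequence of a two-state (S/N) sugar equilibrium for the observables used above can be illustrated with a short Python sketch. It is not the NUCFIT algorithm; it only assumes simple population-weighted averaging of scalar couplings and r^-6 averaging of NOE distances, and every number in the example is hypothetical.

```python
def averaged_coupling(f_s, j_s, j_n):
    """Population-weighted scalar coupling for fast S/N exchange (Hz)."""
    return f_s * j_s + (1.0 - f_s) * j_n


def effective_distance(f_s, r_s, r_n):
    """NOE-effective distance <r^-6>^(-1/6) for a two-state equilibrium (A)."""
    if not 0.0 <= f_s <= 1.0:
        raise ValueError("f_s must be a fraction between 0 and 1")
    mean_inv6 = f_s * r_s ** -6 + (1.0 - f_s) * r_n ** -6
    return mean_inv6 ** (-1.0 / 6.0)


# Hypothetical case: 80% S pucker, with an intranucleotide proton pair
# 3.0 A apart in the S state and 2.0 A apart in the N state.
r_eff = effective_distance(0.8, 3.0, 2.0)
j_obs = averaged_coupling(0.8, 10.0, 1.0)
print(round(r_eff, 2), round(j_obs, 1))
```

Because of the r^-6 weighting, a minor conformer with a short distance pulls the apparent distance well below the linear average, which is one reason for fitting populations and geometries together rather than converting each NOE to a single distance.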
Heavy atoms involved in Watson-Crick base pairing were restrained according to the standard distances (2.8-3.25 Å) (36), on the basis of the NMR spectra of the exchangeable protons. In addition, distance constraints for AC2H-TN3H and CN4H-GN1H were also supplied, based on the observation of intense NOEs for these pairs of protons [2.2-3.2 Å for AC2H-TN3H, 2.5-4.0 Å for GN1H-CN4H (2) and 3.0-4.5 Å for GN1H-CN4H (1)].
Molecular dynamics (MD) calculations. All MD calculations were carried out on a Silicon Graphics Iris IndigoII workstation, using Discover 2.9 (Biosym, San Diego). Structure calculations were carried out using the Amber force field starting from different initial coordinates for the d(CATGTGACGTCACATG)2 duplex, including standard B-DNA, standard A-DNA and coordinates generated after a short (3 ps) free dynamics run starting from B-DNA. Solvent and counterions were not explicitly considered in the calculation, but their effects were simulated by using a distance-dependent dielectric constant with ε = rij for the molecular mechanics (MM) and MD steps. Additional calculations were performed without electrostatics to evaluate the relative importance of the experimental constraints and the force field.
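The effect of the distance-dependent dielectric can be made concrete with a small Python sketch. It evaluates a single pairwise Coulomb term and is not the Discover implementation; the conversion factor of 332.06 kcal mol^-1 Å e^-2 is the commonly used value, and the charges and separation are hypothetical.

```python
COULOMB_KCAL = 332.06  # kcal mol^-1 A e^-2, common molecular mechanics conversion factor


def coulomb_energy(q_i, q_j, r_ij, distance_dependent=True):
    """Pairwise Coulomb energy in kcal/mol.

    With a distance-dependent dielectric (epsilon = r_ij) the energy falls
    off as 1/r^2 instead of 1/r, which crudely mimics solvent screening.
    """
    if r_ij <= 0:
        raise ValueError("separation must be positive")
    epsilon = r_ij if distance_dependent else 1.0
    return COULOMB_KCAL * q_i * q_j / (epsilon * r_ij)


# Two hypothetical unit negative charges (e.g. neighbouring phosphates) 7 A apart.
screened = coulomb_energy(-1.0, -1.0, 7.0)
unscreened = coulomb_energy(-1.0, -1.0, 7.0, distance_dependent=False)
print(round(screened, 2), round(unscreened, 2))
```

At 7 Å the screened repulsion is smaller than the vacuum value by a factor of seven, which indicates why long-range phosphate repulsions are strongly damped in calculations that omit explicit solvent and counterions.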
The protocol employed for structure refinement consisted of the following steps: (i) 1000 steps of conjugate gradient minimization, (ii) MD with heating to 300 K during 30 ps (1 fs time step), (iii) 200 ps MD (1 fs time step) with coupling to a heat bath at 300 K, sampling at 20 fs intervals and (iv) 1000 steps of conjugate gradient minimization. After repetitions of steps (i)-(iv) the refinement concluded with an rms gradient of <0.05 kcal/mol/Å. Calculations starting at 1000 K converged to the same minimum. Soft constraints were applied throughout the refinement for all nucleotides, with force constants of 40 kcal/mol/Å² for distances and 40 kcal/mol/rad² for torsions. When the initial coordinates of standard A-DNA were employed, the structure refinement protocol used a further 1000 steps of conjugate gradient minimization before step (i), applying constraints but not the full force field (scaling the torsion, non-bond and coulombic interactions to 0.05).
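The way a distance restraint with these force constants enters the energy can be sketched in Python as follows. The flat-bottomed harmonic form, and the absence of a factor of one half, are assumptions made for illustration; the text gives the force constants and the distance bounds but not the functional form used in Discover.

```python
def restraint_energy(distance, lower, upper, k=40.0):
    """Flat-bottomed harmonic penalty in kcal/mol (ASSUMED functional form).

    Zero inside [lower, upper]; k * (violation)^2 outside, with k in kcal/mol/A^2.
    """
    if lower > upper:
        raise ValueError("lower bound must not exceed upper bound")
    if distance < lower:
        return k * (lower - distance) ** 2
    if distance > upper:
        return k * (distance - upper) ** 2
    return 0.0


# Watson-Crick heavy-atom bounds quoted in the text: 2.8-3.25 A.
for d in (2.6, 3.0, 3.45):
    print(d, round(restraint_energy(d, 2.8, 3.25), 2))
```

Under this assumed form a violation of 0.2 Å costs 1.6 kcal/mol, so individual restraints steer the structure without overwhelming the force field, consistent with their description as soft constraints.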
Structures were analysed using InsightII (Biosym, San Diego).
RESULTS
Interaction of the ATF-2 peptide with the target site
Circular dichroism. The CD spectra of nucleic acids provide a global overview of the nature of the base-stacking within the helix. The CD spectrum of d(CATGTGACGTCACATG)2 recorded over the wavelength range 220-330 nm is characteristic of B-DNA (Fig. 1A) (37). Figure 1A shows two points of the titration of the 16mer with the ATF-2 peptide, corresponding to 1:0.5 and 1:1 DNA (duplex)/peptide (homodimer) concentrations. The effect of the specific binding of the peptide to the central 8 bp (CRE) of the DNA is an overall change of the base-stacking pattern in the CRE (Fig. 1A), indicated by a small shift of the DNA spectrum towards lower wavelength and by an increase of the positive signal. Since the peptide has no significant CD signal in the near UV, as shown in Figure 1A, the changes in the DNA CD profile imply conformational changes of the CRE upon ATF-2 binding. This behaviour is qualitatively similar to that of TRE and CRE sequences bound to the Jun homodimer and Fos/Jun heterodimer (24).
NMR spectroscopy
Assignment of non-exchangeable and exchangeable protons. All the non-exchangeable protons of d(CATGTGACGTCACATG)2 except H5'/H5'' were assigned at 25, 40 and 50°C following the well-determined connectivities (40,41) in NOESY (nuclear Overhauser enhancement spectroscopy), TOCSY (total correlation spectroscopy) and DQF-COSY (double quantum filter correlation spectroscopy) spectra. Figure 2A shows a section of a typical NOESY spectrum showing the base-H1', -H5 and -H3' regions and the sequential connectivities between base protons and H1' and H3'. H2' and H2'' were distinguished both by the relative intensities of the H1'-H2' and H1'-H2'' cross peaks in short mixing time NOESY spectra (H2'' giving the more intense peak), and from the analysis of the fine structure of these peaks in DQF-COSY, as described in detail elsewhere (34).
DISCUSSION
The interaction of d(CATGTGACGTCACATG)2 with the leucine zipper-basic region ATF-2 homodimer has been investigated in solution by CD analysis, which has shown that the recognition process involves conformational changes in both protein and DNA. As reported for other leucine zipper proteins, the basic region of the peptide folds into an α-helix upon DNA binding. The near UV CD data on DNA show a conformational change upon protein binding, revealed by an overall change of the base-stacking pattern in the CRE sequence. The identification and the analysis of the DNA conformational changes require the structural analysis of the DNA recognition site free from protein, so that a proper comparison with the DNA conformation in the complex can be made.
The final structures shown in Figure 2 B clearly demonstrate that the ATF binding site is within the B family of conformations, but differs significantly from the canonical B structure (r.m.s.d. = 3.6 Å), and also from B-DNA energy minimised without experimental constraints (r.m.s.d. = 2.9 Å). This indicates that the experimental restraints determine the structure and implies that, in order to have a detailed and accurate picture of the recognition process, as well as to ascertain the effect of the binding of leucine zipper proteins, it would not be correct to assume that the conformation of the DNA recognition site is simply standard B (canonical structure and/or energy minimized without experimental constraints).
The r.m.s.d. values of the final structures, especially when considering just the core of the molecule, indicate a reasonably well determined structure. Similar results have been obtained for smaller fragments of DNA (51 ) and it turns out that well determined structures (except for the backbone) can also be obtained for long duplexes, where a biologically interesting sequence can be studied without the influence of end-effects.
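The two ensemble statistics quoted for the final structures (pairwise r.m.s.d. and r.m.s.d. to the average) can be computed as in the following Python sketch. It assumes that the models have already been superimposed, so no fitting step is included, and the coordinates in the example are toy values rather than data from this work.

```python
import itertools
import math
import statistics


def rmsd(a, b):
    """r.m.s.d. between two equal-length lists of (x, y, z) coordinates."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))


def mean_structure(models):
    """Coordinate-wise average of pre-superimposed models."""
    n = len(models)
    return [tuple(sum(m[i][k] for m in models) / n for k in range(3))
            for i in range(len(models[0]))]


def ensemble_stats(models):
    """Return ((mean, sd) of pairwise r.m.s.d., (mean, sd) of r.m.s.d. to average)."""
    pairwise = [rmsd(a, b) for a, b in itertools.combinations(models, 2)]
    average = mean_structure(models)
    to_average = [rmsd(m, average) for m in models]
    return ((statistics.mean(pairwise), statistics.stdev(pairwise)),
            (statistics.mean(to_average), statistics.stdev(to_average)))


# Toy ensemble: three one-atom "models" displaced along x.
toy = [[(0.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)], [(2.0, 0.0, 0.0)]]
print(ensemble_stats(toy))
```

For a large ensemble with a roughly random spread the pairwise value is expected to be about √2 times the value to the average, which is close to the ratio between the two figures reported for this structure.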
The number of independent constraints was relatively small. However, by appropriate model fitting, at least substantial parts of the molecule can be defined to high precision (i.e. nucleotide conformations) whereas, because there are relatively fewer internucleotide and cross-strand NOEs, and more parameters to determine per dinucleotide, it is hardly surprising that the backbone parameters are less precisely defined with the data that can realistically be obtained by homonuclear methods.
In these structures, we have accounted for conformational averaging in the nucleotides within the context of a simple two-state conformational equilibrium, which is supported by extensive NOE and scalar coupling data. This model is sufficient to account for all of these data, but may be a simplification (52 ). Because of the much sparser experimental data available for the internucleotide interactions, and the greater number of degrees of freedom, we have chosen to use fairly loose internucleotide constraints. This results in a relatively poor determination of the helical parameters based on experimental data alone; the nature of the forcefield parameterisation becomes relatively more important, and therefore subject to greater uncertainty. The influence of conformational averaging in the sugars on the internucleotide NOEs, independent of any averaging in the backbone conformation, has not been assessed, though a possible method for doing this has been described by James' group (49 ,52 ).
The observation that all of the sugars are predominantly `S' is different from other versions of the ATF recognition site from the E2A promoter of adenovirus (54 ,55 ). It indicates that a characteristic backbone surface different from that of a standard B-DNA is not strictly required in the protein-DNA recognition process.
Helical twists, rise, propeller twists and base inclinations were calculated for the different structures using Curves version 5.1 (56) (Table 3). On average, the helical twists were 34.5 ± 3.0° (33.4 ± 2.3° for the 12 core bp) and the rise 3.4 ± 0.4 Å, with little difference between structures starting from B or A conformations; these values are typical of B-DNA. The parameters for the terminal base steps were quite variable, and reflect the lack of constraints at the ends of the duplex. Propeller twists were small and showed no particular pattern along the sequence.
Table 3. Helical parameters for d(CATGTGACGTCACATG)2 calculated as described in the text using Curves version 5.1 (56)
Step    Rise (Å)    Twist (°)    Incl (°)    Prop. twist (°)
T3      3.11        31.4         -6.9        15.5
G4      3.29        36.9         0.65        -11.4
T5      3.60        32.4         4.7         6.9
G6      3.73        31.8         10.2        -6.9
A7      3.89        36.1         13.1        -2.6
C8      3.87        30.2         17.1        0.2
G9      3.89        35.9         17.4        -1.1
T10     3.73        32.0         14.0        -2.4
C11     3.59        32.1         10.9        -7.3
A12     3.34        36.2         5.1         8.2
C13     3.17        33.3         0           -12.9
A14                              -8.2        15.1
mean    3.57        33.4         6.5         0.12
sd      0.027       2.3          6.5         9
Helical parameters are averages over 10 structures and were calculated for the core 12 bp.
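As a consistency check on Table 3, the column means can be recomputed from the per-step values with a few lines of Python. The values are transcribed from the table; reading them as rise, twist and inclination is an interpretation of the extracted layout that was checked against the printed mean row, and the helical repeat derived at the end is an illustrative addition rather than a figure quoted by the authors.

```python
import statistics

# Per-step values for the core of the duplex (T3 to C13 steps; A14 has no step).
rise = [3.11, 3.29, 3.60, 3.73, 3.89, 3.87, 3.89, 3.73, 3.59, 3.34, 3.17]      # A
twist = [31.4, 36.9, 32.4, 31.8, 36.1, 30.2, 35.9, 32.0, 32.1, 36.2, 33.3]     # degrees
# Inclination is a base-pair parameter, so it has one more entry (T3 to A14).
incl = [-6.9, 0.65, 4.7, 10.2, 13.1, 17.1, 17.4, 14.0, 10.9, 5.1, 0.0, -8.2]   # degrees

mean_rise = statistics.mean(rise)
mean_twist = statistics.mean(twist)
mean_incl = statistics.mean(incl)

# Helical repeat implied by the mean twist (base pairs per full turn).
bp_per_turn = 360.0 / mean_twist

print(f"rise {mean_rise:.2f} A, twist {mean_twist:.1f} deg, "
      f"inclination {mean_incl:.1f} deg, {bp_per_turn:.1f} bp/turn")
```

The recomputed means agree with the printed mean row to within rounding, and the inclination rises smoothly towards the centre of the duplex, in line with the large base inclinations noted for the core of the molecule.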
Helical parameters, shown in Table 3, indicate that the ATF regulatory site, when free from protein, is not significantly kinked in solution, and suggest that leucine zipper proteins do not require this structural specificity for DNA recognition and binding. With the structure of the free DNA in solution established, it will be possible to determine in detail the DNA conformational changes induced upon the binding of leucine zipper proteins. The results with CD showed that in addition to the expected increase in helical content of the peptide, there is also a change in the near UV CD of the DNA on forming a specific complex (see above). As observed for the Jun homodimer and Fos/Jun heterodimer (24), the difference in the DNA CD spectra induced upon protein binding is unlikely to be linked to different bending of the DNA binding site, since CD seems to be fairly insensitive to DNA bending. Thermodynamic and structural studies on ATF-2 and the 16mer/ATF-2 complex are currently in progress (M.R.Conte, G.Bloomberg and A.N.Lane, unpublished results), in order to investigate in detail the nature of the recognition process and to define the DNA conformation in the complex. DNA bends are often manifest in the 31P NMR spectrum (57,58) or in that of the exchangeable protons (6). Preliminary 31P and 1H NMR results on the 16mer/ATF-2 complex (M.R.Conte, G.Bloomberg and A.N.Lane, data not shown, to be published elsewhere) suggest that the DNA is not greatly distorted in the complex with ATF-2. These data would suggest that the leucine zipper proteins bind to a DNA binding site that is not intrinsically bent and is in a B-like conformation, and that they induce DNA conformational changes which appear not to involve DNA bending.
In conclusion, preliminary speculations would indicate that the leucine zipper proteins are not responsible for bending the DNA in the assembly of the initiation complex.
The proton NMR assignment is available from M.R.C. or as Supplementary Material via NAR Online.
ACKNOWLEDGEMENTS
This work was supported by the Medical Research Council of the UK, and a Wellcome Travelling Research Fellowship to M.R.C. We thank Dr S.Martin for assistance with CD studies and Dr J.Gyi for comments on the manuscript. NMR spectra were recorded at the MRC Biomedical NMR Facility.
41 Wijmenga, S.S., Mooren, M.M.W. and Hilbers, C.W. (1993) in Roberts, G.C.K. (ed.), NMR of Macromolecules. A Practical Approach. IRL Press, Oxford, Ch. 8, pp. 217-288.
42 Lefèvre, J.-F., Lane, A.N. and Jardetzky, O. (1987) Biochemistry 26, 5076-5090.
43 Yip, P. and Case, D.A. (1989) J. Magn. Reson. 83, 643-648.
44 Robinson, H. and Wang, A.H.-J. (1992) Biochemistry 31, 3524-3533.
45 Wijmenga, S.S., Heus, H.A., Werten, B., van der Marel, G.A., van Boom, J.H. and Hilbers, C.W. (1994) J. Magn. Reson. 103B, 134-141.
46 Smith, S.A., Levante, T.O., Meier, B.H. and Ernst, R.R. (1994) J. Magn. Reson. 106A, 75-105.
47 Kim, S.-G., Lin, L.J. and Reid, B.R. (1992) Biochemistry 31, 3564-3574.
56 Lavery, R. and Sklenar, H. (1988) J. Biomol. Struct. Dyn. 6, 63-91.
57 Beckmann, P., Martin, S.R. and Lane, A.N. (1993) Eur. Biophys. J. 21, 417-424.
58 Haqq, C.M., King, C.-Y., Ukiyama, E., Falsafi, S., Haqq, T.N., Donahoe, P.K. and Weiss, M.A. (1994) Science 266, 1494-1500.
* To whom correspondence should be addressed at present address: Department of Biochemistry, Imperial College of Science, Technology and Medicine, Exhibition Road, South Kensington, London SW7 2AY, UK. Tel: +44 171 594 5315; Fax: +44 171 225 0960; Email: s.conte@ic.ac.uk