Molecular modelling of (A4T4NN)n and (T4A4NN)n: sequence elements responsible for curvature
Sanjay R. Sanghani, Krystyna Zakrzewska, Stephen C. Harvey1 and Richard Lavery*
Laboratoire de Biochimie Théorique, CNRS URA 77, Institut de Biologie Physico-Chimique, 13, Rue Pierre et Marie Curie, Paris 75005, France and 1Department of Biochemistry and Molecular Genetics, University of Alabama at Birmingham, Birmingham, AL 35294, USA
Received February 6, 1996; Revised and Accepted March 12, 1996
ABSTRACT
The molecular modelling program JUMNA has been used to investigate the origins
of the strikingly different curvature of the two sequences, (A4T4NN)n and (T4A4NN)n. Gel electrophoresis and cyclisation studies have shown that only the former of
these two sequences is significantly curved. By developing novel superhelical
symmetry constraints we were able to study the energetic and structural aspects
of polymeric DNA having a controlled curvature. The results obtained (which do not take into account specific hydration effects)
correlate well with the experimental data and offer a molecular level explanation of curvature. Although curvature is found to be initiated by specific
dinucleotide junctions, deformations spread to surrounding dinucleotide steps
and, moreover, sequence effects beyond the dinucleotide level are observed.
INTRODUCTION
It is widely acknowledged that the intrinsic curvature and the induced bending of DNA play biologically significant roles (1,2). Many DNA binding proteins are known to induce bending or can recognise curved target sequences and a number of drugs also modify curvature. The sequence-dependent curvature of DNA has received much attention over recent years, following the initial observations that certain DNA fragments had lower mobility on polyacrylamide gels than their lengths would suggest (3). Curvature due to helically phased runs of adenines (A tracts) played a central role in these early studies (4,5); however, it is now recognised that other sequence elements can also contribute to DNA curvature (6).
The unusual structure of A tracts, generally referred to as B'-DNA, has been much discussed (7,8). Early explanations of sequence-induced curvature viewed curvature either as a result of kinks created at the interfaces between the B'- and B-DNA conformations, the junction model (9), or as the net effect of juxtaposing dinucleotide steps with characteristic roll and tilt angles, the wedge model (10). The first version of the wedge model only considered curvature induced by roll angles, in agreement with crystallographic and modelling studies which suggested that roll was easier to induce than tilt (11-13). In order to test this hypothesis Hagerman (14) studied the sequences (A4T4NN)n and (T4A4NN)n, which should behave identically if curvature was only due to roll at ApA (≡ TpT) steps. In fact, only the sequence (A4T4CG)n displayed distinctly abnormal electrophoretic behaviour, while (T4A4CG)n was nearly normal. These effects were essentially unchanged when the CG 'plug' sequence was replaced by a GC step. These results led to a refinement of the wedge model which included tilt at ApA steps.
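Hagerman's reasoning can be illustrated with a short sketch, which is not part of the original study. The Python code below lists the dinucleotide steps of each repeating decamer, treating a step and its strand-exchanged equivalent as one class (so that TpT is counted as ApA). The helper names `canonical_step`, `polymer_steps` and `step_positions` are hypothetical, and the CG plug is used as an example of NN.

```python
# Illustrative sketch only: compare where ApA-class steps fall in the two repeats.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def canonical_step(step):
    """Return one label for a step and its strand-exchanged equivalent (TT -> AA)."""
    other = COMPLEMENT[step[1]] + COMPLEMENT[step[0]]
    return min(step, other)


def polymer_steps(unit):
    """Canonical dinucleotide steps of an infinite polymer of `unit` (wraps around)."""
    n = len(unit)
    return [canonical_step(unit[i] + unit[(i + 1) % n]) for i in range(n)]


def step_positions(unit, step):
    """Indices within the repeat at which a given canonical step occurs."""
    return [i for i, s in enumerate(polymer_steps(unit)) if s == step]


a4t4_positions = step_positions("AAAATTTTCG", "AA")
t4a4_positions = step_positions("TTTTAAAACG", "AA")
print(polymer_steps("AAAATTTTCG"))
print(polymer_steps("TTTTAAAACG"))
print(a4t4_positions == t4a4_positions)
```

Under these assumptions both polymers carry ApA-class steps at the same six positions of the repeat, so a model with roll only at ApA steps cannot distinguish them. The two repeats differ at the central ApT or TpA junction and in the steps flanking the plug.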
Other data, however, suggest that A tracts do not have uniform conformations throughout their length and that their properties can depend on neighbouring sequences. Thus A tract properties only appear for contiguous runs of at least four adenines (15) and, while adjacent AT steps do not affect such structures, TA steps disrupt them (16). Hydroxyl radical cleavage patterns determined by Burkhoff and Tullius showed a sinusoidal pattern for curved (A4T4CG)n, suggesting a decreasing width of the minor groove in the 5'→3' direction of the A tracts. In contrast, little variation in reactivity was seen with the straight (T4A4CG)n sequence (17), although variations, and curvature, reappeared in (T7A7N7)n (18). Imino proton exchange studies carried out by Leroy et al. (19) showed that proton exchange within the central A tract of the bent sequence was considerably slower than in the straight sequence. Again, the TA step appeared to disrupt formation of the unusual A tract structure and the proton exchange times returned almost immediately to normal B-DNA values following this step. Finally, Park and Breslauer (20), using a combination of spectroscopic and calorimetric techniques, showed that the (T4A4CG)n sequence did not undergo the pre-melting transition observed for the bent (A4T4CG)n sequence and for other A tracts.
Active discussion of the origins of sequence-dependent curvature continues today and has been fuelled by crystallographic results which have uniformly shown straight A tracts and instead locate bending as roll wedges within the intervening sequences (21). This viewpoint has been termed the non-A tract model. It must, however, be recalled that curvature within crystals can be influenced by lattice packing effects (11), even if crystals can indicate bendable junctions (22), and reduced by the solvents used for crystallisation (23).
In order to investigate curvature in atomic detail we have recently extended the JUMNA program (JUnction Minimisation of Nucleic Acids) to include superhelical symmetry (24). This has enabled us to deform infinitely long DNA polymers with regular repeating sequences to any desired radius of curvature. In addition, we avoid problems related to oligomeric end effects and gain a direct means of testing the stability of the minima obtained. Since the energies involved in DNA curvature are very small, this development was an essential step for carrying out reliable simulations. We have applied this approach to the Hagerman sequences and are able both to correlate our results with available experimental data and to propose a coherent atomic level explanation of these observations.
MATERIALS AND METHODS
The calculations presented here were performed using the JUMNA program (25,26), which models nucleic acids using a combination of helicoidal and internal variables. Single bond torsions and valence angles are used to model the flexibility of each nucleotide, while the nucleotides are positioned with respect to a reference axis system using helical rotations and translations. All bond lengths are kept fixed and the junctions between successive (3'-monophosphate) nucleotides and the closure of the sugar rings are ensured by quadratic constraints on the C4'-O4' and O5'-C5' distances. This approach requires roughly 10 times fewer variables than Cartesian coordinate molecular mechanics.
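As an illustration of what a quadratic distance constraint of this kind looks like, the following Python sketch evaluates a harmonic penalty on the separation of two atoms. It is a generic form and not JUMNA's actual code: the function name `closure_penalty`, the force constant and the target distance are all assumed values chosen only for the example.

```python
import math


def closure_penalty(atom_a, atom_b, target, force_constant):
    """Quadratic penalty k * (d - d0)^2 on the distance between two atoms.

    atom_a, atom_b: Cartesian coordinates (x, y, z) in angstroms.
    target: desired distance d0 (assumed value, in angstroms).
    force_constant: penalty weight k (assumed value, energy per square angstrom).
    """
    d = math.dist(atom_a, atom_b)
    return force_constant * (d - target) ** 2


# Hypothetical example: a ring-closure distance that is 0.1 angstrom too long.
ASSUMED_TARGET = 1.45   # assumption, not a JUMNA parameter
ASSUMED_K = 100.0       # assumption, not a JUMNA parameter
penalty_open = closure_penalty((0.0, 0.0, 0.0), (1.55, 0.0, 0.0), ASSUMED_TARGET, ASSUMED_K)
penalty_closed = closure_penalty((0.0, 0.0, 0.0), (1.45, 0.0, 0.0), ASSUMED_TARGET, ASSUMED_K)
print(penalty_open, penalty_closed)
```

The penalty vanishes when the two atoms sit at the target distance and grows quadratically with the deviation, which is why such terms keep the chain closed during minimisation without being counted as independent variables.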
In order to avoid the end effects associated with studying oligomers and to further simplify the conformational space to be searched, helical symmetry can be imposed within JUMNA by making symmetry-related sets of variables equivalent to one another. This option was very useful in the present studies of DNA bending, allowing us to investigate the properties of polymeric DNA with long repeating sequences and thus bringing us closer to the systems studied experimentally. Inducing bending while maintaining symmetry, however, requires a change from helical to superhelical symmetry (24). This change implies several extensions to normal helical coordinates. Figure 1 shows the superhelical coordinate system. The DNA molecule is constrained to follow a superhelical pathway of defined radius and pitch. The direction of the superhelical axis is fixed, but different directions of curvature can be investigated by rotating the DNA around its own axis. This variable, which disappears in the case of linear DNA, is termed the rotational register of the molecule. It is particularly useful for detecting the anisotropy of intrinsically curved sequences. Complete rotation of the DNA around its axis for various radii of curvature, which causes large cyclic changes in the conformation of each constituent nucleotide, also serves to verify that the molecule is in a stable minimum energy conformation.
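The geometric consequence of fixing a radius and a pitch can be sketched with the standard formula for the curvature of a circular helix, κ = r / (r² + c²) with c = pitch / 2π. The Python functions below are an illustrative sketch under that textbook assumption and are not taken from JUMNA; the names `helix_curvature` and `bend_per_repeat_deg`, and the numerical values in the usage lines, are hypothetical.

```python
import math


def helix_curvature(radius, pitch):
    """Curvature (1/length) of a circular helix with the given radius and pitch."""
    c = pitch / (2.0 * math.pi)  # axial advance per radian of winding
    return radius / (radius ** 2 + c ** 2)


def bend_per_repeat_deg(radius, pitch, contour_length):
    """Bend angle (degrees) accumulated over `contour_length` of the helical path."""
    return math.degrees(helix_curvature(radius, pitch) * contour_length)


# Hypothetical values, chosen only to exercise the functions (lengths in angstroms).
planar = bend_per_repeat_deg(radius=100.0, pitch=0.0, contour_length=34.0)
pitched = bend_per_repeat_deg(radius=100.0, pitch=200.0, contour_length=34.0)
print(planar, pitched)
```

With zero pitch the path is a planar circle and the curvature reduces to 1/r. A non-zero pitch lowers the curvature for the same radius, so the bend accumulated per repeat depends on both quantities.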
RESULTS
The conformational energy of DNA with the repeating sequences (A4T4CG)n and (T4A4CG)n (per decamer unit) was examined as a function of radius of curvature. It can be seen from Figure 2 that the former sequence exhibits strong curvature, with an energy minimum at a radius of 62 Å. The latter sequence is virtually straight, showing only a very shallow minimum at a radius of ~240 Å. The corresponding net bends per decamer are 26° and 8° for A4T4CG and T4A4CG, respectively, which is in good agreement with cyclisation data (34).
DISCUSSION
The modelling we have carried out leads to data in good accord with experimental results. The strong curvature of the (A4T4CG)n sequence and the straightness of the (T4A4CG)n sequence, as seen by gel electrophoresis (14) and cyclisation studies (34), are reproduced. Moreover, the minor groove width variations of the optimal conformations correlate well with trends in the hydroxyl radical cleavage data (17) and even very subtle changes due to thymine→uracil mutations have been correctly modelled.
It should be stressed that the introduction of superhelical symmetry was essential in this respect, as it allowed curvature to be studied in a controlled way for polymeric DNA. Since the energies involved in bending DNA are very small, attempting to constrain curvature within nucleic acid oligomers is very difficult and, in our experience, sensitive both to the exact constraints employed and to end effects. By using regular DNA polymers both these problems can be avoided. The possibility of directly imposing superhelical geometry, previously employed only for simplified large scale DNA models (40), is a consequence of the choice of variables used by JUMNA (a combination of helical parameters for each nucleotide and dihedral and valence angles within each nucleotide). This choice also strongly reduces the number of variables necessary to represent DNA flexibility and thus considerably facilitates energy minimisation and adiabatic mapping studies. It should also be recalled that correct curvature results were obtained without imposing any constraints on the A tract structures, in contrast to earlier molecular modelling studies (41,42).
We can now consider how curvature can be interpreted in terms of the detailed structures we have obtained. The first remark is that the overall curvature of a sequence can be viewed largely as a sum of base pair rolls (although secondary effects due to tilts should not be ruled out, as shown by the studies of sequences containing uracil, and it should also be recalled that strong rolls appear to be coupled to twist variations). The dominance of roll is in line with the results of crystallography and other modelling studies (11-13). Having said this, can we assume that overall curvature is a sum of dinucleotide step effects? At first sight the answer to this question might seem to be yes, since our modelling suggests that A4T4CG is curved mainly because of a negative roll at the ApT step in helical phase with a positive roll at CpG, while the same CpG roll in T4A4CG is counterbalanced by a positive roll at TpA, leading to no overall curvature. These results are in line with the model proposed by Zhurkin for positive roll at YpR steps and negative roll at RpY steps (12) and with Monte Carlo calculations by the same author (42), which, in addition, show positive roll at CpA steps, as in our study of the A4T4GC sequence.
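The phase argument above can be made concrete with a first-order "phasor" sum, in which each step's roll contributes a bend whose direction rotates with the cumulative helical twist. The Python sketch below is a deliberately simplified illustration and not the JUMNA calculation: it assumes a uniform 36° twist per step, ignores tilt and roll-twist coupling, and uses made-up roll magnitudes (±6°) at the ApT, TpA and CpG steps only. The names `net_roll_bend` and `ILLUSTRATIVE_ROLLS` are hypothetical.

```python
import cmath
import math

# Assumed roll angles in degrees; illustrative only, not values from this study.
ILLUSTRATIVE_ROLLS = {"AT": -6.0, "TA": 6.0, "CG": 6.0}


def repeat_steps(unit):
    """Dinucleotide steps of an infinite polymer built from `unit` (wraps around)."""
    n = len(unit)
    return [unit[i] + unit[(i + 1) % n] for i in range(n)]


def net_roll_bend(unit, roll_by_step, twist_deg=36.0):
    """Magnitude (degrees) of the vector sum of rolls over one repeat.

    Each roll is placed in the plane perpendicular to the helix axis at an
    angle equal to the cumulative twist, assumed uniform at `twist_deg` per step.
    """
    total = 0j
    for k, step in enumerate(repeat_steps(unit)):
        roll = roll_by_step.get(step, 0.0)  # steps not listed are taken as unrolled
        total += roll * cmath.exp(1j * math.radians(twist_deg * k))
    return abs(total)


curved = net_roll_bend("AAAATTTTCG", ILLUSTRATIVE_ROLLS)
straight = net_roll_bend("TTTTAAAACG", ILLUSTRATIVE_ROLLS)
print(curved, straight)
```

With these assumed values the ApT and CpG contributions, which lie five steps (half a turn) apart and have opposite signs, add constructively, whereas the TpA and CpG contributions in the T4A4CG repeat have the same sign and cancel.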
Our results, nevertheless, suggest that a simple dinucleotide step model is insufficient for two reasons. First, once a DNA fragment is strongly curved, all base pair steps appear to participate in its curvature. Even if one or more specific steps fundamentally cause the curvature, neighbouring steps also distort to distribute the deformation of the double helix more uniformly and conserve good base stacking along the sequence. It is this effect, rather than special properties of the A tract, which is behind the sigmoidal variation of minor groove width along curved DNA and the associated variations in hydroxyl radical cleavage (17). This effect is clearly visible within regular sequence DNA which is forced to curve, as demonstrated in our previous modelling of curvature (24) and by hydroxyl radical cleavage data on an oligo(dG) tract within a supercoiled plasmid (43). As a consequence of this distribution of deformation the rolls associated with ApA (≡ TpT) steps within A4T4CG vary from -5° to +4°. Consequently, it seems unreliable to describe such sequences with a single roll value. It should be noted that if the A4T4CG sequence is forced to become straight, the sinusoidal roll variation disappears, leaving strong values only at the ApT and CpG steps (Fig. 7).
Figure 7. Roll (°) variations for curved (A4T4CG)n (dotted line) and for the same sequence forced to become straight (solid line).
Secondly, base pair steps may adopt more than one conformation as a function of their sequence environment. This effect, observed for a number of base pair steps in crystallographic results, is exemplified by the GpC step in our studies, which has a large positive roll within A4T4GC and zero roll (coupled to a 4° increase in twist) within T4A4GC. Based both on the crystallographic results (44) and modelling of sequence effects (31,45) it seems reasonable to suppose that many dinucleotide steps can show such bimodal or even multimodal behaviour as a function of their sequence context.
CONCLUSIONS
We have used molecular modelling to investigate DNA curvature, using as a test case (A4T4NN)n, (T4A4NN)n and related sequences. The ability to carry out controlled deformations on these repeating sequences is directly linked to the introduction of superhelical symmetry constraints into the JUMNA program and the resulting elimination of oligomeric end effects. The results obtained are in good agreement with the known experimental behaviour of these sequences, reproducing the strong curvature of (A4T4NN)n measured by cyclisation experiments, the trends in minor groove widths inferred from hydroxyl radical cleavage studies and variations in gel retardation linked to the removal of thymine methyl groups. It should also be added that despite much discussion of the role of water in DNA curvature (20,46-48), these results were obtained without taking into account any solvent effects other than the dielectric screening of electrostatic interactions.
The resulting molecular conformations have enabled us to formulate a more
detailed view of the mechanism underlying DNA curvature. While the
experimentally observed behaviour of the sequences we have studied can be
interpreted as being largely in agreement with the junction model, the wedge
model or the non-A tract model, the molecular conformations we have obtained are not fully
in accord with any single viewpoint. Consequently, it appears that further
progress in predicting sequence-dependent curvature will require taking into account both the distribution
of deformation within curved DNAs and the context-dependent changes in the conformation of dinucleotide steps which have
been described in the present theoretical studies.
ACKNOWLEDGEMENTS
The authors wish to thank the Association for International Cancer Research (St
Andrews, UK) for their generous funding of this research.
REFERENCES
1 Travers,A. (1994) DNA-Protein Interactions. Chapman and Hall, London, UK.
13 Olson,W.K., Srinivasan,A.R., Maroun,R.C., Torres,R. and Clark,W. (1989) In Wells,R.D. and Harvey,S.C. (eds), Unusual DNA Structures. Springer Verlag, New York, NY, pp. 207-224.
24 Sanghani,S.R., Zakrzewska,K. and Lavery,R. (1996) Proceedings of the 9th Conversations on Biomolecular Structure and Dynamics, in press.
25 Lavery,R. (1988) In Olson,W.K., Sarma,R.H., Sarma,M.H. and Sundaralingam,M. (eds), Structure and Expression, Vol. 3, DNA Bending and Curvature. Adenine Press, New York, pp. 191-211.