Nucleic Acids Research Advance Access published online on July 24, 2008
Nucleic Acids Research, doi:10.1093/nar/gkn473
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Structural Biology |
Molecular dynamics of a
B DNA element: base flipping via cross-strand intercalative stacking in a microsecond-scale simulation
1Department of Chemistry and Biochemistry and Center for Theoretical Biological Physics, University of California, San Diego, La Jolla, CA 92093-0365 and 2Howard Hughes Medical Institute and Department of Pharmacology, University of California, San Diego; La Jolla, CA 92093-0636, USA
*To whom correspondence should be addressed: Tel: +1 434 924 7824; Fax: +1 434 924 3710; Email: cmura{at}virginia.edu
Received May 13, 2008. Revised July 3, 2008. Accepted July 7, 2008.
| ABSTRACT |
|---|
|
|
|---|
The sequence-dependent structural variability and conformational dynamics of DNA play pivotal roles in many biological milieus, such as in the site-specific binding of transcription factors to target regulatory elements. To better understand DNA structure, function, and dynamics in general, and protein···DNA recognition in the
B family of genetic regulatory elements in particular, we performed molecular dynamics simulations of a 20-bp DNA encompassing a cognate
B site recognized by the proto-oncogenic c-Rel subfamily of NF-
B transcription factors. Simulations of the
B DNA in explicit water were extended to microsecond duration, providing a broad, atomically detailed glimpse into the structural and dynamical behavior of double helical DNA over many timescales. Of particular note, novel (and structurally plausible) conformations of DNA developed only at the long times sampled in this simulation—including a peculiar state arising at
0.7 µs and characterized by cross-strand intercalative stacking of nucleotides within a longitudinally sheared base pair, followed (at
1 µs) by spontaneous base flipping of a neighboring thymine within the A-rich duplex. Results and predictions from the microsecond-scale simulation include implications for a dynamical NF-
B recognition motif, and are amenable to testing and further exploration via specific experimental approaches that are suggested herein. | INTRODUCTION |
|---|
|
|
|---|
DNA is often viewed as a relatively rigid biological macromolecule (1–3), with RNA and proteins thought of as exhibiting broader ranges of both intrinsic three-dimensional structural variability as well as dynamical flexibility. This perspective of a locally rigid, globally flexible biopolymer is consistent with the rather passive biological role of DNA as the repository of genetic information—the genome is read-out by the process of transcription. Links between structure and potential biological functions (both normal and aberrant) have been explored for conformations that deviate from the standard B-form double helix—including such varieties as multi-stranded triplexes, quadruplex structures found in telomeric G-rich tracts, cruciforms adopted by inverted repeats, hairpins and slipped structures and so on (4). However, beyond these alternative secondary structures, it is also becoming increasingly apparent that the structure and dynamics of the canonical Watson–Crick DNA double helix on a very local (base pair) level play pivotal roles in specific biological functions, such as the site-specific binding of transcription factors to target DNA elements. This idea of a functional role for sequence-specific DNA fine structure and dynamics is embodied in the concept of indirect readout (5,6), wherein features of protein···DNA recognition are dictated by subtle conformational and dynamical properties of DNA beyond the stereochemical code provided by the specific linear array of chemical functionalities that line the major and minor grooves for a given nucleotide sequence.
Despite the vast literature dedicated to DNA structural biology since the first atomic-resolution crystal structures of both left- (7) and right-handed (8) double helices, many aspects of DNA structure and dynamics remain unclear—including the intrinsic coupling between structure and conformational dynamics that is the basis of indirect readout. For instance, controversy surrounds the relative significance of extrinsic/environmental factors (such as hydration and counterion-binding) versus intrinsic factorsm (such as local base pair interactions) in mediating sequence-specific DNA fine structure, as gauged by helical axis bending, groove widths and related properties (9–12). Quantitative frameworks have been developed for the description of DNA structure in terms of the local geometry of bases, base pairs (bp), bp steps and higher-order structural units [e.g. (13)], but progress in elucidating those structural and dynamical phenomena thought to occur via transient, short-lived intermediates [such as occurs in base flipping (14,15)] remains hindered by the difficulty of using existing experimental methods to extract dynamical information at both atomic resolution and over the potentially relevant timescales (ns
ms). Emblematic of this difficulty, crystallographic models generally represent spatially and temporally averaged structures of individual molecules, the averages being taken over more than 1012 unit cells (a conservative estimate, for micrometer-sized crystals of typical cell dimensions) and time periods greater than hundreds of milliseconds (a conservative estimate, for exposure with high-brilliance synchrotron X-rays). Thus, detailed knowledge remains somewhat obscure of the factors modulating the mean structural and conformational properties of DNA, the dynamical processes mediating inter-conversions between these average conformational states, and the interactions of these (sub-)states with transcription factors, histone proteins, intercalators, groove-binding drugs and other relevant species. Computational approaches such as molecular dynamics (MD) simulation (16) provide an alternative route towards exploring biomolecular structure and dynamics in fully atomic detail, and have yielded a wealth of nucleic acid simulations over the past dozen years [reviewed in (17)].
The NF-
B transcription factor family illustrates the many potential complexities of protein···DNA recognition. This family occurs in a wide variety of eukaryotes and regulates a similarly broad array of cellular pathways, ranging from morphogenesis in insects to adaptive immunity in humans (18). The five mammalian NF-
Bs [p50, p52, p65 (RelA), c-Rel and RelB] contain an
300-residue Rel Homology Region (RHR), consisting of two immunoglobulin-like folds joined by a flexible linker. NF-
Bs associate into homo- and hetero-dimers that modulate gene expression by binding to target
B DNA-enhancer sites. The remarkably loose
B consensus sequence 5'GGGRNWYYCC3' [N = any nucleotide, R = purine, Y = pyrimidine (often Thy) and W = Ade or Thy] consists of two half sites (underlined). The 5-bp GGGRN half sites are preferentially bound by p50 and p52 subunits, while RelA, RelB and c-Rel prefer the 4-bp YYCC half sites. Thus, in addition to immense sequence variability and intrinsically different NF-
B-binding preferences for different half sites,
B elements also vary in length. Known
B sites can be grouped into 9-bp class I sites (4 + 1 + 4 arrangement, for c-Rel and RelA homodimers) and 10- or 11-bp class II sites (5 + 1 + 5, for p50 and p52 homodimers). However, even the above rules and consensus sequences are likely to be too restrictive: Some sequence-specific trends are known, but there is great degeneracy in terms of both (i) optimal DNA sequences for a given NF-
B dimer and (ii) the relative affinities of different NF-
B dimers for a given
B DNA sequence (18). Thus, an outstanding question in the NF-
B field is the detailed mechanism of indirect readout—What are the determinants of sequence-specific binding of NF-
B to target
B sites? Biophysical studies have demonstrated that NF-
B···DNA-binding affinity is largely entropically driven (19), but the issue of site specificity remains far murkier, with it now thought that the conformation and flexibility of
B DNA sequences play a critical role in the recognition of NF-
B dimers (20).
Therefore, as an initial step in elucidating protein···DNA recognition and the mechanism of indirect readout in the context of differential binding of NF-
B transcription factors to
B DNA elements, we performed MD simulations of a 20-bp
B DNA of known structure (Figure 1, S1). This DNA duplex consists of the sequence d(GGGTTTAAAGAAATTCCAGA), and encompasses a
B element (underlined) recognized by the c-Rel NF-
B homodimer and its oncogenic variant v-Rel (21). Simulations of the DNA were extended to the microsecond timescale, affording insights that would have remained undiscovered in a shorter simulation. Unanticipated
B DNA structural transitions discovered at the long times sampled in this trajectory include cross-strand intercalative stacking (XSIS) of nucleotides followed by spontaneous base flipping at a neighboring nucleotide, as well as a peculiar minor groove-bound barbed terminus. In addition to illuminating the microsecond-scale structural and dynamical behavior of this particular
B sequence, the simulation provides a broad, atomic-resolution glimpse into the dynamical properties of two turns of double helical DNA over a wide range of timescales (nine orders of magnitude). The terascale body of data presents opportunities for detailed analyses of methodological issues (such as the approximations inherent in the empirical force fields used in MD), but the present report focuses instead upon the intriguing XSIS, flipping and barbing transitions, as well as suggesting specific experimental ideas which could be used to test and further explore the simulation-based predictions. While technical issues pertaining to force field parameterization lie beyond the scope of the present work, an initial assessment of the quality of this extended trajectory was made by considering microsecond-scale DNA backbone dynamics in terms of potential BI/BII and
/
backbone substate sampling problems.
|
| MATERIALS AND METHODS |
|---|
|
|
|---|
The work proceeded in stages of (i) system selection, construction and preparation; (ii) calculation of classical MD trajectories; and (iii) data processing and analysis. These stages are described later, and further detailed in the Supplementary material. The starting structure was a 20-bp DNA encompassing the
B site of a CD28 response element, drawn from a crystal structure of this duplex bound to the (c-Rel)2 NF-
B homodimer (21). This 9-bp class II
B DNA consists of 4-bp half sites (Figure 1a, S1). Because the dynamics of this particular DNA are of interest as part of a broader NF-
B-related project, the structure of c-Rel-bound
B DNA was used directly and not rebuilt into canonical B-form. A virtue of the 20-bp duplex used in these studies (versus a shorter fragment limited to just the
B site) is that the region of primary interest—the 9-bp
B site—is embedded in two helical turns of DNA, making it less susceptible to potentially spurious end-effects. The simulation system was constructed by immersing the duplex in a truncated octahedron of explicit water (Figure 1b, S1). As described in the Supplementary material, design of the periodic boundary conditions accounted for the calculated hydrodynamic properties of this DNA fragment (Figure S1e), such that a minimal clearance of 20 Å was maintained between periodic images at all times (this distance being 35 Å for the starting configuration). The final system was prepared via (i) addition of hydrogens; (ii) successive rounds of energy minimization of protons and DNA atoms; (iii) addition of 38 Na+ counterions; (iv) placement of an initial shell of interstitial waters around the neutralized DNA·Na+ system; and (v) addition of bulk water and ions (Na+ and Cl–). The resultant (over-sized) rectangular cell was trimmed to yield the final mecon. The final 61 439-atom system consisted of the
B DNA duplex accompanied by (i) 20 030 molecules of TIP3P water (22); (ii) 38 Na+ counterions; and (iii) an additional 20 Na+ and 20 Cl– ions, providing a final [NaCl]
50 mM [chosen so as to mimic the conditions of DNA-binding experiments with this and related
B elements (23)]. As a final preparatory step, the DNA was allowed to adapt to the aqueous ionic environment (and, likewise, the solvent structure to relax around the DNA and remove unfavorable contacts) by a multi-stage protocol involving incremental relaxation of harmonic restraints on DNA atoms over a total of 10 000 cycles of potential energy minimization.
Standard methods were used to compute MD trajectories [see e.g. (24)]. Initial equilibration stages of heating and restrained dynamics (Figure S2 and the submatrix in Figure 2a) were followed by 1020 ns of unrestrained, production-level dynamics. Simulations were performed in the isobaric-isothermal ensemble (constant NPT). Pressure and temperature were maintained at 1 atm and 300 K, with temperature controlled via Langevin dynamics for all non-hydrogen atoms, and pressure regulated via a hybrid Nosé–Hoover Langevin piston (25). Long-range electrostatics were treated by the smooth particle mesh Ewald algorithm (26), with a grid density better than 1/Å in every direction. Non-bonded short-range interactions were calculated within a spherical cutoff of 10.0 Å (a smooth switching function was applied from 9
10 Å). The SHAKE algorithm (27) was used to constrain bonds between hydrogens and parent heavy atoms, enabling a 2.0 fs integration time step without compromising either bulk thermodynamic quantities (Figure S2) or structural stability of the trajectory (Figure S3). Further computational efficiency was afforded using a well-established multiple time-stepping scheme (28). Simulations were computed in 1.0-ns bins, with trajectory coordinates written every 1.0 ps. The trajectory was extended to the microsecond timescale via scripts for automatically linking time slices i
i + 1, enabling indefinite propagation of the simulation in a semi-autonomous manner on a host of available Linux clusters. Trajectories were computed using NAMD (29) and the parm94 force field. Of the many force fields commonly used in biomolecular simulations, the parm94 version of the standard Amber force field was chosen for the sake of consistency and comparability with the wealth of existing DNA simulations that used this particular parameter set. Issues of force field selection and usage present a moving target, due to continuous developments in this field, and it may be argued that extensive conformational sampling (such as is achieved on the microsecond timescale) overcomes potential artifacts that may arise in shorter simulations—i.e. sampling and artifact are somewhat orthogonal and counterbalanced effects, and what may appear as artifact at short (ns) times is effectively washed-out in the limit of longer (µs) timescales. Further details pertaining to force field selection, analysis steps, and issues specific to a simulation of this length are provided in the Supplementary material.
|
| RESULTS AND DISCUSSION |
|---|
|
|
|---|
The results are presented in five main areas: microsecond-scale trajectory stability, the cross-strand intercalative stacking transition, a base flipping event, a barbed terminus, and trajectory validation in terms of microsecond-scale sampling of DNA backbone substates. As with any simulation-based work, our computational findings could be viewed as more suggestive than conclusive, and can be considered as a way to generate atomically detailed hypotheses about the dynamical mechanism underlying a complex biomolecular transition (in this case, XSIS and flipping). Thus, explicit connections to experiment are drawn in two distinct ways (primarily towards the end of each subsection): (i) Existing data which bear upon particular results are interleaved into the discussions of those results, as are (ii) Specific proposals for experiments that might be suitable in testing our predictions.
Trajectory stability and conformational variability on the microsecond timescale
Simulations of the
B DNA element (Figure 1, S1) were extended past one µs. This lengthy timescale was achieved using an optimized code exhibiting particularly efficient parallel scaling [NAMD; ref. (29)], in conjunction with optimal hardware/network architectures and scripts designed to propagate simulations in a minimally supervised and essentially uninterruptible manner. Trajectories were thereby calculated over extensive periods of wall clock time (nearly 1 year), yielding roughly one terabyte of DNA simulation data for this >60 000-atom system. Persistence of the structural integrity of the 20-bp duplex on the microsecond timescale is notable (Figure S3), as MD simulations are typically performed for DNA fragments shorter than the two helical turns reported here (e.g. dodecamers), and are generally limited to shorter durations (e.g. <20–30 ns); exceptions are recent studies of counterion-binding in a 60-ns simulation by Beveridge and co-workers (12), two 50-ns trajectories of a different dodecamer by Várnai and Zakrzewska (30), and the recent microsecond-scale Dickerson dodecamer trajectories used in force field parameterization studies (31). In terms of simulation reliability and ion-related force field artifacts, it should be noted that the recently studied problem of KCl aggregation as a result of possibly imbalanced Lennard–Jones parameters (32) was not found to be an issue in the simulations reported here; this is not an unexpected result, given the comparatively low (
50 mM) NaCl concentration in this
B DNA system.
The many approximations and assumptions which underlie empirical force fields for biomolecular simulations—simplified potential energy functional forms, parameterizations against short simulations, neglect of atomic polarizability, etc.—motivated careful monitoring of this unusually long
B simulation, in terms of both thermodynamic properties (Figure S2) and structural parameters such as coordinate root-mean-square deviations (RMSD; Figure S3). Successful equilibration and stabilization is evident over both short (ns) and long (µs) times. Although RMSD is a less than ideal metric of DNA structural similarity, comparisons of each trajectory snapshot to canonical A- and B-form DNA suggest the presence of relatively long-lived (>5-ns) states that are closer in structure to A- than B-DNA (Figure S3f, RMSDB > RMSDA grey-shaded areas at
0.7 µs). An RMSD measure modified by a multiplicative term linearly scaled by the distance of an atom to the center of the DNA (Figure S3) confirms the expected result that the duplex termini are the most dynamical regions (i.e. account for the bulk of the RMSD).
Fine-grained differences between DNA conformations over various time spans of the trajectory were assessed by pairwise RMSD matrices, computed over periods ranging from ns
µs. As shown in Figure 2 and S4, short-time (ns-scale) matrices capture those features of the trajectory which may be expected, such as the restrained dynamics phase of the equilibration period (first ns) corresponding to very low RMSD values and differing significantly from the remainder of the trajectory (Figure 2a). The long-time (µs-scale) matrix represents >1012 pairwise comparisons (Figure 2b); significant off-diagonal basins of increased structural similarity/dissimilarity in this matrix illustrate that the notion of an average DNA structure is not an accurate representation of the true ensemble of thermally accessible conformational states in the microsecond-scale dynamics of DNA. [A similar point has been made by Beveridge et al. (33).] Patterns of intense variability in the microsecond matrix (e.g. the striping at long times in Figure 2b) show that much of the structural variability arises only beyond
0.6–0.7 µs. These patterns also served as initial hallmarks of the large-scale structural transitions which were subsequently identified as a base flipping event (Figure 2b, orange arrow) and a cross-strand intercalative stacking transition (Figure 2b, yellow arrow).
Cross-strand intercalative stacking (XSIS)
An intriguing example of DNA structural plasticity and conformational polymorphism that emerged only at long times involves disruption of a Watson–Crick base pair via cross-strand intercalative stacking (XSIS) of the bases in the previously intact pair. This atypical XSIS state develops into a stable, persistent form between
0.7 and 0.75 µs, and occurs at the (A·T)12 bp in an A/T-rich region near the center of the
B recognition element (Figure 1a). The aforementioned patterns of variability in pairwise RMSDs (Figure 2, S4) suggested the large-scale structural perturbations accompanying the XSIS transition. Visual analysis of the trajectory (Supplementary Movie A), as well as calculation of bundles of
B DNA conformers and corresponding averaged structures over discrete time windows (Figure 3), revealed the cross-strand base stacking that characterizes the XSIS state. This transition involves essentially complete abrogation of the (A·T)12 pair, such that the constituent bases (Ade1,12 and Thy2,9) longitudinally shear apart and assume a nearly coaxial rather than coplanar arrangement—i.e., they become stacked upon one another (Figure 4c–e), with the A1,12 staggered towards the 5' direction and the cross-strand partner T2,9 translated towards the 3' direction (with respect to the 5'
3' path of the parent strand along the global helical axis). Though fundamentally different from the longitudinal breathing mechanism proposed by Harvey over 20 years ago for B
Z DNA conversion (34), the XSIS transition is similar in spirit, insofar as stretching of the local backbone along the helical axis creates free space to accommodate an additional object (the new object being a longitudinally sheared base in XSIS, versus a propagating cavity which accommodates intact base pairs as they flip their
/β ring faces in the Harvey model). There is also a degree of structural similarity between XSIS and the pattern of inter-strand interactions arising during simulations of duplexes subject to an applied tension, leading to a stretched S-form DNA (35); however, any relationship between S-DNA and XSIS is unclear (i.e. the similarity may be a purely geometric result arising upon lengthening of any double helical arrangement of planar, interacting groups).
|
|
The XSIS transition can be monitored by both geometric and physicochemical parameters. Among the standard rigid-body parameters describing local bp and bp step geometry, the intra-bp Stagger (Sz) is particularly suitable for capturing the XSIS process (Figure 4a, S6). Likewise, the cross-strand mutual base overlap area described in Figure 5 [and ref. (36)] measures the degree of inter-strand stacking of the bases participating in XSIS, and can be seen to be a highly sensitive gauge of XSIS-like transitions (including the possibility of discerning asymmetric behavior of the two 5'- and 3'-shearing bases). Disruption of the local helical stack due to XSIS is accompanied by compensating (inversely correlated) changes in the bp and bp step parameters of neighboring nucleotides (Figure 4a, S6). Although the perturbative effect of this transition does not propagate very far along either direction of the duplex in terms of these geometric parameters (Figure S6), the timing of the XSIS transition nearly coincides with formation of a barbed terminus (described later), suggesting some degree of dynamical coupling between sites >10 bp apart in sequence (Figure 1a). As may be expected from the geometry of the double helix, local groove width is one of the strongest structural correlates of XSIS (Figure 4b): The transition coincides with perturbation of the proximal minor groove, wherein an initially widened groove over the
0.6
0.74-µs time span (relative to the
5.8 Å width in canonical B-form DNA and the even narrower width characteristic of A-tracts) rapidly constricts in the 5' direction. These structural features spur great interest in the possible roles of condensed counterions in triggering and/or modulating XSIS-like transitions.
|
An initial assessment of potential coupling between the counterion atmosphere and structural transitions such as XSIS was made by calculating number densities of Na+ ions within 15 Å of the
B DNA over the course of the trajectory, with populations being averaged over 1-, 10- and 100-ns windows centered on the early (Figure 4c), middle (Figure 4e) and extreme (Figure 4f) states of XSIS. These results (Figured 4d and unpublished data) are consistent with known structural trends—e.g. preferential localization of Na+ density near more strongly electronegative regions of nucleoside bases (Ade N6/7, Thy O2, etc.). Consistent with many X-ray, NMR and MD studies indicating cation-dependent minor groove narrowing (9,11,12), the severely constricted minor groove near the XSIS site (Figure 4c and e) exhibits the highest amplitude peak in the 10-ns window-averaged Na+ density map (Figure 4d, right panel). Nevertheless, the detailed linkage between groove-localized Na+ ions and structural transitions such as XSIS remains unclear; given the lengthy timescales over which the XSIS and flipping events evolve (>100 ns from initial onset to final resolution), it is likely that the local ionic environment (relaxing on the ns timescale for monovalent cations) responds to (rather than drives) the large-scale structural transitions of XSIS and flipping.
The XSIS transition and its potential role in
B DNA dynamics is a novel finding attributable to the microsecond-scale sampling of our trajectory, and it should be emphasized that this simulation-based result is more predictive than conclusive. Solution NMR studies of other
B DNA elements have been pioneered by Hartmann and colleagues (37); however, to our knowledge, MD simulations of a
B DNA have not been performed. Therefore, prior knowledge of what may be anticipated in the microsecond-scale behavior of this
B DNA is limited, and no a priori assumptions were imposed with regards to the computed behavior of this DNA fragment. Simulation caveats notwithstanding, it should be noted that numerous experimental data from both NMR and crystallography support or are consistent with an XSIS-like transition. Observations of severe groove-bending immediately preceding XSIS (e.g. Figure 4f) are consistent with NMR data showing a pronounced (
20°) bend towards the major groove in a related
B DNA (37). Also, several lines of crystallographic evidence are consistent with the XSIS phenomenon. Early studies of poly(A) and (CA)n tracts by Klug, Moras and others (38,39) have revealed a zipper pattern of bifurcated hydrogen bonds along the axial stack. These favorable i1···j2 and j1···i2 hydrogen bond interactions (in the notation of Figure 5) recapitulate the edge/edge interactions which occur between bases in both the initial and final stages of XSIS: initial, as aromatic base planes begin shearing to the early state illustrated in Figure 4d; and final, as the XSIS thymine (T2,9) begins hydrogen bonding to the wrong cross-strand adenine (A1,13) and eventually resolving the XSIS state by leading to flipping of its T2,8 neighbor. Independent structural work on G·A mismatches found a similar pattern of (i/j)1···(j/i)2 cross-strand bifurcated hydrogen bonds (40), suggesting such cross-strand interactions as a fundamental feature of DNA deformability. Perhaps most relevantly, Reid and coworkers have amassed a large body of NMR results showing that flanking G·A mismatches [such as occur in human centromeric (GGA)2 segments] can induce the shearing of intervening bases into a GA-bracketed G-stack similar to the XSIS state reported here [see e.g. (41,42)]. These experimental data concur with our prediction of a microsecond-scale XSIS-like transition at the junction of
B half sites.
Base flipping at the junction of
B half sites
The most intriguing and unanticipated result to emerge in the long-time dynamics of
B DNA is a spontaneous base flipping event. Because of its high activation barrier, such a transition is considered a rare event, thought to occur only infrequently in the equilibrium dynamics of a DNA duplex at room temperature [reviewed in (43)]. Indeed, flipping of a thymine base at the junction of
B half sites (Figure 1a) arose only near the end of the microsecond-long
B trajectory (Figures 2b, 3e and f), at
950 ns as measured by several geometric criteria (Figure 6, S6; see later). It is interesting that flipping occurred at (A·T)13, immediately adjacent to the disrupted (A·T)12 XSIS site. The thymine base that is the focal point of this transition (T2,8) can be seen to twist entirely out of the double helical stack via the major groove (Supplementary Movie B), thereby eliminating the (A·T)13 Watson–Crick bp and yielding an apyrimidinic lesion (Figure 3e, f and 6b). Concomitant closure of the newly formed lesion by pairing of the 3' neighboring thymine (T2,9; of the XSIS pair) to the orphan adenine (A1,13) suggests the model of XSIS-facilitated flipping presented later.
|
The flipping transition can be monitored by the time-evolution of (i) local bp and bp step parameters; (ii) correlated changes in local helical geometry (e.g. minor groove structure); and (iii) physicochemical quantities such as base overlap areas and the solvent-accessible surface area of the flipping base. Of the standard parameters used to describe bp and bp step geometry (13), the Opening (
) angle is a useful descriptor of the flipping transition (Figure 6a). [Note that
is not the same as the angular parameters developed as reaction coordinates in landmark studies of forced flipping by the groups of Lavery [an inclination-independent construction accounting for local helical geometry; see (44)] and MacKerell [a center-of-mass pseudodihedral; see (45)]. The distance between complementary bases is also a useful descriptor of the flipping transition (dCOG, Figure 6b), with the advantage that it is independent of the relative angular orientation between base-centered reference frames. As is the case with XSIS, significant redundancy and correlation exists within two distinct types of data: (i) The time-evolution of the various other rigid-body bp and bp step parameters are naturally coupled to
and also serve (to varying degrees) as reporters of the flipping event; for example, the translational Slide (Dy) and angular Tilt (
) of a bp step are rather sensitive to flipping, whereas the Roll (
) is less so (Figure S6). (ii) Time series of a given parameter across the different base pairs and base pair steps near the XSIS (A·T)12 and flipping (A·T)13 sites are correlated [see, for instance, the flanking (A·T)11 bp lying 5' to the XSIS site; Figure 4–6
750 ns) marks the onset of XSIS; the subsequent flipping event is evidenced by a further (smaller-scale) perturbation in the 3' groove widths (Figure 4b, blue and green traces at
950 ns).
Comparison of the microsecond-scale
B DNA flipping event to the vast literature on both experimental (15) and computational (43) studies of base flipping is hampered by the single observation of this transition, as well as the fact that, to our knowledge, this is the first suggestion for a possible role for base flipping in
B DNA dynamics (i.e.
B flipping data do not exist). Much of the existing flipping work focuses upon protein-facilitated (activated) flipping, rather than the spontaneous (passive) flipping reported here (15). However, numerous experiments have established that spontaneous DNA and RNA base flipping occurs, the methods ranging from chemical approaches [such as trapping a transiently flipped base in a macrocyclic host molecule (46)] to NMR-based measurements of imino proton exchange (47,48). On a related note, Dickerson and coworkers discovered a novel instance of intermolecular intercalation in the crystal structure of a hybrid DNA/RNA duplex (49), an implication being that such base pair swapping occurs via spontaneous base flipping. Most notably, very recent crystallographic work has revealed a base extruded from the stack of a related HIV-1
B DNA element, corroborating our prediction that base flipping may occur in
B DNA (50).
Discovery of
B DNA base flipping in the microsecond regime is a fortuitous result, as NMR studies of other DNA sequences find that this process occurs on a longer timescale (ms), inaccessible to equilibrium MD simulation. Nonetheless, that the base which flips is a thymine is consistent with the relative pyrimidine versus purine energy barrier, as well as with NMR exchange data revealing that (i) the lifetime of an A·T bp is generally shorter than G·C pairs (47) and (ii) spontaneous A·T opening is the basis for recognition of extrahelical bases (Ura/Thy) in the uracil glycosylase DNA repair system (51). (As discussed later, the junction of half sites in
B elements is almost always an A·T bp.) Also, with regards to characteristic timescales for flipping—and the base pair opening or breathing which necessarily precedes it—previous simulation work has found spontaneous breathing on the ns timescale for a difluorotoluene-substituted A·F pair (52). Flipping of T2,8 via the major groove is consistent with both experimental studies (53) and recent theoretical work (45,54,55) on flipping pathways. Strong coupling between base flipping and helical axis bending was found in early theoretical work (56) as well as more recent computational studies (55). Consistent with those findings and the aforementioned NMR studies (37), the
B DNA exhibits significant helical axis bending in a most extreme state of XSIS preceding the base flipping event (Figure 4f).
Finally, we note that the
B flipping reported here is not necessarily inconsistent with NMR-based findings of generally slower A·T opening in A-tracts (47). As with any computational or experimental approach, the NMR exchange methodology rests upon various assumptions and approximations. In particular, known limitations and caveats of this technique include (i) assumptions of two-state behavior (entirely open/closed bps); (ii) ambiguity of the actual physical process being detected (accessibility of exchanged protons versus complete base extrusion); (iii) the neglected structural and dynamical influence of ions (such as the ammonia exchange catalyst used in these studies) on local DNA structure (57); and (iv) intrinsic detection limits (ms timescales being the lower bound). Similar NMR studies have shown that A·T opening strongly depends on sequence context, and is also sensitive to oligonucleotide length (58). Most significantly, the structural basis of greater A·T lifetimes in A-tracts is assumed to stem from a presumably narrower (and more rigid) minor groove in these tracts [(43) and references therein]. However, the minor groove of the
B element does not exhibit this behavior immediately prior to the XSIS
flipping cascade (Figures 4b and S6), and therefore does not adhere to this assumption. It should be reiterated that ultimate assessment of predicted
B base flipping will require future experimental studies.
XSIS-facilitated flipping?
XSIS and base flipping appear to be intimately linked. Coupling between XSIS at (A·T)12 and base flipping at the 3' neighboring (A·T)13 is reflected in the time-evolution of numerous structural parameters between
750 ns (onset of XSIS) and 950 ns (onset of flipping). This particular 200-ns period is immediately preceded by perturbation of local helical structure, as quantified by, e.g. minor groove widths in the vicinity of these bp steps. We suggest that cross-strand intercalative stacking may facilitate passive, spontaneous base flipping—i.e. flipping in the absence of proteins that can actively extrude, bind and stabilize flipped bases [e.g. DNA methyltransferases (15,59) and glycosylase repair enzymes (60)]. Such a mechanism may be especially true in the case of the particular A/T-rich region at the center of the
B DNA (Figure 1), as this segment is highly degenerate, of low-sequence complexity, and is therefore amenable to strand-slippage deformations (61,62). In this model, the central Ade (n) in an ···AAnA··· tract undergoes a thermally induced, transient breathing event (52,63), staggers into an XSIS-like state (quantified by Sz), and then, in re-annealing to the complementary strand by re-forming the A·T pair, does so with a neighboring (n ± 1) thymine instead of the original cross-strand partner. Evidence for such a mechanism is provided by static snapshots (Figures 4 and 6
) and averaged structures (Figures 3 and S4), and most clearly by visual inspection of the process as it occurs (Supplementary Movies A, B, D). Note that an A-rich region of the low-complexity ···AAA··· form found at the junction of
B half sites (rather than a different permutation across the strands, such as ···ATAT···) is perhaps the ideal sequence on which such a mechanism could act, as it would be more amenable to such n
n ± 1 slippage than other configurations of base pairs.
Evaluation of our model for XSIS-facilitated flipping in DNA (
B or otherwise) ultimately requires experimental data. Specific studies using experimental and/or computational techniques can be envisaged as being particularly well-suited for exploring the XSIS transition, the base flipping event, and the key coupling between these events that we posit as a mechanism for spontaneous base flipping in A-rich tracts. Flipping in the long-time dynamics of this
B DNA could be studied via the chemical (flipped base trapping) and biophysical (proton exchange) methods outlined earlier, as well as a recently developed selective, non-covalent base flipping assay (64). Simulation-derived dynamics of this
B element also could be tested in terms of agreement (on the
50 ns timescale) with experiments using the time-resolved Stokes shift spectroscopic method that has been fruitfully applied to DNA by Berg and coworkers (65). Perhaps most compelling, the XSIS transition (and coupled flipping) could be explored by assaying the charge transfer properties of DNA duplexes containing this and related
B sequences. Conformationally gated charge transfer through DNA has been established by Barton and colleagues as a sensitive gauge of local DNA structure [recently reviewed in (66)], and the sequence-sensitivity of this phenomena has been demonstrated (67). Surveying the anticipated disruptive effect of an XSIS-like state (Figure 4e and f) upon DNA charge transfer efficiency in multiple sequence contexts would serve as a test of the predictive results from our simulation, and the data from such studies would help support or refute a mechanistic model for spontaneous base flipping via XSIS-facilitated base pair opening.
A minor groove-bound barbed structure
DNA oligonucleotide structures in which a terminal nucleoside loops back upon its parent strand to form a stable, well-defined, yet alternative (non-canonical) structure are uncommon: Formation of DNA secondary and tertiary structure is a highly cooperative process wherein complementary strands reproducibly zip-up into the same configuration of energetically stable Watson–Crick base pairs (3). Thus, a
B DNA conformation in which a 3' terminal nucleotide peels away from the helical stack and forms favorable (hydrogen bonded) interactions with the proximal minor groove of its parent strand was unexpected (Figures 3c–f, 7 and S7). It is intriguing that this barbing occurs at the (G
C)1 rather than the (A = T)20 terminus of the non-palindromic
B DNA duplex (Figure 7), given that solvent-exposed triply hydrogen-bonded G·C pairs are more stable than A·T pairs. As a positive control on the dynamical behavior of the
B DNA trajectory with respect to the termini, the large-scale pattern of fraying at the (A·T)20 terminus (Figure S7) and the train of transient fraying events at (G·C)1 prior to barbing (Figure 7) are consistent with the thermally induced terminal fraying previously observed in nanosecond-scale MD simulations of nucleic acids (68). Neglecting intermolecular terminus···groove contacts in the lattices of some DNA crystals, the most structurally similar examples to barbing of which we are aware are the minor groove association of extra-helical cytosines in the solution structure of a crosslinked DNA [(69); the Cyt base is not near a terminus], and the G1 conformation found in simulations of a DNA octamer containing an adenine bulge [the bulged adenine occasionally makes edge···minor groove contacts, (70)].
|
In addition to being structurally (Figure 7, inset) and dynamically distinct from the known stochastic fraying of DNA termini, the
B DNA barbed terminus emerges only at very long times (appearing at
750 ns, it is likely to be coupled to XSIS), and can stably persist for periods on the order of 100 ns. The fact that the barbed state reproducibly and independently arises over the course of the same trajectory (multiple
7.5-Å plateaus in the blue trace of Figure 7) bolsters its potential significance, although the biological significance of this DNA conformation is inherently limited by it being an end effect. Nonetheless, a barbed-like structure may be relevant from other perspectives, including the design of minor groove-binding agents (71), and in light of the numerous biochemical assays that rely upon the use of oligonucleotides not unlike the
B DNA element studied here. The barbed
B terminus suggests that addition of even up to three G·C pairs (Figure 1) to reinforce a presumably weak terminus does not suffice to prevent such end-effects from occurring on the microsecond timescale.
Validation of the trajectory and DNA backbone dynamics on the microsecond timescale
The present work on
B DNA focuses on the unexpected XSIS and base flipping events rather than a comprehensive analysis of the microsecond-scale conformational dynamics of the DNA backbone or technical aspects of MD force field development. Nevertheless, methodological studies of force field shortcomings and parameterization are of immense concern and relevance in biomolecular MD simulations, and are the subject of much active investigation [(72) and references therein]; our current simulation provides additional data for such studies, as well as a host of definite, testable predictions about a specific microsecond-scale dynamical process that may be investigated by both theoretical and experimental means. Compared to NMR and crystallographic data, existing force fields are known to yield discrepancies between trajectory-averaged values of base pair and base pair step parameters such as the Roll (
) and Twist [
; (73)]. In particular, MD-derived values of Roll are generally moderate overestimates (
MD
4–5°,
NMR
3°,
x-ray
0°), while simulations typically underestimate the average Twist (
MD
30°,
NMR/X-ray
34°). The microsecond-scale DNA trajectory reported here recapitulates these trends, both within the XSIS/flipping region (Figure S6d) and at a distal site that effectively serves as an internal control for the behavior of these parameters (Figure S6g). On a larger structural scale, two aspects of backbone dynamics identified as being particularly relevant based on nanosecond-scale MD simulations are transitions between the BI/BII substates (74) and possibly insufficient (and/or imbalanced) sampling of long-lived backbone states arising from concerted (crankshaft)
/
rotations (30,72,75). For instance, a concern in nanosecond-scale DNA simulations is that the trajectory may improperly sample (or even become indefinitely trapped in) non-canonical regions of
/
conformational space, leading to force field-dependent artifacts.
The
/
and BI/BII sampling properties of the trajectory were examined, both to gain an initial understanding of the backbone dynamics of this
B DNA on the microsecond timescale, and also to gauge possible artifactual sampling due to force field imbalances (e.g. becoming trapped at non-canonical
/
values for half of the trajectory). Two distinct (but interrelated) facets of the
/
sampling issue involve (i) the relative frequencies of visiting various
/
states (i.e. free energy differences), such as the known
/
g–/g+
(300°, 30°) energy minimum; and (ii) whether or not the backbone is able to interconvert between distinct (and possibly long-lived)
/
states, including the canonical g–/g+ ground state in addition to g+/t and g–/t local minima (the latter being a
flip away from the global minimum). With regards to (i), it is reassuring that it is the g–/g+ ground state which is occupied by the only nucleotide to not undergo a single
/
transition over the course of the entire microsecond (Figure 8a; interestingly, this is the XSIS adenine). With regards to (ii), the
B DNA trajectory shows extensive sampling of discrete basins of the
/
conformational landscape on the microsecond timescale, including reversible transitions to/from the g–/g+ energy minimum (Figure 8b). Indeed, the most densely populated region of this space is the known g–/g+ ground state, followed by the two regions (g–/t and g+/t) previously identified as local minima (75). Similarly, the BI
BII sampling behavior of the microsecond-scale trajectory is consistent with existing principles. An angular histogram of the BI/BII torsional parameter (
–
) for a nucleotide adjacent to the XSIS position reveals extensive sampling near the favorable BI state that is found in canonical B-form DNA (
–
–90°), in addition to a smaller component corresponding to the slightly less favorable BII state (
–
> 0°). The complete body of
/
and BI/BII sampling data (i.e. for all 38 nt) indicates that the
B DNA trajectory reasonably samples these backbone substates, and does not exhibit artifactual trapping in anomalous, high-energy states. Several features of
B DNA backbone dynamics are in agreement with a recent microsecond-scale dodecamer simulation (31), including the differential features of
/
and BI/BII time series sampling for A·T and G·C bps (data not shown).
|
The overall microsecond-scale structural integrity of the DNA duplex, as well as any systematic biases towards a particular helical morphology (ideal A-, B-, etc. forms) due to force field-specific limitations, were assessed by classifying the DNA conformation as either A, B or TA-like at each timepoint. Such analysis of the microsecond-scale trajectory in terms of the sampling of known DNA helical structures provides a form of simulation validation that lies beyond the local level of nucleotides and base pairs. The phosphate-based zP/zP(h) metric has been shown to be a particularly effective discriminator between the A, B and TA-like forms of DNA (76). Details of this quantity as a classifier of DNA structure are provided in Figures 9 and S8 (see the arrows denoting A-, B- and TA-like regions). Note that the most significantly perturbed helical steps in the XSIS and flipping events [(AA/TT)12 and (AT/AT)13] are also the ones which exhibit multiple discrete clusters (green patches in Figure 9b, purple in Figure S8a); nevertheless, the preponderance of the trajectory is spent in the B-like regions of zP/zP(h) space for each step of the
B duplex (data not shown). These scatter plots demonstrate that the overall conformation of the duplex—i.e., its structure at a slightly more global level than that addressed by geometric quantities such as local bp parameters and backbone torsion angles—is well-maintained throughout the microsecond-long simulation, even at highly dynamical sites that exhibit the most significant structural variability (i.e. XSIS and flipping).
|
Also in connection with potential force field-induced artifactual behavior, note that the starting DNA structure was somewhat distorted in the region distal to the
B site—i.e. it was not perfectly identical to canonical B-form DNA (see e.g. Figure 3a). However, it was found that the simulation effectively regularized this region of the DNA over a broad time span (
350–600 ns), yielding a duplex structure more similar to canonical B-form DNA than was the starting structure (see e.g. the 1
100 ns bundle in Figure 3b and the depressed plateau in the RMSD traces of Figure S3). This is a significant observation, as it effectively serves as an internal control of simulation quality and demonstrates that the trajectory was neither driven directly into a B-form helix by the force field, nor did it monotonically degrade into an artifact-ridden state capable of producing transitions such as XSIS and base flipping.
CONCLUSIONS AND IMPLICATIONS FOR NF- B···DNA RECOGNITION
|
|---|
|
|
|---|
The XSIS and base flipping events spur great interest in potential links between these transitions and the exact mechanism(s) of NF-
B···
B DNA recognition. What dictates the specificity of NF-
B···DNA binding? A partial answer comes from recognition of the modular nature of NF-
B-bound
B elements: Crystal structures [e.g. (21)] show that one NF-
B monomer typically forms most of the contacts to one
B half site (e.g. the 5' AGAA in Figure 1a), while the other subunit primarily contacts the other half site. This implies a model of cognate site recognition (21), wherein the
B DNA specificity of a given NF-
B dimer is a composite of individual half-site preferences. However, numerous exceptions to this model exist (the consensus sequence is weak), and NF-
B is somewhat atypical in its interactions with DNA: (i) DNA-contacting side chains reside in the loops of NF-
B (rather than secondary structure elements); (ii) many DNA contacts are mediated by water molecules [rather than direct side chain···base interactions; see e.g. (77)]; and (iii) a large fraction of DNA contacts are to the

t) and the min/max RMSD values used for linearly scaled coloring of matrix elements (color bars at right). Features of the heating and equilibration strategy are evident in the 10 ns matrix, including the imposition of restraints over the first
400 ps (dark region in a), as well as subsequent drift towards a stable, equilibrated structure (intense yellow stripe); the submatrix corresponding to this initial 1-ns equilibration period is delimited by green lines.


–


