
Nucleic Acids Research Pages 2248-2255  


Can G-C Hoogsteen-wobble pairs contribute to the stability of d(G·C-C) triplexes?



Robert Soliva, F. Javier Luque1 and Modesto Orozco*

Departament de Bioquímica, Facultat de Química, Universitat de Barcelona, Martí i Franquès 1, Barcelona 08028, Spain and 1Departament de Fisicoquímica, Facultat de Farmàcia, Universitat de Barcelona, Avgda Diagonal sn, Barcelona 08028, Spain

Received March 12, 1999; Revised and Accepted April 20, 1999

ABSTRACT

Quantum mechanics, molecular dynamics and statistical mechanics methods are used to analyze the importance of neutral Hoogsteen-wobble G·C pairing in the stabilization of triple helices based on the poly-(G·C-C) trio at neutral pH and low ionic strength. In spite of the existence of a single hydrogen bond, the Hoogsteen-wobble G·C pair is found to be quite stable both in the gas phase and in solvated DNA. Molecular dynamics simulations of different triplexes based on the d(G·C-C) trio lead to stable structures if the neutral d(G·C-C) steps stabilized by Hoogsteen-wobble pairs are mixed with d(G·C-C+) steps. Finally, high level ab initio calculations and thermodynamic integration techniques are used to determine the relative stability of G·C wobble and G·C imino pairings. It is found that triplexes containing the imino pairing are slightly more stable than those with the wobble one, due mainly to better stacking.

INTRODUCTION

DNA triple helices are structures formed by the interaction of a duplex DNA with an extra oligonucleotide strand, the so-called triplex forming oligonucleotide (TFO). The interaction of the TFO with the DNA duplex is specific due to the existence of H-bonds in the major groove between the purine strand of the DNA and the pyrimidines (pyrimidine motif) or purines (purine motif) of the TFO.

TFOs might have a large therapeutic interest related to their ability to interact with specific fragments of DNA. This opens the possibility of blocking the activity of specific genes by adding suitable oligonucleotide sequences. Several examples in the literature (1-6) demonstrate that this `anti-gene' strategy might be one of the most promising therapies in the future. It should be noted, however, that triplex-based therapies still present a series of technical problems, several of them related to the lack of detailed 3D information on these structures (1-6). Thus, even for the most studied triplexes [those based on the d(A·T-T) and d(G·C-C) trios] there are no high resolution X-ray diffraction data, and just a few NMR structures have been published (7-13). This explains the existence of debate regarding many key issues in the structure of DNA triplexes, particularly about the canonical structure of triplex DNA, and the chemical nature of d(G·C-C) trios in d(G·C-C)-based triplexes (see for instance, the different structures described in 1,7-19). The clarification of these issues is essential for the understanding of structure, flexibility and reactivity of DNA triple helices.

Inspection of the standard H-bond pairings in Figure 1 shows that the formation of a Hoogsteen double H-bond between cytosine and guanine requires a proton between cytosine N3 and guanine N7. The most widespread hypothesis assumes that cytosine [whose pKa (4.7) is greater than that of guanine] is protonated at N3, which allows the formation of a strong neutral-ion pair, where the charge is located on the cytosine (18). The existence of d(G·C-C+) trios is supported by solid experimental data, such as the strong pH dependence of d(G·C-C)-based triplexes (20), and NMR spectra (9,10,12,13,16-22), where the signals corresponding to protonated cytosines are clear. On the other hand, there is also evidence favoring the existence of neutral d(G·C-C) trios, such as the surprising stability of these triplexes at neutral pH (23-25) and accurate calorimetric analysis (26) showing that the number of extra protons in a d(G·C-C)-based triplex is smaller than the number of d(G·C-C) trios. These results suggest that not all the d(G·C-C) trios of a d(G·C-C)-based triplex are formed with protonated cytosine, but some of them might be formed with neutral cytosines.


Figure 1. Schematic representation of d(A·T-T) and d(G·C-C)+ trios.

Inspection of Figure 2 suggests that a clear alternative to the Hoogsteen (G-C)+ is the pairing of guanine with an imino cytosine [(G-C)i pairing]. Recent molecular dynamics (MD) simulations (27) performed on the structure of triplexes containing the d(G·C-C) motif suggest that even at neutral pH and low ionic strength the d(G·C-C) trios containing protonated cytosines are in general more stable than those defined with neutral cytosines. However, the same simulations suggest that the situation might change when the d(G·C-C+) trio is surrounded by neighboring d(G·C-C+) trios [as, for instance, in d(G)n sequences], since the strong electrostatic repulsion between three contiguous protonated cytosines destabilizes the triplex. In this case MD simulations and Poisson-Boltzmann calculations (27) suggest that the stability of triplexes is enhanced by a suitable combination of neutral and protonated d(G·C-C) trios. Such a hypothesis has been indirectly supported by very recent NMR data (25) which demonstrated that d(G·C-C)-based triplexes are stable under conditions where not all the Hoogsteen cytosines are protonated.


Figure 2. Schematic representation of d(G·C-C)i and d(G·C-C)w trios.

Inspection of Figure 2 shows that there is an alternative possibility for a neutral Hoogsteen G·C pairing. This type of wobble interaction is stabilized by a single H-bond, and accordingly does not seem a priori very stable. However, recent high level QM calculations (28) suggested that a Hoogsteen-wobble pairing like that in Figure 2 can lead to stable pairings. This surprising finding raises the question of whether neutral d(G·C-C) trios can be stable without the existence of the rare imino tautomer of cytosine [d(G·C-C)i trios], and whether or not such trios can lead to stable triple helices.

In this paper we present a theoretical analysis of the possible role of neutral d(G·C-C) wobble trios [d(G·C-C)w] in the stabilization of d(G·C-C)-based triplexes. QM calculations were used to determine the stability of the (G-C)i and (G-C)w pairings. MD simulations were used to determine the stability of d(G·C-C)-based triplexes containing d(G·C-C)w trios. Finally, MD-thermodynamic integration (MD-TI) studies were combined with high level ab initio data to determine the relative stability of triplexes containing d(G·C-C)w and d(G·C-C)i trios.

MATERIALS AND METHODS

Gas phase calculations

Dimerization energies were determined at the QM level using density functional theory (DFT) and the B3LYP functional (29-31). Geometries of the isolated bases (monomers) and Hoogsteen base pairs (dimers) were optimized at the B3LYP/6-31G(d) level, and the minima were verified by frequency analysis. Single point calculations were subsequently performed at the B3LYP/6-31+G(d,p) level. The basis set superposition error (BSSE) was corrected using the counterpoise method (32). The dimerization energy (ΔE) was then defined from the interaction (intermolecular stabilization energy: Eint) and distortion (energy penalty due to the change in monomer geometry upon dimerization: Edist) terms, as shown in equations 1-3. In these equations the subscript refers to the chemical system (monomer or dimer), and the superscript denotes the basis set (that of the isolated bases or of the Hoogsteen base pairs) used in the calculations. The geometries fully optimized in the dimer were used in all cases except in calculations marked with `(o)', which were done on the fully optimized geometries of each isolated monomer. Note that Eint is computed at the B3LYP/6-31+G(d,p) level, while Edist is determined at the B3LYP/6-31G(d) level.

1
2
3
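The bodies of equations 1-3 are not reproduced above. A sketch consistent with the definitions in the preceding paragraph (subscript for the chemical system, superscript for the basis set, (o) for the optimized geometry of the isolated monomer) and with the usual counterpoise scheme might read as follows. The labels A and B for the two bases, AB for the dimer, and the assignment of equation numbers are assumptions, not the authors' typeset forms. The first relation is at least consistent with Table 1, where Eint and Edist add up to the tabulated dimerization energy.

```latex
\begin{align}
\Delta E &= E_{\mathrm{int}} + E_{\mathrm{dist}} \tag{1} \\
E_{\mathrm{int}} &= E_{AB}^{AB} - E_{A}^{AB} - E_{B}^{AB} \tag{2} \\
E_{\mathrm{dist}} &= \left[ E_{A}^{A} - E_{A}^{A}(o) \right]
                   + \left[ E_{B}^{B} - E_{B}^{B}(o) \right] \tag{3}
\end{align}
```

In this reading every energy without the (o) mark is evaluated at the geometry the species adopts in the optimized dimer, so that Eint carries the counterpoise correction and Edist is always positive or zero.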

Dimerization enthalpies, entropies and free energies were determined after introduction of zero point, thermal and entropic corrections (T = 298 K, P = 1 atm for ideal gas conditions). Such corrections were computed following the standard thermodynamic equations as implemented in Gaussian-94 (33). In all cases B3LYP/6-31G(d) optimized geometries and frequencies were used.

The relative stability of the (G-C)i and (G-C)w complexes was first analyzed by comparison of the absolute energies, enthalpies and free energies of the two dimers. Energies were taken from B3LYP/6-31+G(d,p)//B3LYP/6-31G(d) calculations while zero point, thermal and entropic corrections were introduced at the B3LYP/6-31G(d) level, as explained in detail above. In order to obtain an additional estimate of the relative stability of (G-C)i and (G-C)w complexes we combined (see equation 4) the B3LYP/6-31+G(d,p)//B3LYP/6-31G(d) estimates of the energies, enthalpies and free energies of association of the two dimers with MP4/6-311++G(d,p) thermodynamic values of the C↔C(i) tautomeric equilibrium (34).

4

where X is a thermodynamic magnitude (energy, enthalpy or free energy), DFT means the B3LYP/6-31+G(d,p)//B3LYP/6-31G(d) calculations and MP4 the MP4/6-311++G(d,p) data.
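The body of equation 4 is not reproduced above. One form that matches the description (DFT association terms for the two dimers combined with the MP4 tautomerization term) is sketched below; the subscripts "dim" and "taut" are illustrative names. With the dimerization free energies of Table 1 and the 3.0 kcal/mol tautomerization penalty quoted in the Results, this form gives 0.50 - 1.34 - 3.0 = -3.84 kcal/mol, which is the composite free energy listed in Table 2.

```latex
\begin{equation}
\Delta X_{(G\text{-}C)i \rightarrow (G\text{-}C)w}
  = \Delta X^{\mathrm{DFT}}_{\mathrm{dim}}\!\left[(G\text{-}C)w\right]
  - \Delta X^{\mathrm{DFT}}_{\mathrm{dim}}\!\left[(G\text{-}C)i\right]
  - \Delta X^{\mathrm{MP4}}_{\mathrm{taut}}\!\left[C \rightarrow C(i)\right]
\tag{4}
\end{equation}
```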

Molecular dynamics simulations

MD simulations were performed to determine the stability of DNA triplexes containing the d(G·C-C)w trio. Two (10·10·10) DNA sequences which form MD-stable DNA triplexes (27) were studied: d(AGGGAAGGGA) and d(GAGAGAGAGA) (for the sake of clarity only the purine strand of the triplex is noted). For the first sequence d(G·C-C)w trios were introduced in positions 3 and 8 (numbering is always defined according to the purine strand). For the second sequence a single d(G·C-C)w trio was introduced at position 5. All the remaining d(G·C-C) trios were assumed to contain protonated cytosines. In all cases the structures equilibrated from previous 1 ns MD simulations (27) were used as starting models, which were then modified manually to generate starting structures containing the d(G·C-C)w trios. These preliminary structures were neutralized by adding Na+ ions, and immersed in boxes containing 4141 TIP3P water molecules (35). These hydrated structures were subjected to our standard 10-step equilibration process (36), which extends for >150 ps. The final equilibrated structures were then used as starting points for 1 ns of unrestrained MD simulation at constant pressure (1 atm) and temperature (298 K). Periodic boundary conditions and particle mesh Ewald [PME (37,38)] techniques were used to account properly for long-range electrostatic effects in all MD simulations. A 2 fs integration time step was used, and all bond distances were frozen using SHAKE (39). The AMBER-95 force-field (40) was used to represent the DNA, and the TIP3P model (35) was used to model water molecules. Force-field parameters for protonated and imino cytosines were taken from our previous work (27). Comparison with DFT calculations supports the appropriateness of the force-field parameters (see below).

Free energy calculations

The relative stability of the Hoogsteen base pairs (G-C)i and (G-C)w can be determined after breaking the free energy of the (G-C)i→(G-C)w process into two contributions: (i) that arising from the intrinsic stability of the imino and amino tautomers of cytosine and (ii) that arising from the differential stabilization of the triplex structure by the imino and amino tautomers of cytosine. The first contribution can be taken directly from high level QM calculations (34) or experimental data (41). The second term can be interpreted as the differential binding of a TFO with amino cytosine and a TFO with imino cytosine to a DNA. This latter term can be determined from the differential change in free energy due to the mutation of an imino cytosine into an amino one in both a TFO and a triplex DNA (Fig. 3).


Figure 3. Thermodynamic cycle used to compute the difference in free energy of binding between an imino and an amino cytosine. Values computed in MD-TI calculations are ΔG(1) and ΔG(2).
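The cycle can be written as a single relation. The sketch below assumes that the two MD-TI legs are the imino-to-amino mutation in the free TFO and in the triplex; the subscripts "TFO" and "triplex" are illustrative labels, since the text does not say which leg is labeled (1) and which (2) in Figure 3.

```latex
\begin{equation}
\Delta\Delta G_{\mathrm{binding}}
  = \Delta G_{\mathrm{bind}}(\mathrm{amino}) - \Delta G_{\mathrm{bind}}(\mathrm{imino})
  = \Delta G_{\mathrm{triplex}}(i \rightarrow a) - \Delta G_{\mathrm{TFO}}(i \rightarrow a)
\end{equation}
```

With this sign convention a positive value means that the TFO carrying the imino cytosine binds the duplex more favorably than the one carrying the amino cytosine. The equality follows from closing the cycle: binding the imino TFO and then mutating in the triplex must cost the same as mutating in the free TFO and then binding the amino TFO.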

Mutations were performed for position 8 of the d(AGGGAAGGGA) triplex, and for a single-strand oligonucleotide (the TFO) of sequence d(TCCCT). Our previous work (to be published) has shown that no dramatic differences are to be expected from the use of single-strand oligonucleotides of different lengths (from 3 to 7), and the selection of a short sequence simplifies the calculation. Mutations were typically performed in the imino→amino direction, although for the triplex we also performed one simulation in the amino→imino direction to verify the reversibility of the process.

The starting coordinates of the triplex in the TI simulations corresponded to the last structures obtained after 1 ns of MD for the d(AGGGAAGGGA) triplex with the Hoogsteen cytosine at position 8 in either amino or imino form (see above). Unfortunately, no such good models of the single-strand TFO exist, and so it was necessary to generate reliable and unbiased models using MD simulations. For this purpose, a first structure of the TFO was generated from standard B-type parameters (42), neutralized by adding Na+, and solvated by 824 water molecules. The temperature of the system was then increased to 380 K over 100 ps, and the trajectory was continued for a further 50 ps at this temperature. The system was then cooled to 300 K over 100 ps, and further equilibrated for 20 ps. The final structure of this annealing process was used as the first starting configuration for TI calculations of the TFO. The second starting configuration was obtained by repeating the annealing procedure, now using the last structure of the first annealing process as the starting configuration. This process ensures that the mutations start from two representative configurations of a TFO in aqueous solution.
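The annealing cycle described above adds up to 270 ps (100 ps heating, 50 ps at 380 K, 100 ps cooling, 20 ps equilibration). A minimal sketch of the target-temperature profile follows. The linear ramps and the 300 K starting temperature are assumptions made for illustration; the text gives only the end points and the durations.

```python
def annealing_temperature(t_ps, start_k=300.0, high_k=380.0, final_k=300.0):
    """Target temperature (K) at time t_ps of the 270 ps annealing cycle.

    Assumptions (not stated in the text): linear ramps and a 300 K start.
    """
    if t_ps < 0.0 or t_ps > 270.0:
        raise ValueError("time must lie within the 270 ps cycle")
    if t_ps <= 100.0:  # heating stage, 100 ps
        return start_k + (high_k - start_k) * t_ps / 100.0
    if t_ps <= 150.0:  # hold at the high temperature, 50 ps
        return high_k
    if t_ps <= 250.0:  # cooling stage, 100 ps
        return high_k + (final_k - high_k) * (t_ps - 150.0) / 100.0
    return final_k     # final equilibration, 20 ps


if __name__ == "__main__":
    for t in (0, 50, 100, 150, 200, 250, 270):
        print(f"{t:4d} ps -> {annealing_temperature(t):.1f} K")
```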

The mutation from the imino to the amino tautomers of cytosine in the TFO and triplex was performed in 21 or 41 windows. Each window consisted of 10 ps of equilibration and 10 ps of collection, leading to a total of 420 or 820 ps for each mutation. The comparison of results obtained from different simulation times (and different starting configurations) allowed us to check the ability of MD simulations to sample properly triplex and TFO configurational space. Following the standard procedure in AMBER5.0 for TI-PME calculations, not only inter-group but also intra-group (non-bonded) and constraint contributions to the free energy were considered. Free energy changes during the different mutations were corrected by the free energy change occurring for the same mutation in the gas phase (computed from 420 ps trajectories using the same simulation protocol). Following this procedure, intra-group contributions to the free energy change during the mutation are removed and the final numbers correspond to the difference between the free energy of solvation (in water or in the duplex) of amino and imino cytosines. All TI calculations were performed using PME and periodic boundary conditions (except obviously TI calculations in the gas phase), and the same technical details for the MD trajectories explained above.
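The window bookkeeping above can be checked with a few lines of Python: 21 or 41 windows of 10 ps equilibration plus 10 ps collection give 420 or 820 ps. The sketch also shows how per-window averages of dV/dλ could be turned into a free energy. Evenly spaced λ values and the trapezoidal rule are illustrative assumptions; the text does not state which spacing or quadrature was used.

```python
def lambda_schedule(n_windows):
    """Evenly spaced coupling-parameter values from 0 to 1 (spacing assumed)."""
    if n_windows < 2:
        raise ValueError("need at least two windows")
    return [k / (n_windows - 1) for k in range(n_windows)]


def total_simulation_ps(n_windows, equilibration_ps=10.0, collection_ps=10.0):
    """Total simulated time when each window has equilibration plus collection."""
    return n_windows * (equilibration_ps + collection_ps)


def integrate_ti(lambdas, mean_dv_dlambda):
    """Trapezoidal estimate of the integral of <dV/dlambda> over lambda.

    The trapezoidal rule is an illustrative choice, not the documented one.
    """
    if len(lambdas) != len(mean_dv_dlambda):
        raise ValueError("one average per window is required")
    total = 0.0
    for k in range(len(lambdas) - 1):
        width = lambdas[k + 1] - lambdas[k]
        total += 0.5 * (mean_dv_dlambda[k] + mean_dv_dlambda[k + 1]) * width
    return total


if __name__ == "__main__":
    for n in (21, 41):
        print(n, "windows:", total_simulation_ps(n), "ps")
```

The per-window averages passed to `integrate_ti` would come from the collection stage of each window; the equilibration stage is discarded in this sketch.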

Computational details

All QM calculations were performed using the Gaussian-94 package (33). MD and TI calculations were carried out using the AMBER5.0 program (43).

RESULTS AND DISCUSSION

Gas phase calculations

B3LYP calculations show the exothermic character of the G-C Hoogsteen hydrogen bonding following either the imino or the amino pattern (Table 1). The interaction is quite strong, comparable to that of a canonical A-T pairing [-12.1 kcal/mol at the B3LYP/6-31G(d) level (44), and around -13 kcal/mol from experimental measurements (45)], showing the large versatility of the recognition patterns of nucleic acid bases. The partition of the dimerization energy into interaction and distortion terms demonstrates that Hoogsteen base pairing produces only a small distortion of the optimum geometry of the isolated bases.

Table 1. Interaction (Eint), distortion (Edist) and dimerization energies, enthalpies and free energies (ΔE, ΔH and ΔG) for the G·C(w) and G·C(i) pairings in the gas phase determined from B3LYP/6-31+G(d,p)//B3LYP/6-31G(d) calculations (see details in text)
Calculation   Eint             Edist   ΔE       ΔH      ΔG
G·C(w)        -11.87 (-12.9)   0.89    -10.98   -9.39   0.50
G·C(i)        -11.25 (-12.5)   0.74    -10.51   -8.87   1.34
Reference state for thermodynamic calculations is ideal gas at T = 298 K and P = 1 atm. Values in parentheses correspond to classical estimates obtained with the AMBER-95 force-field. All values are in kcal/mol.
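The partition described in Materials and Methods implies that the dimerization energy is the sum of the interaction and distortion terms. A short consistency check on the DFT values of Table 1 is sketched below; the dictionary layout and function names are illustrative.

```python
# DFT values from Table 1 (kcal/mol); keys are illustrative labels.
TABLE_1 = {
    "G.C(w)": {"e_int": -11.87, "e_dist": 0.89, "delta_e": -10.98},
    "G.C(i)": {"e_int": -11.25, "e_dist": 0.74, "delta_e": -10.51},
}


def dimerization_energy(e_int, e_dist):
    """Dimerization energy as interaction plus distortion."""
    return e_int + e_dist


def check_table(table, tolerance=0.005):
    """Return, per pairing, whether Eint + Edist matches the tabulated value."""
    return {
        name: abs(dimerization_energy(row["e_int"], row["e_dist"]) - row["delta_e"]) <= tolerance
        for name, row in table.items()
    }


if __name__ == "__main__":
    for name, ok in check_table(TABLE_1).items():
        print(name, "consistent" if ok else "inconsistent")
```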

Interestingly, B3LYP/6-31+G(d,p) calculations show that the wobble pairing is almost 1 kcal/mol more favorable than the imino one (Table 1). This result was also found at lower levels of DFT theory (28), and is also quite well reproduced by the AMBER force-field, which gives confidence in the reliability of this force-field. Note that the better binding of amino cytosines to guanine is not expected a priori, since the interaction of G with C(i) involves two hydrogen bonds while the pair (G-C)w is stabilized by only one (we should note that DFT geometry optimization and MD simulations demonstrated that the wobble pair with a single N4→N7 hydrogen bond is more stable than that with a bifurcated O6←N4→N7 hydrogen bond). The reason for this apparently contradictory result probably lies in the weakness of the O6(G):N4(C) hydrogen bond in the (G-C)i pair (O-N distance of 3.6 Å after geometry optimization in the gas phase).

The relative stability of the (G-C)i and (G-C)w dimers in the gas phase can be further examined by also considering the intrinsic difference in stability between the amino and imino tautomers of cytosine. A straightforward comparison of the absolute energies, enthalpies and free energies of the two pairs reveals that at both the B3LYP/6-31G(d) and B3LYP/6-31+G(d,p) levels the wobble pair is more stable than the imino pair by ~3-4 kcal/mol (Table 2). Another estimate can be obtained by combining dimerization energies determined at the B3LYP/6-31+G(d,p) level with the differences in stability between the amino and imino (with the N4-H cis to N3) tautomers of cytosine determined previously at the MP4/6-311++G(d,p) level (34). The high level composite (Table 2) confirms the pure B3LYP estimates and suggests that in the gas phase a (G-C)w Hoogsteen-like pair is more stable than the (G-C)i one by ~3-4 kcal/mol.

Table 2. Energy, enthalpy and free energy difference for the (G-C)i→(G-C)w conversion
Wavefunction        ΔE      ΔH      ΔG
B3LYP/6-31G(d)      -2.92   -3.17   -2.71
B3LYP/6-31+G(d,p)   -4.21   -4.45   -3.99
Composite (a)       -2.90   -3.32   -3.84
A negative sign means that the wobble pair is more stable than the imino pair. Reference state for thermodynamic calculations is ideal gas at P = 1 atm and T = 298 K. All values are in kcal/mol.
(a) These numbers were obtained by combining B3LYP/6-31+G(d,p) dimerization energies, enthalpies and free energies with MP4/6-311++G(d,p) estimates of the tautomerization energy, enthalpy and free energy of cytosine.
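The composite row can be related to Table 1 under one assumption: that the composite value equals the difference in dimerization terms (wobble minus imino) minus the tautomerization term for amino to imino cytosine. Under that assumed relation, the sketch below backs out the tautomerization terms implied by the two tables. The implied free energy comes out at 3.0 kcal/mol, the penalty quoted later in the text for the imino form able to pair with guanine; the implied energy and enthalpy are derived values, not numbers quoted by the authors.

```python
# Dimerization terms from Table 1 and the composite row of Table 2 (kcal/mol).
DIMERIZATION = {
    "w": {"E": -10.98, "H": -9.39, "G": 0.50},
    "i": {"E": -10.51, "H": -8.87, "G": 1.34},
}
COMPOSITE = {"E": -2.90, "H": -3.32, "G": -3.84}


def implied_tautomerization(quantity):
    """Tautomerization term implied by the assumed composite relation.

    Assumed relation: composite = [dim(w) - dim(i)] - tautomerization.
    """
    association_gap = DIMERIZATION["w"][quantity] - DIMERIZATION["i"][quantity]
    return association_gap - COMPOSITE[quantity]


if __name__ == "__main__":
    for quantity in ("E", "H", "G"):
        print(quantity, round(implied_tautomerization(quantity), 2))
```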

Molecular dynamics simulations

MD simulations for the two DNA sequences lead to stable trajectories, as noted in the small r.m.s. deviation (<1 Å) with respect to the average conformations obtained by averaging the configurations obtained in the last 500 ps of trajectories (Fig. 4). Analysis of the H-bond scheme during the two trajectories demonstrated that the N4(C)-N7(G) H-bond is maintained during almost the entire (>99%) trajectory, while the N4(C)-O6(G) is found (distance N-O <3.5 Å) during ~13% of the trajectory. These MD results confirm the pattern of wobble H-bonded interactions suggested by gas phase DFT calculations.
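The occupancies quoted above are fractions of trajectory frames in which a donor-acceptor distance falls below a cutoff. A minimal sketch of that count follows. The 3.5 Å cutoff is the one stated for the N4(C)-O6(G) contact; applying a pure distance criterion, with no angle test, is an assumption, and the input list is a hypothetical series of per-frame distances.

```python
def hbond_occupancy(distances_angstrom, cutoff=3.5):
    """Fraction of frames whose donor-acceptor distance is below the cutoff.

    Distance-only criterion (assumed); no hydrogen-bond angle is tested.
    """
    if not distances_angstrom:
        raise ValueError("at least one frame is required")
    hits = sum(1 for d in distances_angstrom if d < cutoff)
    return hits / len(distances_angstrom)


if __name__ == "__main__":
    # Hypothetical per-frame N4(C)-O6(G) distances in angstroms.
    example = [3.0, 3.6, 3.4, 4.0]
    print(hbond_occupancy(example))
```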


Figure 4. r.m.s. deviations (in Å) for the d(AGGGAAGGGA) and d(GAGAGAGAGA) triplexes with amino cytosines at Hoogsteen positions 3 and 8 [d(AGGGAAGGGA)], and 5 [d(GAGAGAGAGA)] with respect to canonical B-type triplex (B, red line), a canonical A-type triplex (A, green line), the structure obtained by averaging the last 500 ps of trajectory (Av., blue line), and the MD-averaged structure obtained (27) for triplexes containing only protonated [d(GAGAGAGAGA)] or protonated/imino [d(AGGGAAGGGA)] cytosines (*, black line). r.m.s. deviation values were computed for each ps and smoothed in 100 ps windows for display.
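The smoothing mentioned in the caption (values computed every ps and averaged in 100 ps windows) can be sketched as a running mean. Whether the authors used a running or a block average is not stated, so the running form below is an assumption; the input is a hypothetical list with one r.m.s. deviation per ps.

```python
def running_mean(values, window=100):
    """Running average over a fixed number of consecutive samples.

    Returns len(values) - window + 1 smoothed points.
    """
    if window <= 0 or window > len(values):
        raise ValueError("window must be between 1 and the number of samples")
    acc = sum(values[:window])
    smoothed = [acc / window]
    for k in range(window, len(values)):
        # Slide the window: add the new sample, drop the oldest one.
        acc += values[k] - values[k - window]
        smoothed.append(acc / window)
    return smoothed


if __name__ == "__main__":
    print(running_mean([1, 2, 3, 4, 5], window=3))
```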

The two triplexes are more similar to a standard B-type triplex than to an A-type one. Thus the r.m.s. deviations with respect to a canonical B-triplex are 1.4 and 1.5 Å for the d(AGGGAAGGGA) and d(GAGAGAGAGA) triplexes, while the r.m.s. deviations with respect to an A-type triplex are 2.1 and 2.0 Å, respectively. Structures sampled during current trajectories are reasonably close to those obtained previously considering only protonated [r.m.s. = 2.6 Å for the d(GAGAGAGAGA) sequence], and protonated/imino trios [r.m.s. = 1.4 Å for the d(AGGGAAGGGA) sequence]. The larger r.m.s. deviation found for the [d(GAGAGAGAGA)] triplex is not surprising, since the triplex obtained considering only protonated trios deviates from a canonical B-type triplex (r.m.s. deviation ~2.7 Å from ref. 27), reflecting the strong potential generated by protonated cytosines.

Analysis of the helical parameters of the MD-averaged structures shows that the general characteristics of the helices are not particularly dependent on the presence of d(G·C-C)w trios. There are, however, some changes in helical parameters reflecting subtle changes in structure. For instance, the average twist for the d(GAGAGAGAGA) triplex with a wobble trio at position 5 is ~30.5°, the X-disp is around -4.1 Å, and the rise is ~3.3 Å, which compare with those previously found (twist = 31.7°, X-disp = -3.3 Å and rise = 3.4 Å) for the same triplex where all Hoogsteen cytosines were protonated. For the d(AGGGAAGGGA) structure with d(G·C-C)w trios the average twist is ~29.5°, the X-disp is -5.1 Å and the rise is ~3.4 Å. These values compare with those obtained for the same triplex with d(G·C-C)i trios: 31.4° (twist), -1.2 Å (X-disp) and 3.4 Å (rise).

The sugar puckerings are for all the structures in the East and East-South regions (phase angles typically ~115-120°), consistent with a B-type puckering, even though a certain population of C4'-exo puckering is detected in the Hoogsteen strand. Finally, the average size of the three grooves is similar to that found in previous simulations of DNA triplexes (9,16). Thus, for the d(AGGGAAGGGA) triplex with wobble trios the widths of the grooves are 8.9 (minor-major), 12.3 (minor) and 15.1 Å (major-major), values which match to within 0.5 Å those found for the same triplex with imino trios (9). For the d(GAGAGAGAGA) triplex the replacement of a central d(G·C-C+) trio by a d(G·C-C)w one leads to small changes (~0.5 Å) in the width of the minor (11.8 Å) and minor-major (9.0 Å) grooves, but to a notable shrinking of the major-major groove, whose width decreases from 16.1 to 13.4 Å.

Analysis of sequence-dependent features of the two triplexes containing d(G·C-C)w trios reveals that few changes occur at the step with the wobble pairing. In fact, the most relevant changes are due to the displacement of the cytosine towards the minor-major groove (Fig. 5), which leads to a slight increase in the size of the minor-major groove at the site of the d(G·C-C)w trio. A certain change in sugar puckering in the Hoogsteen amino cytosine is also clear, as shown by an increase of ~20-30° in the phase angle, leading to puckerings which are clearly C2'-endo instead of C1'-exo. Besides those changes, no major distortions are seen at the steps with the wobble pairing compared with the rest of the helix. Finally, few changes in the solvation atmosphere of the triplexes are found upon introduction of d(G·C-C)w trios (data not shown).


Figure 5. Representative structures obtained during MD simulations of the d(G·C-C)w (in green) and d(G·C-C)i (in red) trios. Structures are displayed after superposition of the Watson-Crick G·C pairs.

In summary, ns MD simulations suggest that the structure of the triplex [with many contiguous d(G·C-C) trios] is consistent with the coexistence of protonated and neutral cytosines, whether the latter are in the imino or in the amino form. The introduction of amino Hoogsteen cytosines does not lead to any dramatic distortion in the structure of the helix, even though it introduces subtle changes in the helical structure.

Thermodynamic integration calculations

TI calculations were performed to determine the difference in `binding' of an imino and an amino cytosine placed in a DNA triplex. As noted in Materials and Methods, this requires performing two mutations, one in a single-strand oligonucleotide (the TFO) and the other in a triple helix. As noted above, these simulations were performed in the d(AGGGAAGGGA) triplex and in the d(TCCCT) single-strand oligonucleotide (Fig. 3).

Due to the need to account for intramolecular contributions to the free energy in PME-TI calculations (43), a notable noise is a priori expected in the simulation, which makes the final number susceptible to substantial numerical errors. As a consequence, we were concerned about the reliability of free energy estimates derived from MD-TI calculations. To this end, we repeated each mutation several times, using different simulation procedures or different starting configurations. The results obtained (see below) give strong confidence in the calculations.

For the mutation in the single-stranded DNA, three simulations were conducted, starting from two different configurations (Materials and Methods), and using both 420 and 820 ps simulations. After correction of the intra-group contribution the free energy differences were -6.0 (trajectory of 420 ps starting from the structure obtained after one annealing procedure), -5.9 (trajectory of 420 ps starting from the structure obtained after two annealing procedures) and -5.4 kcal/mol (trajectory of 820 ps starting from the structure obtained after two annealing procedures). The close similarity between the free energy estimates obtained from trajectories of very different length (420 and 820 ps) and starting from different configurations, and the lack of discontinuities in the free energy profiles (Fig. 6), are remarkable.


Figure 6. Free energy profile for the mutation between amino [C(a); λ = 0] and imino [C(i); λ = 1] tautomers of cytosine for the TFO and triplex. Profiles are obtained after averaging six independent estimates for each mutation (equilibration and collection estimates for three trajectories in each case), and after correcting the intramolecular term. Standard errors in the average profiles are displayed (in parentheses). See text for details on the simulation.

The free energy change during the imino→amino mutation for a single-stranded DNA corresponds (in an approximate way) to the change in free energy of hydration between imino and amino tautomers of cytosine. The average ΔΔGhyd obtained in our MD-TI calculations is -5.7 kcal/mol favoring the hydration of the amino tautomer. This number agrees well with MC-FEP and MST-SCRF estimates obtained for the isolated nucleobase (between -4 and -5 kcal/mol in ref. 34). In fact, if the imino→amino mutation is performed using a single oligonucleotide (820 ps MD-TI simulation) a ΔΔGhyd of -4.5 kcal/mol is found, matching our previous estimates, and supporting the quality of the present simulations.

For the mutation in the triple helix the main concern is whether or not the conformational changes occurring upon imino→amino mutation are well captured during the MD-TI calculation. Inspection of structures strongly suggests that a 420 ps MD-TI simulation is able to properly reproduce the small conformational change in the triplex due to the change of cytosine from the imino to the amino form. To further assess this point and verify the quality of the free energy estimates, we repeated the mutation three times. The first was done for 420 ps changing from imino to amino cytosines, the second was performed in the same direction for 820 ps, and the last was carried out for 420 ps in the amino→imino direction, starting from the equilibrated structure of the d(AGGGAAGGGA) triplex with amino cytosines (see above). The three mutations yield free energies (after correction of intra-group contributions) of -0.2 (trajectory of 420 ps starting from the imino structure), -2.1 (trajectory of 420 ps starting from the amino structure), and -1.6 kcal/mol (trajectory of 820 ps starting from the imino tautomer). Again, the similarity between the free energy estimates obtained from trajectories of very different length (420 and 820 ps) and starting from completely different configurations, and the lack of discontinuities in the free energy profiles (Fig. 6), are remarkable.

In summary, despite the technical problems of this type of simulation it is clear that the simulation protocol used is able to provide converged results. Thus, statistical analysis of the results in Figure 6 shows that the mean estimates of free energy change can be obtained with standard deviations of just 0.2 (TFO) and 0.8 (triplex) kcal/mol, and standard errors of just 0.1 and 0.5 kcal/mol, respectively. It is then possible to subtract the free energy changes in Figure 6 to obtain the preferential `binding' of a Hoogsteen imino cytosine over a Hoogsteen amino cytosine. Combining the different independent estimates of free energy in Figure 6, we obtain a differential free energy of 4.5 kcal/mol (standard deviation and error are 0.8 and 0.3 kcal/mol) favoring the binding of an imino cytosine (ΔΔGbinding). This notably large difference is quite consistently reproduced in the simulations, and seems to be outside the expected statistical errors. Accordingly, our results strongly suggest that the Hoogsteen imino cytosine is much more stabilized by the specific structure of the triplex than the Hoogsteen amino cytosine.
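The subtraction described above can be reproduced approximately from the three trajectory values quoted for each mutation. The sketch below averages those values and takes the triplex-minus-TFO difference. It uses only the three numbers per system given in the text, whereas the authors averaged six estimates per mutation, so the means come out close to the reported ones but the spread need not match the quoted standard deviations.

```python
from statistics import mean

# Imino-to-amino free energies quoted in the text (kcal/mol), one per trajectory.
TFO_ESTIMATES = [-6.0, -5.9, -5.4]
TRIPLEX_ESTIMATES = [-0.2, -2.1, -1.6]


def differential_binding(triplex_estimates, tfo_estimates):
    """Triplex-minus-TFO difference of the mean mutation free energies.

    A positive value favors binding of the imino cytosine.
    """
    return mean(triplex_estimates) - mean(tfo_estimates)


if __name__ == "__main__":
    print("mean TFO:", round(mean(TFO_ESTIMATES), 2))
    print("mean triplex:", round(mean(TRIPLEX_ESTIMATES), 2))
    print("differential binding:",
          round(differential_binding(TRIPLEX_ESTIMATES, TFO_ESTIMATES), 2))
```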

The existence of the (G-C)i pairs implies that the cytosine is in a minor tautomeric form. This means that the preferential binding free energies should be corrected for the `intrinsic' stability of the imino and amino tautomers. Experimental measurements (41) suggest that in the gas phase the amino form is ~1.4 kcal/mol more stable than the most stable imino form. However, as noted previously (34), the most stable imino form is that with the N4-H trans to N3, and not the imino form which can form the Hoogsteen-like pairing (that with the N4-H cis to N3). High level ab initio data (34) suggest that the imino form of cytosine able to form (G-C)i pairs is 3.0 kcal/mol less stable than the amino form [note that the ab initio procedure used in (34) reproduced the experimental free energy of imino/amino tautomerization with an error of just 0.1 kcal/mol]. Combining these results with the MD-TI calculations we conclude that triplexes containing d(G·C-C)i trios are ~1.5 kcal/mol more stable than those containing d(G·C-C)w trios [at least in triplexes of sequence similar to that considered here (the only ones where the existence of neutral cytosines is expected) at neutral pH and low ionic strength]. That is to say, even though a certain percentage of wobble pairing is possible, the d(G·C-C)i trios are expected to be the major species in neutral d(G·C-C) steps of DNA triple helices.
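The final combination is a simple subtraction: the 4.5 kcal/mol binding preference for the imino cytosine minus the 3.0 kcal/mol tautomerization penalty leaves ~1.5 kcal/mol in favor of the imino trio. To give a feeling for what "a certain percentage of wobble pairing" could mean, the sketch below converts that difference into a two-state Boltzmann population at 298 K. The two-state treatment is an illustration added here, not an analysis performed by the authors; it gives a wobble fraction of roughly 7%.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol*K)


def net_imino_preference(binding_preference=4.5, tautomer_penalty=3.0):
    """Net free energy (kcal/mol) by which the imino trio is favored."""
    return binding_preference - tautomer_penalty


def wobble_fraction(delta_g_kcal, temperature_k=298.0):
    """Two-state Boltzmann fraction of the less stable (wobble) form.

    Illustrative assumption: only the imino and wobble trios compete.
    """
    ratio = math.exp(-delta_g_kcal / (GAS_CONSTANT_KCAL * temperature_k))
    return ratio / (1.0 + ratio)


if __name__ == "__main__":
    dg = net_imino_preference()
    print("net preference:", dg, "kcal/mol")
    print("wobble fraction:", round(wobble_fraction(dg), 3))
```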

The greater stability of triplexes with d(G·C-C)i trios found in the MD-TI calculations seems surprising, since DFT (and force-field) calculations in the gas phase demonstrate (Table 1) that there is no preference for the (G-C)i pairing over the (G-C)w pairing in terms of H-bonding. Analysis of Table 3 confirms that this is also true for triplexes in solution. That is to say, contrary to simple reasoning based on the counting of H-bonds (Fig. 2), the (G-C)i pairing has no H-bonding advantage over the (G-C)w pairing.

Table 3. Stacking and H-bond energies (in kcal/mol) for the d(AGGGAAGGGA) triplexes with amino or imino Hoogsteen cytosines at positions 3 and 8
Interaction   Species   Triplex   Trio(3)   Trio(8)
H-bond        amino     -388.9    -167.4    -166.2
H-bond        imino     -389.4    -169.2    -165.6
Stacking      amino     -216.5     -61.8     -63.3
Stacking      imino     -241.6     -73.0     -72.8
The energies for the entire systems were always computed neglecting the d(A·T-T) trios at the 5′ and 3′ ends. Stacking and H-bond energies were computed considering the three strands. The reduced systems [Trio(3) and Trio(8)] were defined as the d(G·C-C) trio at position 3 or 8 (where the C is either amino or imino) and the two flanking d(G·C-C+) trios.

Interestingly, and surprisingly, Table 3 shows that there is a dramatic difference (~25 kcal/mol) in stacking energy between the triplex with two d(G·C-C)i steps and that with two d(G·C-C)w steps. Analysis of the stacking energy around positions 3 and 8 demonstrates that ~85% of this difference stems from changes in the stacking in the two GGG sequences related to the tautomeric state of the Hoogsteen cytosines at positions 3 and 8. Given the magnitude of the difference in stacking, we performed QM calculations using our GMIPp method, recently parametrized at the 6-31G(d) level (46-51), to verify the accuracy of the AMBER estimates. Calculations were performed on the C(i)-C+ and C(a)-C+ stacked dimers (in the triplex geometry). For these systems the classical stacking energies determined with the AMBER force-field are +2.1 (amino) and -6.6 (imino) kcal/mol, which compare well with the GMIPp values (+2.6 and -5.3 kcal/mol, respectively). The results thus support the quality of AMBER5.0 in reproducing stacking interactions in this type of system, in agreement with findings reported by other authors (52).
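The differences discussed here can be recomputed directly from the rounded values in Table 3 and from the dimer energies quoted above. The sketch below is only a bookkeeping aid; the variable and function names are hypothetical, and since it works from rounded published numbers, small deviations from the percentages quoted in the text are to be expected.

```python
# Table 3 energies in kcal/mol, ordered as (Triplex, Trio(3), Trio(8)).
TABLE3 = {
    "H-bond": {
        "amino": (-388.9, -167.4, -166.2),
        "imino": (-389.4, -169.2, -165.6),
    },
    "Stacking": {
        "amino": (-216.5, -61.8, -63.3),
        "imino": (-241.6, -73.0, -72.8),
    },
}

# Stacked-dimer energies (kcal/mol) for C(a)-C+ and C(i)-C+ quoted in the text.
DIMER = {
    "AMBER": {"amino": 2.1, "imino": -6.6},
    "GMIPp": {"amino": 2.6, "imino": -5.3},
}


def imino_minus_amino(term: str) -> tuple:
    """Imino minus amino energy for each column; negative favors imino."""
    amino = TABLE3[term]["amino"]
    imino = TABLE3[term]["imino"]
    return tuple(round(i - a, 1) for i, a in zip(imino, amino))


stack_diff = imino_minus_amino("Stacking")   # (-25.1, -11.2, -9.5)
hbond_diff = imino_minus_amino("H-bond")     # (-0.5, -1.8, 0.6)

# Share of the whole-triplex stacking difference located around positions 3 and 8.
local_share = (stack_diff[1] + stack_diff[2]) / stack_diff[0]

# Imino minus amino stacking preference in the isolated dimers, per method.
dimer_diff = {
    method: round(e["imino"] - e["amino"], 1) for method, e in DIMER.items()
}

print("stacking difference:", stack_diff)
print("H-bond difference:", hbond_diff)
print(f"share around positions 3 and 8: {local_share:.0%}")
print("dimer preference for imino:", dimer_diff)
```

From the rounded table entries the two reduced systems account for a little over 80% of the ~25 kcal/mol stacking difference, of the same order as the ~85% stated in the text, while the H-bond term differs by less than 1 kcal/mol for the whole triplex. The AMBER and GMIPp dimer preferences for the imino form agree to within about 1 kcal/mol.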

Partition of the total stacking energy into van der Waals and electrostatic terms demonstrated that dispersion interactions are ~2 kcal/mol more favorable for the stacking of an imino cytosine between two protonated cytosines than for that of a neutral amino cytosine. The reason for this difference probably lies in the better overlap of the Hoogsteen cytosines in a d(G·C-C)i trio compared with the d(G·C-C)w one (Fig. 7). However, the most important reason for the stacking preference of imino cytosines lies in the electrostatic contribution, which favors the presence of imino cytosines in the Hoogsteen position of neutral d(G·C-C) trios by ~8-9 kcal/mol. Such a large difference is due to a better fit of the charge distribution of the imino cytosine to the strong potential generated by the two flanking protonated cytosines.


Figure 7. Detail of the average structures obtained in trajectories of triplexes containing imino (top) and amino (bottom) cytosines in the Hoogsteen position. For the sake of clarity only the imino/amino Hoogsteen cytosine, the corresponding guanine (both in red) and the Hoogsteen protonated cytosines at 3′ and 5′ (in green) are displayed.

In summary, MD-TI simulations strongly suggest that the d(G·C-C)i trios lead to a larger stabilization of the triplex than the d(G·C-C)w, and are more likely to exist in triplexes where not all the d(G·C-C) steps are protonated. Analysis of MD-averaged structures strongly suggests that the stacking term is responsible for the greater stability of the d(G·C-C)i trios.

CONCLUSIONS

Quantum mechanical calculations have shown that the wobble Hoogsteen pairing between guanine and amino cytosine is slightly more stable in the gas phase than that between guanine and imino cytosine. This result combined with high level QM estimates of the amino↔imino tautomeric equilibrium suggests that in the gas phase the neutral Hoogsteen d(G-C)w pair will be more stable than the d(G-C)i one.

MD simulations on the ns time scale have demonstrated that the d(G·C-C)w trio is compatible with the existence of triple helices for at least two sequences, d(GAGAGAGAGA) and d(AGGGAAGGGA), provided there is a large content of d(G·C-C+) trios in the triplex.

MD calculations, coupled to TI techniques and QM calculations, strongly suggest that the triplex d(AGGGAAGGGA) with a Hoogsteen imino cytosine at position 8 is slightly more stable than that with a neutral amino cytosine at the same position. A careful analysis of the data indicates that the better stacking of imino cytosines is responsible for the greater stability of d(G·C-C)i trios compared with the d(G·C-C)w ones.

All the results support the idea that a small portion of neutral d(G·C-C)i [and perhaps also d(G·C-C)w] trios can coexist with d(G·C-C+) trios in d(G·C-C)-based triplexes, at least in sequences rich in poly(G) fragments at neutral pH and low ionic strength.

ACKNOWLEDGEMENTS

We thank Professor C. Laughton, Dr C. González and Dr J. L. Asensio for many discussions and ideas. We also thank Dr T. Darden for help in free energy calculations. This work has been supported by the Centre de Supercomputació de Catalunya (CESCA, Molecular Recognition Project), and by the Spanish DGICYT (Projects PB96-1005 and PB97-0908). R.S. thanks the CIRIT for a predoctoral fellowship. This is a contribution of the Centre Especial de Recerca en Química Teòrica.

REFERENCES

1. Cooney,M., Czernuszewicz,G., Postel,E.H., Flint,S.J. and Hogan,M.E. (1988) Science, 241, 456-459.

2. Hélène,C. and Toulme,J.J. (1990) Biochim. Biophys. Acta, 1049, 99-125.

3. Strobel,S.A. and Dervan,P. (1992) Methods Enzymol., 216, 309-321.

4. Grigoriev,M., Praseuth,D., Guieysee,A.L., Robin,P., Thuong,N.T., Hélène,C. and Harel-Bellan,A. (1993) Proc. Natl Acad. Sci. USA, 90, 3501-3505.

5. Sun,J.S. and Hélène,C. (1993) Curr. Opin. Struct. Biol., 3, 345-356.

6. Soyfer,V.N. and Potaman,V.N. (1996) Triple-Helical Nucleic Acids. Springer, New York.

7. Macaya,R.F., Schultze,P. and Feigon,J. (1992) J. Am. Chem. Soc., 114, 781-783.

8. Howard,F.B., Miles,H.T., Liu,K., Frazier,J., Raghunathan,G. and Sasisekharan,V. (1993) Biochemistry, 31, 10671-10677.

9. Radhakrishnan,I. and Patel,D.J. (1994) Biochemistry, 33, 11405-11416.

10. Radhakrishnan,I. and Patel,D.J. (1994) Structure, 2, 17-32.

11. Bornet,O. and Lancelot,G. (1995) J. Biomol. Struct. Dyn., 12, 803-814.

12. Wang,E., Koshlap,K.M., Gillespie,P., Dervan,P.B. and Feigon,J. (1996) J. Mol. Biol., 257, 1052-1069.

13. Koshlap,K.M., Schultze,P., Brunar,H., Dervan,P.B. and Feigon,J. (1997) Biochemistry, 36, 2659-2668.

14. Arnott,S., Bond,P.J., Selsing,E. and Smith,P.J.C. (1976) Nucleic Acids Res., 3, 2459-2470.

15. Betts,L., Josey,J.A., Veal,J.M. and Jordan,S.R. (1995) Science, 270, 1838-1841.

16. Asensio,J.L., Dhesai,J., Bergquist,S., Brown,T. and Lane,A.N. (1998) J. Mol. Biol., 275, 811-822.

17. Asensio,J.L., Brown,T. and Lane,A.N. (1999) Structure, 7, 1-11.

18. Asensio,J.L., Brown,T. and Lane,A.N. (1998) Nucleic Acids Res., 26, 3677-3686.

19. Asensio,J.L., Bosangh,H.S., Jenkins,T.C. and Lane,A.N. (1998) Biochemistry, 37, 15188-15198.

20. Lee,J.S., Johnson,D.A. and Moogan,A.R. (1979) Nucleic Acids Res., 6, 3073-3091.

21. Rajagopal,P. and Feigon,J. (1989) Biochemistry, 28, 7859-7870.

22. Rajagopal,P. and Feigon,J. (1989) Nature, 339, 637-640.

23. Völker,J. and Klump,H.H. (1994) Biochemistry, 33, 13502-13508.

24. Wittung,P., Nielsen,P. and Nordén,B. (1997) Biochemistry, 36, 7973-7979.

25. Leitner,D., Schröder,W. and Weisz,K. (1998) J. Am. Chem. Soc., 120, 7123-7124.

26. Plum,G.E. and Breslauer,K.J. (1995) J. Mol. Biol., 248, 679-695.

27. Soliva,R., Laughton,C.A., Luque,F.J. and Orozco,M. (1998) J. Am. Chem. Soc., 120, 11226-11233.

28. Güimil,R., Bachi,A., Eritja,R., Luque,F.J. and Orozco,M. (1998) Bioorg. Med. Chem. Lett., 8, 3011-3016.

29. Lee,C., Yang,W. and Parr,R.G. (1988) Phys. Rev. B, 37, 785-802.

30. Becke,A.D. (1993) J. Chem. Phys., 98, 5648-5652.

31. Stephens,P.J., Devlin,F.J., Chabalowski,C.F. and Frisch,M.J. (1994) J. Phys. Chem., 98, 11623-11627.

32. Boys,S.F. and Bernardi,F. (1970) Mol. Phys., 19, 553-574.

33. Frisch,M.J., Trucks,G.W., Schlegel,H.B., Gill,P.M.W., Johnson,B.G., Robb,M.A., Cheeseman,J.R., Keith,T.A., Petersson,G.A., Montgomery,J.A., Raghavachari,K., Al-Laham,M.A., Zakrzewski,V.G., Ortiz,J.V., Foresman,J.B., Cioslowski,J., Stefanov,B.B., Nanayakkara,A., Challacombe,M., Peng,C.Y., Ayala,P.Y., Chen,W., Wong,M.W., Andres,J.L., Replogle,E.S., Gomperts,R., Martin,R.L., Fox,D.J., Binkley,J.S., Defrees,D.J., Baker,J., Stewart,J.P., Head-Gordon,M., Gonzalez,C. and Pople,J.A. (1995) GAUSSIAN94. Gaussian Inc., Pittsburgh, PA.

34. Colominas,C., Luque,F.J. and Orozco,M. (1996) J. Am. Chem. Soc., 118, 6811-6821.

35. Jorgensen,W.L., Chandrasekhar,J., Madura,J., Impey,R.W. and Klein,M.L. (1983) J. Chem. Phys., 79, 926-935.

36. Shields,G., Laughton,C.A. and Orozco,M. (1997) J. Am. Chem. Soc., 119, 7463-7469.

37. Darden,T.E., York,D. and Pedersen,L. (1993) J. Chem. Phys., 98, 10089-10092.

38. Essmann,U., Perera,L., Berkowitz,M.L., Darden,T., Lee,H. and Pedersen,L.G. (1995) J. Chem. Phys., 103, 8577-8593.

39. Ryckaert,J.P., Ciccotti,G. and Berendsen,H.J.C (1977) J. Comp. Phys., 23, 327-342.

40. Cornell,W.D., Cieplak,P., Bayly,C.I., Gould,I.R., Merz,K., Ferguson,D.M., Spellmeyer,D.C., Fox,T., Caldwell,J.W. and Kollman,P. (1995) J. Am. Chem. Soc., 117, 5179-5197.

41. Dreyfus,M., Bensaude,O., Dodin,G. and Dubois,J.E. (1976) J. Am. Chem. Soc., 98, 6338-6349.

42. Biosym Co. (1998) Biopolymer Module of INSIGHTII, San Diego, CA.

43. Case,D.A., Pearlman,D.A., Caldwell,J.W., Cheatham,T.E., Ross,W.S., Simmerling,C.L., Darden,T.A., Merz,K.M., Stanton,R.V., Cheng,A.L., Vincent,J.J., Crowley,M., Ferguson,D.M., Radmer,R.J., Seibel,G.L., Singh,U.C., Weiner,P.K. and Kollman,P.A. (1997) AMBER5.0, modified by Darden,T.A. (1999) University of California, San Francisco, CA.

44. Güimil,R., Ferrer,E., Macías,M.J., Eritja,R. and Orozco,M. (1999) Nucleic Acids Res., 27, 1991-1998.

45. Yanson,I.K., Teplitsky,A.B. and Sukhodub,L.F. (1979) Biopolymers, 18, 1149-1156.

46. Orozco,M. and Luque,F.J. (1993) J. Comp. Chem., 14, 587-603.

47. Alhambra,C., Luque,F.J. and Orozco,M. (1995) J. Phys. Chem., 99, 3084-3092.

48. Luque,F.J. and Orozco,M. (1998) J. Comp. Chem., 19, 866-881.

49. Orozco,M., Luque,F.J. (1996) In Murray,J. and Sen,K. (eds), Molecular Electrostatic Potentials: Concepts and Applications. Theoretical and Computational Chemistry, Vol 3. Elsevier, Amsterdam, The Netherlands.

50. Cubero,E., Luque,F.J. and Orozco,M. (1998) Proc. Natl Acad. Sci. USA, 95, 5976-5980.

51. Hernández,B., Luque,F.J. and Orozco,M. (1999) J. Comp. Chem., in press.

52. Hobza,P., Kabelac,M., Sponer,J., Mejzlik,P. and Vondrasek,J. (1997) J. Comp. Chem., 18, 1136-1150.


*To whom correspondence should be addressed. Tel/Fax: +34 93 4021219; Email: modesto@luz.bq.ub.es


This page is run by Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, as part of the OUP Journals
Comments and feedback: jnl.info{at}oup.co.uk
Last modification: 14 May 1999
Copyright©Oxford University Press, 1999.
