Crystal structure of 2'-O-Me(CGCGCG)2, an RNA duplex at 1.30 Å resolution. Hydration pattern of 2'-O-methylated RNA
Dorota A. Adamiak*, Jan Milecki1, Mariusz Popenda, Ryszard W. Adamiak, Zbigniew Dauter2 and Wojciech R. Rypniewski2
Institute of Bioorganic Chemistry, Polish Academy of Sciences, Noskowskiego 12/14, 61-704 Poznan, Poland, 1Faculty of Chemistry, Adam Mickiewicz University, Poznan, Poland and 2European Molecular Biology Laboratory, c/o DESY, Hamburg, Germany
Received July 14, 1997; Revised and Accepted October 6, 1997
NDB coordinates ARFS26
ABSTRACT
The molecular and crystal structure of 2'-O-Me(CGCGCG)2 has been determined using synchrotron radiation at near-atomic resolution (1.30 Å), the highest resolution to date in the RNA field. The crystal structure is a half-turn A-type helix with some helical parameters deviating from canonical A-RNA, such as low base pair rise, elevated helical twist and inclination angles. In CG steps, inter-strand guanines are parallel while cytosines are not parallel. In GC steps this motif is reversed. This type of regularity is not seen in other RNA crystal structures. The structure includes 44 water molecules and two hydrated Mg2+ ions, one of which lies exactly on the crystallographic 2-fold axis. There are distinct patterns of hydration in the major and the minor grooves. The major groove is stabilised by water clusters consisting of fused five- and six-membered rings. The minor groove contains only a single row of water molecules; each water bridges either two self-parallel cytosines or two self-parallel guanines by a pair of hydrogen bonds. The structure provides the first view of the hydration scheme of a 2'-O-methylated RNA duplex.
INTRODUCTION
RNA forms a wide range of functionally important tertiary structural domains containing both single- and double-stranded regions such as hairpins, bulge loop duplexes, pseudoknots or hammerheads. The tendency to form double-helical regions plays a crucial role in the folding of these complex structures. Double-stranded RNA helices exist principally in the right-handed A-form. Averaged helical parameters for this canonical form of RNA have been deduced from fibre diffraction data (1). Since then, single crystal analyses of dinucleotide monophosphates (2,3) and refined tRNA structures (4,5) have been reported, showing more details of the local effects of base sequence. Thanks to recent advances in chemical (6) and enzymatic (7) oligoribonucleotide synthesis the number of X-ray RNA duplex structures is increasing. However, to our knowledge, only a dozen duplex structures have been reported to date. The first was r[U(UA)6A]2, at 2.25 Å resolution, with two unexpected kinks in the helix (8). Other duplex structures (9-16) usually form approximately one-turn helices. The resolution of these structures, some of which contain more than one mismatch, varies from 1.6 to 2.6 Å. Most recently, the structures of a 160 nt long domain of a class I intron at 2.8 Å (17) and of hammerhead ribozymes (18,19) have been reported.
Our continued interest in RNA duplexes containing alternating CG base pairs was prompted by the finding that poly[r(CG)] (20) and r(CGCGCG)2 (21) are able to form left-handed RNA double helices named Z-RNA. This intrinsic property of (CG)n duplexes is interesting in view of the structural properties of RNA and RNA hydration. The latter phenomenon is known to govern helicity reversal processes (20,21). Recently, it was found that the A → Z transition of (CG)n duplexes is also promoted by high pressure (22). In contrast, 2'-O-Me(CGCGCG)2 does not undergo helicity reversal under similar experimental conditions and remains in the A-form (23). This points to the influence of 2'-O-methylation on the ability of RNA duplexes containing alternating CG base pairs to undergo such a conformational reversal.
To date, there is no X-ray RNA structure containing alternating CG base pairs which would allow analysis of both CG and GC steps. Two high resolution structures, r(UUCGCG)2 (24) and r(CCCCGGGG)2 (25), solved at 1.40 and 1.46 Å, respectively, have been reported recently. Although they contain CG base pairs, the duplex structures formed do not allow an analysis of both CG and GC steps. The r(CCCCGGGG)2 duplex (25) has only one CG step and no GC steps. In the crystal of the UUCGCG hexamer, a duplex structure with four CG base pairs and two overhanging UU base pairs is formed (24). Only the GC step lies in the interior of the structure, while the two CG steps contain terminal guanosine residues. Because crystal symmetry relates the two UUCGCG strands, the structure contains only one CG step and one GC step, with a crystallographic 2-fold axis lying in the middle of the GC step, which means that only one half of the GC step is unique.
Unfortunately, our extensive attempts to get suitably diffracting monocrystals of r(CGCGCG)2 duplex have been unsuccessful.
In this work we present the crystal structure of the 2'-O-Me(CGCGCG)2 duplex at 1.30 Å resolution. The near-atomic resolution of the X-ray data allowed very accurate structure determination with anisotropic thermal parameters and detailed analysis including the hydration scheme and magnesium binding. The overall structure is an A-RNA helix, but with certain structural features which deviate from the canonical form. The results obtained also provide an insight into the effect of 2'-O-methylation on the hydration of an RNA duplex.
In the accompanying paper we described the NMR structure of r(CGCGCG)2 and 2'-O-Me(CGCGCG)2 under low salt conditions. Surprisingly, the two right-handed duplex structures are similar, despite 2'-O-methylation, with an average r.m.s.d. of 1.0 Å. This suggests that it is the intrinsic properties of alternating CG base pairs that govern both RNA duplex structures. The data allow for comparison of the structure of a 2'-O-methylated RNA duplex in the crystalline state and solution. We hope that the results, for the 2'-O-methylated analogue, will bring us closer to the understanding of the structure of native (CG)n sequences.
MATERIALS AND METHODS
Oligoribonucleotide crystals
Hexamer 2'-O-Me(CGCGCG) was prepared by automated solid-phase synthesis using phosphoramidite chemistry (26). Duplex crystals were grown at 20°C by hanging drop/vapour diffusion. After an extensive search several monocrystals were obtained under the following conditions: 5 mg/ml of RNA in 50 mM HEPES buffer pH 7.5, 15 mM MgCl2, 1 mM spermine tetrahydrochloride and 30-40% 2-methyl-2,4-pentanediol as precipitating agent. No crystals of the respective octamer could be obtained under these or similar conditions.
Crystallographic data collection and processing
X-ray diffraction data were collected from a single crystal on the EMBL BW7B wiggler beam line at the DORIS storage ring, DESY, Hamburg, with a Mar Research imaging plate scanner. The crystal was mounted with the long c-axis along the goniometer spindle axis. Due to the highly elongated unit cell this orientation was essential to avoid overlaps of diffraction intensities. Three data sets were collected: at long, medium and short exposures, to record intensities at high, medium and low resolution. The intensities were integrated using the program DENZO and scaled using the program SCALEPACK (27). Outliers were rejected based on the chi-square test implemented in SCALEPACK. The post-refinement option was used to refine the cell parameters. The X-ray data are summarised in Table 1.
Structure solution and refinement
The solvent content of the crystal was calculated to be 47% assuming an RNA density of 1.7 g/cm3 (28) and one duplex per asymmetric unit. The structure was solved by molecular replacement as implemented in the program AMORE (29) from the CCP4 program suite (30). The solution was obtained using residues U6 to A11 of chain A and U4 to A9 of chain B from the U(UA)6A duplex RNA structure (9,10; PDB code 1RNA) as the starting model, which corresponded to the (UA)3 core of the duplex. No further editing was done on the starting model at this stage. The rotation function was calculated using terms between 8 and 2.5 Å, with a Patterson search radius of 12 Å. The 50 highest peaks of the rotation function did not show any clear candidates for the correct solution. The correlation coefficient decreased smoothly from 0.40 to 0.34 for the first 19 peaks. Further peaks had a correlation coefficient of ~0.2. The translation function was calculated for each of the first 20 peaks in the rotation function. It was not known at this stage if the space group was P6122 or P6522 and the translation function was calculated for both space groups. The best solution was obtained for P6122, with correlation coefficient 0.67. The second highest peak had correlation coefficient 0.54 and the average value was ~0.5. After rigid body refinement the correlation coefficient for the highest peak was 0.74, for the second highest peak 0.58 and the average value was still ~0.5. The model was positioned in the unit cell according to the highest peak and (3Fo - 2Fc) and (Fo - Fc) difference maps were inspected. It became clear that this was the correct solution, with the (3Fo - 2Fc) map showing good overall agreement with the model although considerable deviations were observed for the terminal base pairs. The difference map (Fo - Fc) showed most of the 2'-O-methyl groups and exo-amino groups of C and G, not included in the model. There were no bad intermolecular contacts.
The structure was refined by stereochemically restrained least-squares minimisation as implemented in the program SHELXL93 (31). The integrated diffraction intensities between 8 and 1.30 Å were used in the refinement, rather than the structure factor amplitudes. The geometric restraints were derived from the standard dictionary used in the CCP4 program suite (30). Planarity restraints were imposed on the guanine and the cytosine rings, as well as restraints on bond lengths and bond angles. In the later stages of refinement the bond angle restraints were removed. The initial model was modified to reflect the correct chemistry and the non-hydrogen atoms that were initially absent in the model were easily found in the electron density maps. Cycles of least-squares refinement were interspersed with rounds of manual rebuilding, based on (3Fo - 2Fc) and (Fo - Fc) maps, using an Evans and Sutherland ESV graphics station and the program FRODO (32). Initially only non-hydrogen atoms were included in the model and isotropic temperature factors were refined. When hydrogen atoms became visible in the (Fo - Fc) map they were included in the model and anisotropic temperature factors were refined. Solvent molecules were inserted manually, based on examination of electron density maps. Refinement was terminated when it was felt that no further significant improvement in the model could be achieved. The final R-factor (Σ||Fo| - |Fc||/Σ|Fo|) was 0.175. The refinement was performed using the conjugate gradient algorithm. After the refinement was completed, one additional cycle of minimisation was executed using the full-matrix least-squares method in order to obtain direct estimates of errors in atomic positions and temperature factors. To decrease the size of the computation, the model was divided into three blocks, two containing one RNA strand each and one block for the solvent atoms. All restraints and shift damping were removed in that cycle. Helical parameters were calculated (33) using program CURVES 5.11. The SGI Indigo2 workstation was used for visualisation applying the InsightII/Biopolymer software package (MSI).
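As a minimal illustration of the R-factor definition quoted above, the following Python sketch evaluates Σ||Fo| - |Fc||/Σ|Fo| for paired lists of amplitudes. The function name `r_factor` and the sample amplitudes are hypothetical and are not data from this study; the sketch also assumes that Fo and Fc have already been placed on a common scale. It is not the SHELXL93 procedure, which refines against intensities.

```python
def r_factor(f_obs, f_calc):
    """Conventional crystallographic R-factor.

    R = sum(||Fo| - |Fc||) / sum(|Fo|), taken over all reflections.
    Assumes f_obs and f_calc are paired and on the same scale.
    """
    if len(f_obs) != len(f_calc):
        raise ValueError("f_obs and f_calc must have the same length")
    numerator = sum(abs(abs(fo) - abs(fc)) for fo, fc in zip(f_obs, f_calc))
    denominator = sum(abs(fo) for fo in f_obs)
    if denominator == 0:
        raise ValueError("sum of observed amplitudes is zero")
    return numerator / denominator


# Hypothetical amplitudes, for illustration only.
example_f_obs = [100.0, 50.0, 25.0]
example_f_calc = [90.0, 55.0, 30.0]
print(f"R = {r_factor(example_f_obs, example_f_calc):.3f}")
```

A value of 0.175, as reported here, corresponds to an average relative disagreement of 17.5% between observed and calculated amplitudes.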
aRmerge = Σ|Ii - <I>|/Σ<I>, where Ii is an individual intensity measurement, and <I> is the average intensity for this reflection with summation over all the data.
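The Rmerge definition in the footnote above can be sketched in Python as follows. The function name `r_merge`, the dictionary layout (reflection index mapped to a list of repeated measurements) and the sample intensities are assumptions made for illustration. The sketch also assumes the common convention that <I> enters the denominator once per measurement; SCALEPACK's own handling of weights and rejections is not reproduced.

```python
from statistics import fmean


def r_merge(measurements_by_reflection):
    """Rmerge = sum(|Ii - <I>|) / sum(<I>) over all measurements.

    measurements_by_reflection maps a reflection index (h, k, l) to the
    list of individual intensity measurements Ii of that reflection.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in measurements_by_reflection.values():
        mean_intensity = fmean(intensities)  # <I> for this reflection
        numerator += sum(abs(i - mean_intensity) for i in intensities)
        denominator += mean_intensity * len(intensities)
    if denominator == 0:
        raise ValueError("sum of mean intensities is zero")
    return numerator / denominator


# Hypothetical repeated measurements, for illustration only.
example_data = {
    (1, 0, 0): [100.0, 110.0],
    (0, 1, 2): [50.0, 40.0, 60.0],
}
print(f"Rmerge = {r_merge(example_data):.3f}")
```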
RESULTS AND DISCUSSION
The refined structure
The high quality of the crystals and the use of synchrotron radiation enabled refinement of the model of the 2'-O-Me(CGCGCG)2 duplex structure with an exceptional accuracy. Anisotropic thermal vibrational parameters and direct estimates of standard deviations for atomic positions and the temperature factors have been obtained. A measure of reliability of refinement is gained from examining the data-to-parameter ratio (d/p). For the refined model of 2'-O-Me(CGCGCG)2 the final value of d/p was 2.9, with anisotropic temperature factors. When restraints are taken into account the d/p value becomes effectively higher. Thus the refinement process is well determined even with anisotropic temperature factors. The model of the 2'-O-methylated RNA duplex is complete. It also includes 44 solvent water molecules and two Mg2+ ions, one of which lies exactly on the 2-fold crystallographic axis. The second magnesium site is only partially occupied but is clearly recognisable by its octahedral coordination. No other ordered sites were found for Mg2+ or spermine, which were both present and necessary for crystallisation.
Crystal packing. The asymmetric unit contains one duplex. The crystal lattice consists of infinite columns of hexamer double helices stacked head-to-tail, perpendicular to the c-axis. The columns are arranged in layers, with a 60° rotation between layers. Two stacked hexamers form one full turn of a helix (Fig. 1). The requirement of crystallographic symmetry means that there are exactly 12 bp per turn of the column.
Accuracy of the coordinates. The overall estimated standard deviation (e.s.d.) for atomic positions is 0.090 Å. For the RNA atoms it is 0.077 Å (oxygen 0.063 Å, carbon 0.091 Å, nitrogen 0.062 Å, phosphorus 0.033 Å) and for the water molecules 0.146 Å.
Structural features of 2'-O-Me(CGCGCG)2
The overall structure. In the crystal, the self-complementary hexamer 2'-O-Me(CGCGCG) forms approximately one half-turn of an RNA helix (Fig. 2). The two strands are related by a non-crystallographic 2-fold symmetry axis, with an r.m.s.d. of 0.19 Å. The overall structure is an A-type RNA helix but with certain deviations from the canonical, fibre RNA structure (1). The deviation is much lower in the crystal structure (r.m.s.d. 1.30 Å) than in solution (r.m.s.d. 1.8 Å). The duplex is overwound and contains 10 bp per helical turn. The difference from the number of base pairs per turn in the crystal packing is due to dislocation in the intermolecular helix stacking (co-axial stacking). The NMR structures of r(CGCGCG)2 and 2'-O-Me(CGCGCG)2 described in the accompanying paper are closely similar. The r.m.s.d. between the 2'-O-Me(CGCGCG)2 X-ray structure and the solution structure is 1.7 Å. The possible causes are the influence of crystal packing and differences in refinement methods and restraints.
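The figure of about 10 bp per helical turn can be checked against the X-ray twist angles listed in the helical parameters table of this paper. The short Python sketch below does this arithmetic; the variable names are illustrative, and the simple unweighted mean over the five base steps is an assumption rather than the procedure used by CURVES.

```python
# X-ray twist angles (degrees) for the five base steps of the duplex,
# copied from the helical parameters table of this paper.
xray_twist_deg = [40, 34, 34, 38, 37]

# Unweighted mean twist per step and the base pairs per turn it implies.
mean_twist_deg = sum(xray_twist_deg) / len(xray_twist_deg)
bp_per_turn_duplex = 360.0 / mean_twist_deg

# In the crystal, two stacked hexamers (12 bp) complete one turn of the column,
# which corresponds to a lower average rotation per base pair.
bp_per_turn_column = 12
mean_rotation_column_deg = 360.0 / bp_per_turn_column

print(f"mean twist within the duplex: {mean_twist_deg:.1f} deg")
print(f"base pairs per turn within the duplex: {bp_per_turn_duplex:.1f}")
print(f"average rotation per base pair in the column: {mean_rotation_column_deg:.1f} deg")
```

The mean twist within the duplex is larger than the average rotation per base pair required by the 12 bp column repeat, which is consistent with the dislocation at the co-axial stacking junction described in the text.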
Magnesium binding
Only two different hydrated magnesium sites were located in the crystal lattice, at half occupancy each. Due to the high quality and resolution of the X-ray data it is unlikely that additional ordered magnesium sites exist unobserved. The remaining cations are most probably delocalised, forming a dispersed cation shield within the structure. One magnesium cation lies exactly on the crystallographic 2-fold axis and bridges two symmetry related molecules through two C5 phosphate oxygens (O2P) and four water molecules (Fig. 5). The 2'-hydroxyl function has been found in several crystal structures to play a crucial role in RNA-RNA water-mediated intermolecular contacts (10,16,24,25). However, in this structure 2'-OH groups are methylated and blocked as H-bond donors. Instead, the magnesium cation and the two 2'-oxygens play the pivotal role in bridging symmetry related molecules in the lateral, column-to-column direction. The other magnesium site is not fully occupied in the crystal lattice and was refined with an occupancy factor of 0.5. It is coordinated by one of the oxygens of the G2 phosphate and five water molecules; one of them is in close contact of 2.69 Å (H-bond?) to a symmetry related G4 O1P. Both magnesium cations are located near duplex ends and it is possible that they contribute to the 'edge effects' seen in the helical parameters, rise and roll, for the terminal base steps (Table 3).
Sugar, backbone and glycosidic torsion anglesa (°) for the 2'-O-Me(CGCGCG)2 structure from the X-ray study (left column). The results from the NMR refinementb (see accompanying paper) are quoted for comparison (right column)

Residue  Strand  α               β               γ               δ               ε               ζ               χ
                 X-ray   NMR     X-ray   NMR     X-ray   NMR     X-ray   NMR     X-ray   NMR     X-ray   NMR     X-ray   NMR
C1       1st     -       -       -       -       41      72(9)   93      86(4)   202     197(4)  288     285(4)  202     201(5)
C11      2nd     -               -               52              76              219             280             193
G2       1st     297     277(9)  176     180(4)  46      70(7)   82      87(3)   215     203(4)  291     291(5)  193     202(4)
G12      2nd     291             177             51              93              203             287             195
C3       1st     277     267(7)  172     177(5)  61      75(5)   68      88(4)   219     201(4)  290     288(5)  197     213(5)
C13      2nd     292             170             49              72              217             290             198
G4       1st     288     262(8)  177     179(5)  55      79(6)   75      93(3)   215     208(4)  285     294(4)  200     207(5)
G14      2nd     293             176             54              76              215             290             198
C5       1st     293     247(8)  173     177(5)  54      85(6)   72      82(3)   207     201(4)  296     288(4)  196     206(5)
C15      2nd     293             171             52              73              211             291             197
G6       1st     291     274(8)  182     173(4)  63      72(6)   75      82(4)   -       -       -       -       201     191(5)
G16      2nd     296             174             63              72              -               -               199
Mean             291     265     175     177     53      75      77      86      212     202     288     289     198     203
A-RNAc                           186             49              95              202             294             202

aP-α-O5'-β-C5'-γ-C4'-δ-C3'-ε-O3'-ζ-P.
bStandard deviations are given in parentheses.
cRefs 1,35,36.
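Because torsion angles are periodic, averaging them is safest with a circular mean. The Python sketch below applies one to the ten X-ray α values tabulated above (both strands); the function name `circular_mean_deg` is illustrative, and the use of a circular rather than arithmetic mean is an assumption, since the source does not state how the Mean row was computed. For tightly clustered values such as these the two kinds of mean nearly coincide.

```python
import math


def circular_mean_deg(angles_deg):
    """Circular mean of angles in degrees, returned in the range [0, 360)."""
    sin_sum = sum(math.sin(math.radians(a)) for a in angles_deg)
    cos_sum = sum(math.cos(math.radians(a)) for a in angles_deg)
    return math.degrees(math.atan2(sin_sum, cos_sum)) % 360.0


# X-ray alpha torsion angles (degrees): G2, C3, G4, C5, G6 of the first strand,
# then G12, C13, G14, C15, G16 of the second strand, copied from the table above.
alpha_xray_deg = [297, 277, 288, 293, 291, 291, 292, 293, 293, 296]

print(f"circular mean of alpha (X-ray): {circular_mean_deg(alpha_xray_deg):.1f} deg")
```

The result agrees, after rounding, with the value of 291° given in the Mean row for the X-ray α column.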
Helical parametersa for the 2'-O-Me(CGCGCG)2 structure as obtained from the X-ray (left column) and NMR (right column) studies

Base pairs  x-Displacement (Å)  Inclination (°)     Propeller twist (°)
            X-ray     NMR       X-ray     NMR       X-ray     NMR
C1-G12      -4.8      -3.4      19        23        -6        -5
G2-C11      -4.8      -3.2      19        26        -16       -17
C3-G10      -4.9      -3.3      21        28        -12       -26
G4-C9       -5.0      -3.2      22        30        -8        -38
C5-G8       -5.0      -3.3      21        24        -6        -25
G6-C7       -5.0      -3.1      22        20        -7        -23

Base steps  Rise (Å)            Twist (°)           Roll (°)
            X-ray     NMR       X-ray     NMR       X-ray     NMR
C1-G2       3.2       2.3       40        34        -4        0
G2-C3       2.2       2.4       34        37        6         -3
C3-G4       2.2       2.6       34        42        4         6
G4-C5       2.2       2.6       38        41        4         -6
C5-G6       2.9       2.7       37        34        -4        1

aParameters were calculated (33) with program CURVES 5.11.
Hydration pattern of 2'-O-methylated RNA duplex
The major groove of 2'-O-Me(CGCGCG)2 is narrow but deep and the minor groove is broad and shallow. The 2'-O-methyl groups point towards the minor groove (Fig. 2). The duplex grooves are hydrated in a very regular way, with the majority of ordered waters located in the major groove. Although no hydrogen atoms were observed directly for the water molecules, the position of the hydrogens could be deduced for many waters in the first hydration shell within ~3.5 Å from RNA sites (Fig. 6). This was possible because of the special regular arrangement of the waters and the chemical nature of the groups hydrated. The pattern of hydration in this structure can be compared to the related crystal structure of r(CCCCGGGG)2 at 1.46 Å resolution, the only relevant case for which the RNA duplex hydration scheme has been presented in detail (25).
ACKNOWLEDGEMENTS
This work was supported by grants from the State Committee for Scientific Research, Republic of Poland (2P30300704 and 8T11F010008p08). D.A.A. gratefully acknowledges a research fellowship from the Alexander von Humboldt Foundation. The generous support of the Foundation for Polish Science and the Poznan Supercomputing and Networking Centre is acknowledged. The authors thank Prof. Richard Lavery for kindly supplying the program CURVES 5.11 and Mieczyslawa Kluge for technical assistance.
REFERENCES
1 Arnott,S., Hukins,D.W.L., Dover,S.D., Fuller,W. and Hodgson,A.R. (1973) J. Mol. Biol., 81, 107-112.
2 Seeman,N.C., Rosenberg,J.M., Suddath,F.L., Kim,J.J.P. and Rich,A. (1976) J. Mol. Biol., 104, 109-144.
3 Rosenberg,J.M., Seeman,N.C., Day,R.O. and Rich,A. (1976) J. Mol. Biol., 104, 145-167.
4 Holbrook,S.R., Sussman,J.L., Warrant,R.W. and Kim,S.H. (1978) J. Mol. Biol., 123, 631-660.
5 Hingerty,B., Brown,S.S. and Jack,A. (1978) J. Mol. Biol., 124, 523-534.
6 Usman,N., Ogilvie,K.K., Jiang,M-J. and Cedergren,R.J. (1987) J. Am. Chem. Soc., 109, 7845-7854.
22 Krzyzaniak,A., Barciszewski,J., Furste,J.P., Bald,R., Erdmann,V.A., Salanski,P. and Jurczak,J. (1994) Int. J. Biol. Macromol., 16, 159-162.
23 Krzyzaniak,A., Salanski,P., Adamiak,R.W., Jurczak,J. and Barciszewski,J. (1996) In Hayashi,R. and Balny,C. (eds), High Pressure Bioscience and Biotechnology. Elsevier Science, Amsterdam, pp. 189-194.