Template directed incorporation of nucleotide mixtures using azole-nucleobase analogs
Geoffrey C. Hoops, Peiming Zhang1, W. Travis Johnson, Natasha Paul, Donald E. Bergstrom1 and V. Jo Davisson*
Department of Medicinal Chemistry and Molecular Pharmacology, 1333 Robert Heine Pharmacy Building, Purdue University, West Lafayette, IN 47907-1333, USA and 1Walther Cancer Institute, Indianapolis, IN 46208, USA
Received November 4, 1997; Accepted November 5, 1997
ABSTRACT
DNA that encodes elements for degenerate replication events by use of artificial nucleobases offers a versatile approach to manipulating sequences for applications in biotechnology. We have designed a family of artificial nucleobases that are capable of assuming multiple hydrogen bonding orientations through internal bond rotations to provide a means for degenerate molecular recognition. Incorporation of these analogs into a single position of a PCR primer allowed for analysis of their template effects on DNA amplification catalyzed by Thermus aquaticus (Taq) DNA polymerase. All of the nucleobase surrogates have similar shapes but differ by structural alterations that influence their electronic character. These subtle distinctions were able to influence the Taq DNA polymerase dependent incorporation of the four natural deoxyribonucleotides and thus, significantly expand the molecular design possibilities for biochemically functional nucleic acid analogs.
Modified nucleobases that can function by degenerate recognition of natural nucleic acids would be invaluable tools for nucleic acid manipulation and applications in protein engineering. Nucleobases that show loss of discrimination when participating in DNA replication are rare and include only a few purine or pyrimidine derivatives. 8-Hydroxyguanine (1), 2-hydroxyadenine (2), 6-O-methylguanine (3-5) and xanthine (4,6) direct the incorporation of (C and A), (T and A), (T and C) and (T and C), respectively, in ratios that are highly polymerase dependent. Pyrimidine analogs O2-ethylthymidine and O4-ethylthymidine have been shown to direct the incorporation of A and T or A and G using the Klenow DNA polymerase I (7). Rationally designed non-discriminate bases such as 2-amino-6-methoxyaminopurine (K base), 6H,8H-3,4-dihydropyrimido[4,5-c][1,2]oxazin-7-one (P base) (8), and N6-methoxyadenine (Z base) (9,10) have provided oligonucleotide sequences with relaxed base pairing specificity when assessed in melting studies. However, when these tautomeric bases are used in DNA templates for enzyme catalyzed replication, they preferentially direct the incorporation of just one natural purine or pyrimidine deoxyribonucleotide (11).
Our approach toward the development of degenerate nucleic acid bases is founded on the principles of steric and conformational freedom for critical molecular recognition elements. Illustrated in Figure 1 is the conceptual evolution of a series of 1,3-azole-carboxamide heterocycles from inosine, a nucleotide that can base pair with A and C. A family of potential nucleobase analogs is represented within the general substituted azole heterocycles; as shown in Figure 1, this subclass includes eight different structures. A range of hydrogen bonding donor and acceptor patterns mediating nucleobase pairing are possible for these azoles, as displayed in Figure 2. Internal bond rotations about the glycosidic bond and carboxamide side chain define alternative placement of ring nitrogens through which the azolecarboxamides can mimic the hydrogen bonding patterns of each of the natural nucleobases.
A full account of the syntheses of the nucleobase analog phosphoramidite precursors for the oligonucleotides in Table 1 will be the subject of forthcoming articles. The 2'-deoxyribonucleosides containing PrN3, PrA3, ImA4 and PzA3 were synthesized as previously described (15-17). Addition of the 5'-dimethoxytrityl protecting group followed by the 3'-phosphoramidite to the 2'-deoxyribonucleosides provided precursors for oligonucleotide synthesis. The A, C, G, T and I phosphoramidites were purchased from Biogenics. The abasic phosphoramidite was obtained from Glen Research. The nucleoside corresponding to PzA4 was prepared from the sodium salt of ethyl pyrazole-4-carboxylate, which was glycosylated with α-1-chloro-O3,O5-ditoluoyl-2'-deoxyribose. Treatment of the glycosylation product with methanolic ammonia at 120°C for 2 weeks afforded 1-(β-D-2'-deoxyribosyl)-pyrazole-4-carboxamide, PzA4. The anomeric configuration of PzA4 was assigned as β on the basis of the 1H-NMR signals for the anomeric and 2' protons of the corresponding 5'-dimethoxytrityl derivative (18,19) [1H-NMR (p.p.m. in DMSO-d6): 2.22-2.27 (1H, m, H-2''), 2.58-2.62 (1H, m, H-2'), 6.13 (1H, dd, J1',2' = 6.5 Hz, J1',2'' = 4.5 Hz, H-1')].
Table 1. Oligonucleotide sense strand primers for PCR of E. coli hisF
G, guanine; #, Azole 1-5, which include the deoxyribonucleotides of PrN3, PrA3, PzA4, ImA4 and PzA3, respectively (Fig. 1); I, hypoxanthine; M, a mixture of the four naturally occurring bases A, C, G and T; B, an abasic site.
Sense strand 40mer primers for PCR corresponding to positions -10 to 30 of the Escherichia coli hisF gene in the expression vector phisF-tac (Table 1) were prepared. The 30mer antisense primer was composed entirely of natural bases (Table 1). All oligonucleotide primers were prepared either (i) using an ABI 380B or 392 DNA synthesizer in the Laboratory for Macromolecular Structure at Purdue University, or (ii) by Midland Certified Reagent Company (Midland, TX) using the phosphoramidites described above. The PCR primer containing an equal distribution of the four natural bases at position 36 was prepared by mixing equal amounts of four primers. All primers were purified by preparative gel electrophoresis under denaturing conditions (20) (7 M urea, 50-55°C) on 20% (19% acrylamide, 1% N,N'-methylenebisacrylamide) polyacrylamide gels (1.4 mm × 30 cm × 40 cm) in TBE (90 mM Tris-borate, 2 mM EDTA) buffer on a Gibco BRL model S2 apparatus at 1000 V. Crude oligonucleotide (trityl off) from a 200 nmol scale synthesis was dissolved in 50 µl TE8 (10 mM Tris-HCl pH 8.0, 1 mM EDTA) and diluted 2-fold by addition of 50 µl 2× formamide load buffer (80% formamide, 10 mM EDTA, 0.05% bromophenol blue). Gels were preequilibrated at 1000 V for 1-2 h before loading sample. After sample loading, gels were run at constant voltage until the bromophenol blue tracking dye had moved through 50-75% of the gel length. The gel was removed from the glass sandwich and bands were visualized by shadowing with a handheld short-wavelength UV (ultraviolet) lamp. The target oligonucleotide band was excised with a sharp scalpel. The purified oligonucleotide was isolated from the polyacrylamide gel using the 'crush and soak' method (20,21), followed by phenol/chloroform extraction and ethanol precipitation. The purified primer was resuspended in TE8 (150 µl).
Primer concentrations were estimated from UV absorbance at 260 nm as 200-fold dilutions in TE8 on a Cary 3 UV-visible spectrophotometer (Varian).
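The concentration estimate described above follows from the Beer-Lambert law applied to the diluted sample and scaled back by the dilution factor. The sketch below illustrates that arithmetic only. The extinction coefficient, the 1 cm path length and the example absorbance are assumptions chosen for illustration; the text does not state the values that were actually used.

```python
def primer_concentration_uM(a260_diluted, dilution_factor=200.0,
                            epsilon_per_M_cm=400_000.0, path_cm=1.0):
    """Estimate the stock primer concentration (µM) from A260 of a diluted sample.

    epsilon_per_M_cm is a HYPOTHETICAL molar extinction coefficient for a
    40mer (roughly 10,000 M^-1 cm^-1 per nucleotide); it is not from the text.
    """
    if epsilon_per_M_cm <= 0 or path_cm <= 0:
        raise ValueError("extinction coefficient and path length must be positive")
    # Beer-Lambert: A = epsilon * c * l, so c = A / (epsilon * l) for the diluted sample.
    diluted_molar = a260_diluted / (epsilon_per_M_cm * path_cm)
    # Scale back to the undiluted stock and convert mol/l to µmol/l.
    return diluted_molar * dilution_factor * 1e6


if __name__ == "__main__":
    # Hypothetical reading of A260 = 0.2 on the 200-fold dilution.
    print(f"{primer_concentration_uM(0.2):.1f} µM")
```

In practice the extinction coefficient would be computed for each primer sequence, and the contribution of a single azole analog to the total absorbance would have to be estimated or neglected.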
The 40mer oligonucleotides containing modified bases (Table 1) served as sense strand primers for polymerase chain reactions (PCRs) catalyzed by the Taq DNA polymerase (Amplitaq from Perkin-Elmer). Template DNA for PCR was produced by PvuII digestion of the expression vector phisF-tac (22). The following conditions for PCR were found to consistently and accurately amplify the template DNA containing the hisF gene: 100 µl reactions containing 100 mM Tris (pH 7.8 at 25°C), 50 mM KCl, 10 µg/ml gelatin, 1.5 mM MgCl2, 100 pmol sense-strand primer, 100 pmol antisense-strand primer, fmol quantities of template DNA and 2.5 U Amplitaq™ in 500 µl microcentrifuge tubes were topped with 75 µl mineral oil and subjected to 31 thermal cycles of PCR (95°C, 1 min; 37°C, 1 min; 70°C, 5 min; repeat) followed by an additional 10 min extension at 70°C in a PTC-100 thermocycler (MJ Research). Prior to purification, an initial analysis for successful amplification of the template DNA was made by agarose gel electrophoresis (21) of a portion (5 µl) of the dsDNA product mixture against molecular weight standards (Sigma). The products were purified either (i) directly from the reaction mixtures using the Wizard™ PCR Preps kit (Promega) or (ii) via preparative agarose gel electrophoresis with isolation of the dsDNA product employing the GeneClean™ kit (Bio 101). The practical yields of these PCR products after 31 cycles with Taq DNA polymerase were similar regardless of the sense primer content.
These PCR products were sequenced via the dideoxy chain termination method in a cycle sequencing protocol employing sense strand primers (corresponding to positions -10 to 6 and -10 to 13 of the hisF gene) and the PCR-amplified antisense strand as a template. Direct dsDNA cycle sequencing of the antisense strand after PCR provided a determination of the percent incorporation of dAMP, dCMP, dGMP and dTMP opposite the universal base during the PCR reaction. The cycle sequencing was carried out with either (i) the fmol Sequencing System™ (Promega) using a 5'-33P-end-labeled primer and Wizard™-purified template, or (ii) the Thermosequenase™ kit (Amersham) using [α-33P]ddNTPs, unlabeled primer, and gel-purified template. The best signal/background ratios for this study were observed with the latter sequencing procedure. The manufacturers' protocols for the cycle sequencing reactions were employed with the exception of the annealing temperature (57°C) used in the thermal cycle.
Quenched cycle sequencing reaction products were analyzed by gel electrophoresis under denaturing conditions (7 M urea, 50-55°C) on 8% polyacrylamide gels (0.4 mm × 30 cm × 40 cm) in TTE buffer (90 mM Tris-taurine, 0.5 mM EDTA) on a Gibco BRL model S2 apparatus at constant power (85 W). Gels poured using a molded silicone gel casting clamp (Gibco BRL) tended to result in significantly less 'smiling' of the gel during electrophoresis, which facilitated the quantitative analysis of the imaged data. The gels were preequilibrated for 1-2 h prior to loading samples. For best results, the samples (2.5 µl) were loaded either (i) in every second lane using sharkstooth combs, or (ii) in wells using a well comb. Gels were run until the bromophenol blue tracking dye had moved through 75% of the gel length. The gels were transferred from the glass sandwich to blotting paper (Whatman), covered with Saran Wrap™, and dried in vacuo on a Bio-Rad model 583 gel dryer for 2 h at 80°C.
The distributions of bases incorporated opposite the candidate universal bases in the PCR primer were quantified by phosphorimaging. Detection was carried out by 2-3 day exposures of dried sequencing gels on BI imaging plates (Bio-Rad). The image data were imported with a GS363 plate scanner (Bio-Rad). Data work-up was performed on an Apple Macintosh 7500/100 using Molecular Analyst™ software from Bio-Rad. A one-dimensional graphic profile extending horizontally across the sequencing gel at the position of azole base incorporation was extracted. Several background profiles in the vicinity of the experimental data were also extracted, averaged and subtracted from the experimental profile. The background-subtracted experimental profile was smoothed, resulting in a trace with four peaks, corresponding to pixel density in the A, C, G and T lanes, which were subsequently integrated. The integration data were exported to Microsoft Excel for calculation of percent incorporation and error analysis. Note that the sense strand is being sequenced in this experiment, using antisense DNA as template. Pixel density in the A lane therefore corresponds to incorporation of dTMP opposite the template azole base by Taq DNA polymerase.
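The conversion from integrated lane intensities to percent incorporation can be sketched as follows. This is a minimal illustration under stated assumptions, not the Molecular Analyst or Excel procedure itself: the function name, the scalar per-lane background and the example peak areas are hypothetical, and the background step is simplified to subtracting a single area from each lane rather than subtracting an averaged profile before integration.

```python
# The sequenced strand is the sense strand, so a band in a given lane reports
# the complementary dNMP that Taq polymerase placed opposite the analog.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def percent_incorporation(lane_areas, background=0.0):
    """Map integrated peak areas per sequencing lane to percent dNMP incorporated.

    lane_areas: dict with keys "A", "C", "G", "T" (arbitrary intensity units).
    background: HYPOTHETICAL scalar area subtracted from every lane.
    Returns a dict keyed by the dNMP incorporated opposite the template analog.
    """
    # Subtract background and clip at zero so noise cannot give negative areas.
    corrected = {lane: max(area - background, 0.0)
                 for lane, area in lane_areas.items()}
    total = sum(corrected.values())
    if total <= 0:
        raise ValueError("no signal above background")
    # Normalize to percentages and relabel each lane by its complement.
    return {COMPLEMENT[lane]: 100.0 * area / total
            for lane, area in corrected.items()}


if __name__ == "__main__":
    # Hypothetical peak areas, not data from the study.
    areas = {"A": 55.0, "C": 15.0, "G": 35.0, "T": 15.0}
    for dnmp, pct in sorted(percent_incorporation(areas, background=5.0).items()):
        print(f"d{dnmp}MP: {pct:.1f}%")
```

Keeping the complement mapping in one explicit table makes the lane-to-nucleotide inversion easy to audit, since a mistake at this step would silently swap the A/T or C/G percentages.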
The distribution analysis of dNTP incorporation into replicating DNA strands is based upon DNA sequencing of primer-modified PCR products. Each azole base nucleoside was incorporated at position 36 in 40 base oligonucleotide PCR primers encoding positions -10 to 30 on the sense strand of the E. coli hisF gene in the phisF-tac vector (22) (Table 1). When used at low stringency annealing temperatures (37°C), all of the hisF PCR primers with modifications, including inosine, abasic or azole nucleobases, allowed for successful amplification of the template DNA by Taq DNA polymerase. Dideoxy cycle sequencing of the replicated antisense strand using a common sense strand primer allowed for quantitative assessment of the base composition via phosphorimaging of the resultant polyacrylamide gels. Figure 3 shows an example of the imaged DNA sequence data in which the direct comparison of a positive control (lane 1) can be made with those containing azole carboxamide nucleobases. The data in Table 2 represent averages from eight gel results from four independent sequencing reactions for each of nine distinct combinations of template and PCR primer. The validity of the analysis is supported by the even distribution of nucleotide incorporation observed when an equimolar mixture of four oligonucleotide primers, representing all four natural bases at position 36, is used as the sense strand primer. The mutagenic rate for base substitution errors by Taq DNA polymerase under the conditions of this assay is reported (23) to be in the range of 1/10^5, and thus approximately three orders of magnitude less than the error (up to 9%) observed for this experimental approach.
Figure 3. Analysis of the base distribution via Sanger dideoxy sequencing of the antisense DNA for PCR products containing (a) a mixture of all four natural bases, (b) PzA3 and (c) PzA4 in the sense strand. The arrows mark the horizontal cross section of the gel at the point where the mixture of nucleotides was incorporated into the antisense DNA opposite the azole base. Note that in Table 2 the reported percentages of dNMPs incorporated opposite the azole bases were calculated as the inverse of the gel results (i.e., a band in the G lane on the sequencing gel was assumed to arise from the incorporation of dCMP opposite the azole base).
Nucleotide incorporation directed by templates containing candidate
The percentage incorporation of each dNMP by the Taq DNA polymerase. These values are the average of eight experiments, whereby the PCR, the dideoxy cycle sequencing reactions and polyacrylamide gel electrophoresis were performed in duplicate for each base-analog-containing oligonucleotide. The standard error of these values is given in parentheses.
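The averaging described in this legend can be sketched as a mean and standard error over replicate measurements. The snippet assumes the standard error was computed as the sample standard deviation divided by the square root of the number of replicates, which the text does not state explicitly, and the eight replicate values are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for replicate percentages.

    Assumes SEM = sample standard deviation / sqrt(n); this definition is an
    assumption, not taken from the text.
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two replicates are required")
    return mean(values), stdev(values) / sqrt(n)


if __name__ == "__main__":
    # Hypothetical percent dAMP values from eight replicate gels.
    replicates = [60.0, 62.0, 58.0, 61.0, 59.0, 63.0, 57.0, 60.0]
    avg, sem = mean_and_standard_error(replicates)
    print(f"{avg:.1f} ({sem:.1f})")
```

The printed form mirrors the legend's convention of giving the standard error in parentheses after the mean.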
The Taq DNA polymerase consistently incorporated dAMP at the specified position to the near exclusion of any other nucleotide in templates containing an abasic site. Under the same experimental conditions, the Taq DNA polymerase incorporated both dCMP and dAMP opposite hypoxanthine in approximately a 5.5:1 ratio. These results differ slightly from those of Kamiya et al. (4 ), where only incorporation of dCMP opposite hypoxanthine by Taq DNA polymerase was observed. However, the results reported here are consistent with the observed relative stabilities of I-C and I-A base pairs from thermodynamic measurements (24 ,25 ). As a contrast to the azole carboxamides, the base analogue PrN3 was designed to maximize stacking interactions, and not participate in specific hydrogen-bonding interactions with the naturally occurring bases (15 ). A relaxed degree of specificity was observed in this case since both dAMP and dTMP (in a 3:1 ratio, respectively) were incorporated opposite the nitropyrrole base PrN3 by Taq DNA polymerase.
The carboxamide side chain was used to substitute the pyrrole nucleus to give PrA3 on the premise that it may adopt multiple hydrogen bond donor-acceptor patterns through internal bond rotations (16 ). This modification resulted in a shift of the specificity toward T incorporation relative to the nitro substituted pyrrole PrN3. Addition of a second ring nitrogen into position 2 was not expected to present any additional hydrogen bond acceptor for base pairing but would affect the electronic properties relative to the azole base PrA3. This alteration in the heterocycle can be detected in the A and T ratio of incorporation by Taq DNA polymerase; template containing PzA4 favored A over T in contrast to PrA3. A simple position change for an endocyclic nitrogen alters the known electronegativity and dipole moment of the azole. This feature in ImA4 was predicted to impart at least two base pairing conformations that differ through intramolecular hydrogen bonds (16 ). However, the observed A to T incorporation ratios for the imidazole base ImA4 templates indicated that the directing properties of this base mirror those values observed for the pyrrole nucleus PrA3.
The results with pyrazole-3-carboxamide (PzA3) are a contrast to those discussed above. In this study, PzA3 behaved like a universal purine analogue, directing approximately the equivalent incorporation of dCMP and dTMP by Taq DNA polymerase (Table 2 ). In comparison to other purine analogs, dTMP was qualitatively observed (3 ,4 ) to be incorporated by the Taq DNA polymerase opposite 6-methoxyguanine with greater frequency than dCMP. The Taq DNA polymerase also incorporated dTMP with much greater frequency than dCMP opposite 2-amino-6-methoxyaminopurine (11 ). As a template for in vitro DNA replication catalyzed by thermostable DNA polymerases, PzA3 appears to be the most promising candidate as a universal purine analogue examined to date.
The thermal stability of oligonucleotide sequences containing a single substitution of inosine has defined the role for this nucleoside as a probe for degenerate DNA hybridization (26,27). An expansion of this universal nucleoside recognition concept is represented by the use of PCR primers containing inosine and the various azole base analogs. The results using this assay system are interpreted for the application of the azole bases as degenerate templates in PCR. Hypoxanthine base appears to direct the incorporation of nucleotides in a manner consistent with its hydrogen bonding base-pairing attributes. However, the studies presented here indicate that the biochemical behavior of the azole nucleobase analogs in DNA templates is in contrast to that of inosine, since it cannot be predicted solely on the basis of their ability to stabilize duplex DNA. All of the azole nucleobases bear a similar overall shape and size, which allows for a comparison of the template effects within a single DNA sequence context. There are surprisingly dramatic effects on the template specificity revealed upon alteration of the heterocycle electronic properties. These simple nitrogen substitutions indicate a role for the electronic character of the azole nucleobase in dictating the selective incorporation of nucleotide triphosphates on the leading strand.
As a design approach, the azole nucleobases display important advantages for degenerate recognition in DNA replication. These analogs are fundamentally distinct from purine/pyrimidines whose base pairing is dictated by a precise balance of conformational and tautomeric equilibria. Subtle structural changes in the azole nucleobases result in large changes of the template recognition by Taq DNA polymerase, consistent with a molecular event that is chemically tunable. The combined effects of ring electronics and a conformationally dynamic functional group expand the potential for nucleobase analog design. While none of the nucleobase analogs discussed here were fully degenerate, a prediction can be made on the basis of this study that useful incorporation of nucleotide mixtures at a single position by Taq DNA polymerase could be achieved and applied to oligonucleotide mediated site directed mutagenesis. We anticipate that differences among DNA polymerases could be revealed by use of these nucleobase analogs, as recently displayed for alternate hydrogen bonded base pairing patterns in another unique series (29). A full sequence context study is beyond the scope of this initial study, and there exists the possibility that the absolute base specificity could also be influenced by position. Regardless, the dependence of the azole base electronic character within a sequence context is significant impetus for further investigations of the physical, chemical and biological properties of DNA templates containing these nucleobase analogs.
One of the striking features of these results is the strong bias that Taq polymerase shows for inserting A opposite modified nucleobases or an abasic site. The tendency for A insertion opposite an abasic site or nucleobase analog by DNA polymerases has been observed previously, but the molecular basis for this effect remains poorly resolved (29-31). A theoretical hydrogen bonding scheme predicts that none of the modified bases included in this initial study, with the exception of inosine, should base pair selectively with A. Among the azole nucleobases, the simple nitropyrrole PrN3 and three of the carboxamide substituted derivatives, PrA3, PzA4 and ImA4, direct the incorporation of A at substantial levels. More important to the application of nucleobase analogs is the property displayed by PzA3 to overcome the strong preference for a DNA polymerase to insert A. These results provide a striking confirmation of our hypothesis that substituted carboxamide azoles are a unique class of compounds which show distinct, multiple biological recognition features. The mechanisms by which these nucleobase analogs operate are not clearly established at this time. However, it is apparent that because the azole nucleobases all have similar shapes, the potential exists for establishing molecular parameters for DNA polymerase-template recognition through analysis of their structure-activity relationships.
Recently, the 2'-deoxyribonucleotide triphosphate derivative of ImA4 was investigated as a potential substrate for Taq DNA polymerase (32). Misincorporation frequencies were reported that are comparable to those of other hypermutagenic methods. The occurrence of transversions as well as transitions suggests that alternate conformations of the imidazole nucleobase are important in enzyme recognition. In combination with the studies shown herein, these results establish a basis for the future development of azole-nucleobase analogs displaying useful molecular recognition properties in enzymatic methods for DNA synthesis.
We thank Thomas Klem for initial studies of the template properties of the base analogs as well as Doug Klewer for his insights during the development of the work presented herein. We also thank the NIH (GM53155) for financial support.
13 Lawyer,F.C., Stoffel,S., Saiki,R.K., Myambo,K., Drummond,R. and Gelfand,D.H. (1989) J. Biol. Chem., 264, 6427-6437.
14 Tindall,K.R. and Kunkel,T.A. (1988) Biochemistry, 27, 6008-6013.
15 Bergstrom,D.E., Zhang,P., Toma,P.H., Andrews,P.C. and Nichols,R. (1995) J. Am. Chem. Soc., 117, 1201-1209.
16 Bergstrom,D.E., Zhang,P. and Johnson,W.T. (1996) Nucleosides, 15, 59-68.
17 Ramasamy,K., Robins,R.K. and Revankar,G.R. (1986) Tetrahedron, 42, 5869-5878.
18 Srivastava,P.C., Robins,R.K., Takusagawa,F. and Berman,H.M. (1981) J. Heterocycl. Chem., 18, 1659-1662.
19 Rousseau,R.J., Robins,R.K. and Townsend,L.B. (1970) J. Heterocycl. Chem., 7, 367-372.
20 Chen,Z. and Ruffner,D.E. (1996) BioTechniques, 21, 820-822.
21 Sambrook,J., Fritsch,E.F. and Maniatis,T. (1989) Molecular Cloning: A Laboratory Manual. Cold Spring Harbor Laboratory Press, Cold Spring Harbor, NY.
22 Klem,T.J. and Davisson,V.J. (1993) Biochemistry, 32, 5177-5186.