ABSTRACT
Here we describe template-directed enzymatic synthesis of unique primers, which avoids the chemical synthesis step in primer walking. We have termed this conceptually new technique DENS (differential extension with nucleotide subsets). DENS works by selectively extending a short primer, making it a long one at the intended site only. The procedure starts with a limited initial extension of the primer (at 20-30°C) in the presence of only two of the four possible dNTPs. The primer is extended by 6-9 bases or longer at the intended priming site, which is deliberately selected (as is the two-dNTP set) to maximize the extension length. The subsequent termination reaction at 60-65°C then accepts the extended primer at the intended site, but not at alternative sites, where the initial extension (if any) is generally much shorter. DENS allows the use of primers as long as 8mers (degenerate in two positions), which prime much more strongly than modular primers involving 5-7mers and which (unlike the latter) can be used with thermostable polymerases, thus allowing cycle-sequencing with dye-terminators compatible with Taq DNA polymerase, as well as making double-stranded DNA sequencing more robust.
The success of the Human Genome Project depends on the development of rapid and
inexpensive technology for DNA sequencing, which will also benefit biomedical
research in general. The currently favored shotgun strategy for DNA sequencing
has two main bottlenecks: template preparation, and assembly of the sequence
contigs. Primer walking minimizes both of these problems and also reduces the
redundancy of sequencing severalfold. However, the walking strategy has its
own bottleneck in primer synthesis, which is expensive, slow and, most critically, complicates full automation. It was originally proposed to
eliminate the primer synthesis step by using presynthesized libraries of
primers of different sequences (1). A common rationale behind most library-based priming techniques is that since the scale of primer synthesis
exceeds the amount used in a conventional sequencing reaction by a factor of
a million or so, thousands of usable copies of the library can be aliquoted from
a single synthesis. More importantly, the instant availability of primers makes
possible complete automation of the closed cycle of primer walking. Closed-end automation would speed up primer walking by a factor of 30-50 and decrease the cost of DNA sequencing by about one order of
magnitude. The problem, however, is that even the shortest primer expected to be
unique in a plasmid-sized template (a nonamer) has a library of unmanageable size (262,144 possible sequences). To reduce the library to a manageable
size (e.g. 4096 possible hexamers), individual short oligonucleotides (each too
short to prime uniquely when alone) were either ligated (2-6) or assembled without ligation (7-14) to give unique (long) primers.
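The library sizes quoted above follow from counting every sequence of a given length over four bases. A minimal Python sketch of that count is given here; the function name `library_size` and its parameters are ours, introduced only for illustration, and each degenerate position is assumed to be a full four-base mixture (N).

```python
def library_size(length: int, degenerate_positions: int = 0) -> int:
    """Number of distinct primers needed to cover every sequence of this length.

    Degenerate (N) positions do not multiply the library, so only the
    remaining specific positions are counted.
    """
    if not 0 <= degenerate_positions <= length:
        raise ValueError("degenerate_positions must lie between 0 and length")
    return 4 ** (length - degenerate_positions)


print(library_size(9))     # 262144 nonamers
print(library_size(6))     # 4096 hexamers
print(library_size(8, 2))  # 4096 octamers with two degenerate positions
```

On this count, an octamer with two degenerate positions needs a library no larger than that of plain hexamers.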
In this paper, we present a more powerful technique that takes a conceptually different approach to making an inherently non-unique primer act as a unique one. We have called this new method DENS, for 'differential extension with nucleotide subsets'. The DENS method is based on two key elements (Fig. 1) as follows.
Unmodified oligonucleotides were supplied by DNAgency (Malvern, PA, USA) and by
the synthesis service of the Weizmann Institute of Science. The 3'-end-protected heptamers, degenerate in two positions, were
synthesized by Biotechnology General (Ness-Ziona, Israel). The 3'-end protection group, 3'-phosphate propyl ester (Glen Research,
Sterling, VA, cat. no. 20-2913-10), was linked to the heptamers during synthesis. SequiTherm, ThermoSequenase, AmpliTaq FS and related
reagents were from the respective sequencing kits (Epicentre Technologies cat.
no. S20100, Amersham cat. no. US78500 and Perkin-Elmer cat. no. 402118). Deoxyribonucleotide triphosphates were from
Pharmacia LKB, Sweden. Six different two-dNTP mixes (AC, AG, AT, CG, CT and GT), each containing 40 pmol/µl of each dNTP, were made and stored at -20°C.
The DENS sequencing reactions were performed in two steps: (i) differential extension with either SequiTherm or ThermoSequenase at 20°C; and then (ii) termination at 60-65°C with ThermoSequenase or AmpliTaq FS.
Fluorescent sequencing reactions were performed as follows.
(i) Differential extension step. Each 12 µl reaction contained: 0.5 pmol of single-stranded (ss) M13mp18 template (Amersham, cat. no. US 70704); 300-400 pmol of the degenerate octamer; 1.0 µl of the two-dNTP mix (40 pmol of each dNTP) selected for each particular site; 1.0 µl of 10× SequiTherm Sequencing Buffer (0.5 M Tris-HCl, pH 9.3 and 25 mM MgCl2) and 2.5 U SequiTherm. The differential extension was performed using 20 cycles of: denaturation at 90°C for 30 s, fast cooling to 20°C, and extension at 20°C for 2 min. After the 20 cycles, SequiTherm was inactivated at 100°C for 10 min.
(ii) Fluorescent dideoxy termination step. AmpliTaq FS fluorescent termination mix was made according to the AmpliTaq FS kit manual. ThermoSequenase fluorescent termination mix was made according to Amersham's ThermoSequenase Dye Terminator Cycle Sequencing Protocol. The fluorescent termination mix was heated to 70°C before 8.5 µl of it was added to the differential extension products (also at 70°C), and the reaction was thermocycled as follows: 60 or 65°C (see text) for 4 min, fast heating to 95°C, incubation at 95°C for 30 s, fast cooling back to 60 or 65°C; 20 cycles overall. (The temperature should not drop below 60°C after the addition of the termination mix until the reaction is stopped.) The polymerase was then inactivated at 100°C for 10 min. The sequencing products were purified as in (9) and analyzed on either a model ABI 373 or ABI PRISM 377 automatic sequencer.
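For readers who program a thermocycler from a script, the two fluorescent steps above can be written down as plain data. The sketch below is only an illustration: the `Step` class, the dictionary layout and the helper names are ours, ramp times are ignored, and the termination step is entered at 60°C (the text also allows 65°C).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: float   # hold temperature in degrees Celsius
    seconds: int    # hold time


# Values transcribed from the fluorescent protocol; ramping is not modelled.
DIFFERENTIAL_EXTENSION = {
    "cycles": 20,
    "steps": (Step(90, 30), Step(20, 120)),
    "inactivation": Step(100, 600),
}
TERMINATION = {
    "cycles": 20,
    "steps": (Step(60, 240), Step(95, 30)),  # 60 C assumed; 65 C is the alternative
    "inactivation": Step(100, 600),
}


def total_minutes(stage: dict) -> float:
    """Total hold time of a stage, cycling plus enzyme inactivation."""
    cycling = stage["cycles"] * sum(step.seconds for step in stage["steps"])
    return (cycling + stage["inactivation"].seconds) / 60


def lowest_temp_c(stage: dict) -> float:
    """Lowest programmed hold temperature during cycling."""
    return min(step.temp_c for step in stage["steps"])


print(total_minutes(DIFFERENTIAL_EXTENSION))  # 60.0
print(total_minutes(TERMINATION))             # 100.0
print(lowest_temp_c(TERMINATION) >= 60)       # True
```

The last check mirrors the requirement that the temperature stay at or above 60°C once the termination mix has been added.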
Radioactive sequencing reactions were performed as follows.
(i) Limited extension step. The reaction volume was 12 µl, containing 0.25 pmol ss M13mp18 template; 150-200 pmol of the degenerate octamer or 10 pmol of a control 15mer primer; 5 pmol of each of the two dNTPs; and 1.5 µl of the reaction buffer concentrate from the kit corresponding to the polymerase to be used. The reaction mixture was incubated at 90°C for 3 min and placed immediately in a 20°C water bath. SequiTherm (5 U) or ThermoSequenase (4 U) was then added, and the reaction was allowed to proceed at 20°C for 10 min. One of the two dNTPs was radio-labeled ([α-32P]dATP, Amersham, 3000 Ci/mmol, unless specified otherwise). Alternatively, the radio-label can be incorporated at the termination stage using any available radio-labeled dNTP, regardless of the differential extension dNTP subset.
(ii) Termination step. The differential extension products, pre-warmed at 60°C, were aliquoted (2 µl) into each of the pre-warmed termination mixes (2 µl at 60°C), and another 2 µl into 7 µl of formamide Stop Solution (for analysis of the differential extension products on a gel). The termination mixes were taken from the kits corresponding to the enzymes used. The termination incubation was at 60°C for 15 min, after which 5 µl of formamide Stop Solution was added. The differential extension products were electrophoresed on a denaturing 12% polyacrylamide gel, and the termination reaction products on a 6% gel. In other radiolabeled sequencing experiments not described in this paper, thermocycling was used at both steps, as described above for fluorescent sequencing.
Figures 1, 2 and 3 illustrate the mechanism of DENS using an example of one of many possible primer sequences. The sequencing reactions of Figure 2A, B and C were primed on ss M13mp18 template by the octamer 5'-NNGGAAGG-3', which has two degenerate positions (N = A+C+G+T). Sequencing ladders A and B of Figure 2 were produced by the same octamer but are clearly different, priming uniquely at two different sites (at positions 2668 and 5592, respectively) by virtue of extensions with different two-dNTP subsets. The differential extensions were performed at 20°C, a temperature at which octamers can anneal and prime. In both A and B, the primer was extended by eight bases at the intended site, in each case with a different two-dNTP subset. In contrast, at the alternative sites the octamer was extended by no more than four bases with the same dNTP subsets, and therefore did not prime there at the termination stage performed at the higher temperature, 60°C (see flow chart in Fig. 1). As one would expect, the same octamer used in a conventional sequencing reaction (i.e. where the initial extension step contained all four dNTPs) produced an unreadable band pattern, the result of multiple priming (Fig. 2C). Under the same conditions, a control 15mer primer produced clear, readable sequence ladders whether two or four dNTPs were used at the initial extension step (Fig. 2D and E).
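The differential extension logic described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the software used in this work: `TOY_STRAND` is an invented sequence (it is not M13mp18), it is written in the sense of the newly synthesized strand so that the primer and its extension read directly 5' to 3', and all function names are ours. The cut-off of five bases is the threshold we observe for the 60°C termination step.

```python
from itertools import combinations

# The six two-dNTP subsets: AC, AG, AT, CG, CT, GT.
SUBSETS = ["".join(pair) for pair in combinations("ACGT", 2)]
THRESHOLD = 5  # shortest extension (bases) assumed to still prime at 60 C


def extension_length(downstream: str, subset: str) -> int:
    """Count the bases added before a dNTP outside the subset is required."""
    count = 0
    for base in downstream:
        if base not in subset:
            break
        count += 1
    return count


def downstream_starts(strand: str, core: str, n_degenerate: int = 2) -> list:
    """Positions just 3' of every match of the non-degenerate primer core."""
    starts = []
    hit = strand.find(core, n_degenerate)  # leave room for the 5' N positions
    while hit != -1:
        starts.append(hit + len(core))
        hit = strand.find(core, hit + 1)
    return starts


# Invented example with two sites for the core of 5'-NNGGAAGG-3'.
TOY_STRAND = "CAGGAAGGAGGAGAAGCTTTTCGGAAGGAGTCATGC"
sites = downstream_starts(TOY_STRAND, "GGAAGG")

for subset in SUBSETS:
    lengths = {s: extension_length(TOY_STRAND[s:], subset) for s in sites}
    priming = [s for s, n in lengths.items() if n >= THRESHOLD]
    print(subset, lengths, "primes at:", priming)
```

With this toy sequence only the AG subset carries one site past the threshold (an eight-base extension), while the second site gains two bases and is left behind at the termination temperature.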
Figure 4 shows how DENS can be combined with the modular primer technique (7-14) to improve the specificity and strength of priming. The two panels of Figure 4A show sequencing reactions with DENS primed by a heptamer, 5'-NCCGATT-3', alone and (as a 'front' module) in combination with two 'back' modules (also heptamers), which together form a 7+7+7 modular primer. The front heptamer's differential extension products happen to be long enough at two sites (at the intended position 2681 by 13 bases and at an alternative position 5410 by seven bases, see Fig. 4B), both actively priming at the 60°C termination stage. Hence, the front module alone produces a superimposition of the two sequences, shown on the panel marked 'alone'. The adjacent panel, '+7+7', shows that the addition of two heptamers contiguous to the intended site made the band pattern unique and much stronger. Figure 4B shows the products of the differential extension (with two dNTPs) of this heptamer used with and without the back modules (each aliquoted both before and after the termination reaction). The 20mer product from the intended site becomes dramatically stronger in the presence of the additional 7mers contiguous to that site, whereas the 14mer product at the alternative site almost disappears. This phenomenon, termed the 'modular primer effect' (14), is believed to be caused by preferential engagement of the polymerase by longer primers (whether modular or not, e.g. 7+7+7 = 21 bases here) at the expense of shorter ones (e.g. a 7mer alone). We found ThermoSequenase to exhibit a much stronger modular primer effect than SequiTherm. Note that in Figure 4B the short products are not utilized at the termination stage, just as in Figure 3, again demonstrating the DENS mechanism. However, DENS sequencing with an 8mer alone (unlike with a 5-, 6- or 7mer alone), without the back modules, seems almost as successful as it is with them (8+7+7).
We have tried 67 fluorescent sequencing reactions with differential extensions by six bases or longer, using 8mers priming throughout the ss M13mp18 template with dye-terminators in automated cycle-sequencing (with SequiTherm at the differential extension stage and either ThermoSequenase or AmpliTaq FS at the termination stage). The priming sites were chosen without any knowledge of their performance in radioactive sequencing. Of these, 46 (69%) gave high quality sequence, with the first base-calling error normally occurring after base 500. Of the 21 that failed, 17 were blank or too weak and four were unreadable (dirty). Interestingly, we have hardly seen a result that was intermediate between high quality and failure. It has been shown that modular primer failures can be caused by unfavorable local secondary structure in the template (8), and we are currently working on a computer program which can identify the sites to be avoided. Figure 6 shows the output of a Model 373 automated sequencer for one of the primers. Here, the use of DENS involving a differential extension with the A+G subset of dNTPs made the degenerate octamer primer yield the sequence primed at position 2679 only, even though the octamer has two more complementary sites on ss M13mp18 and no back modules were used.
We also made an initial attempt to test double-stranded DNA as a template. We used DENS with single octamer primers (no
contiguous modules) and fluorescent dye-terminators to partially sequence a 2.9 kb insert of bacterial DNA cloned
into pUC18. Out of 13 reactions performed with SequiTherm at the differential
extension stage and AmpliTaq FS at the termination stage, six gave high quality
sequence, two were readable but dirty, and five were either blank or too weak.
The actual success rate of DENS is even higher than it seems. Failures due to the signal being weak or undetectable are not the fault of DENS per se, and can be remedied by more sensitive detectors. The ABI sequencers we used register any signal below a certain threshold as 'blank'. Radio-labeled sequencing shows most such apparent failures to be in fact successes. Most conditions and procedures in the DENS technique are yet to be optimized, which is expected to further improve the success rate. Other possible future improvements include duplex-stabilizing base modifications, which have been found to enhance modular primer performance (15). The addition of an inosine at the 5'-end of a single heptamer primer was also found to have a signal-enhancing effect (not shown). This may result from increasing either the primer length (and thus the acceptance by the polymerase) or the annealing stability through an extra base stacked on the 5'-end (not unlike ref. 16), or both.
We can estimate the expected failure rate of DENS (due to the superimposition of
the priming signals from the intended and alternative priming sites). With a
two-dNTP subset, each extra base in the extension length reduces the
probability of the occurrence of an extension of such a length by half. We have
found that the threshold imposed by the termination reaction temperature of 60°C is a 5-base extension. The probability of extending a primer at a given site
by 5 bases or longer with a given subset of two dNTPs is 2^(-5) = 1/32. This is the proportion of the alternative sites that are expected to
interfere with the sequencing signal from the intended site and thus give rise
to failures due to unreadable (superimposed) sequences. For a typical double-degenerate 8mer, such as those used here (with the specificity of a
hexamer, i.e. the 6 non-degenerate bases of the octamer), the average number of alternative sites
in a 7-10 kb long plasmid is less than three, in which case the expected failure
rate of DENS is 3/32, i.e. <10%. This theoretical expectation is close to the proportion of failures that
we indeed find due to dirty signal (as opposed to failures due to undetectable
or too weak fluorescent signal). In practice, the vector sequence is known, and
about half or more of DENS failures can be avoided by not using primers which
can be extended on the vector (as well as on the known part of the insert
sequence) beyond the 5-base threshold at the differential extension step.
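The arithmetic behind this estimate can be reproduced with a short Python sketch. It assumes a random template with equal base frequencies and counts matches on one strand only; the function names are ours and are introduced purely for illustration.

```python
def p_extension_at_least(n_bases: int, subset_size: int = 2) -> float:
    """Chance that the next n_bases all fall within the dNTP subset."""
    return (subset_size / 4) ** n_bases


def expected_sites(template_len: int, specific_bases: int = 6) -> float:
    """Expected matches of the non-degenerate core on one strand."""
    return template_len / 4 ** specific_bases


p = p_extension_at_least(5)
print(p)        # 0.03125, i.e. 1/32
print(3 * p)    # 0.09375, the 3/32 figure for three alternative sites

# Chance that at least one of three independent sites crosses the threshold.
print(1 - (1 - p) ** 3)

# Expected number of hexamer-core matches in a 7 kb and a 10 kb template.
for length in (7000, 10000):
    print(length, round(expected_sites(length), 2))
```

Under these assumptions both template lengths give fewer than three expected core matches, and treating the three sites as independent events gives a value only slightly below 3/32.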
In DENS, subsets of two dNTPs are preferable to those of three dNTPs. With the
former, most alternative sites are not extended to the 5-base threshold required for priming at the high temperature of the termination stage.
With the latter, too many are, giving rise to an unacceptably high failure
rate.
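The difference can be made explicit with the same reasoning as for the 1/32 estimate, assuming a random template with equal base frequencies. Writing P_k for the probability that an alternative site is extended by at least five bases with a subset of k dNTPs (the symbol is ours):

```latex
\begin{align*}
P_2 &= \left(\tfrac{2}{4}\right)^{5} = \tfrac{1}{32} \approx 0.031, \\
P_3 &= \left(\tfrac{3}{4}\right)^{5} = \tfrac{243}{1024} \approx 0.237, \\
\frac{P_3}{P_2} &= \tfrac{243}{32} \approx 7.6 .
\end{align*}
```

With about three alternative sites per template, the expected number crossing the threshold would thus rise from roughly 0.09 with two dNTPs to roughly 0.7 with three.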
In order for a library to be of reasonable size, the non-degenerate part of the primer should not exceed six bases (8). As primers for DENS, we currently prefer octamers with two degenerate positions each. Heptamers seem to prime more weakly than octamers: even a 7+7+7 modular primer generally primes more weakly than an octamer alone. On the other hand, nonamers with two degeneracies would necessitate too big a library, whereas three degeneracies reduce the effective concentration of the matched primer 64-fold, which may be too much. Apart from that, as compared to an 8mer, a 9mer has twice the proportion of alternative sites crossing the threshold of the differential extension length.
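The trade-off between library size and dilution of the matched primer can be tabulated with a small Python sketch. It assumes that every degenerate position is an equimolar mixture of the four bases; the function name `primer_design` and the dictionary keys are ours.

```python
from fractions import Fraction


def primer_design(length: int, degenerate: int) -> dict:
    """Library size and matched-primer fraction for one primer format."""
    specific = length - degenerate
    return {
        "library_size": 4 ** specific,                     # distinct primers to stock
        "matched_fraction": Fraction(1, 4 ** degenerate),  # share that fits a given site
    }


for length, degenerate in [(8, 2), (9, 2), (9, 3)]:
    design = primer_design(length, degenerate)
    print(length, degenerate, design["library_size"], design["matched_fraction"])
```

On these assumptions a nonamer with two degeneracies needs 16,384 primers, four times the 4096 of the double-degenerate octamer, while a nonamer with three degeneracies keeps the library at 4096 but leaves only 1/64 of the primer molecules matched to a given site.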
The high-temperature thermocycling termination step was found to work best with
either ThermoSequenase or AmpliTaq FS. On the other hand, at the differential extension stage, SequiTherm
seems to produce a stronger, though occasionally dirtier, signal as compared to
ThermoSequenase. (A series of experiments with modular primers showed that
ThermoSequenase has a much more pronounced modular primer effect than
SequiTherm, data not shown.) Because of the stronger signal, we prefer SequiTherm at the differential extension stage of fluorescent sequencing.
The DENS technique depends on the availability within a given stretch of
template of a priming site which will give a sufficiently long extension with
two dNTPs. We can estimate the frequency of such sites in a random sequence as
follows: since there are six possible subsets of two dNTPs, the probability of
a primer being extended at a given site by more than four bases with any one of
the six possible subsets of two dNTPs is 6/32 or about 1/5. Therefore, about
every 5 bases on average, a site is found in the template where the primer can
be sufficiently extended with one of the six possible subsets. Even if only a
quarter of the complete library of primers is used, one suitable site per 20
bases of a random template should be found, a sufficient frequency for primer
walking sequencing with DENS. The less random the template sequence is (e.g.
G/C or A/T rich), the higher the occurrence of suitable sites. Software that
searches for suitable sites is available upon request, as is information on the
availability of a double-degenerate octamer primer library. A distinction should be made between
DENS and the primer walking methods described earlier with partial libraries of
8mers or 9mers (17-19). The probability of occurrence of an appropriate priming site is higher using
DENS with double-degenerate octamers than using the same number of non-degenerate octamers or nonamers.
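The 6/32 estimate for the frequency of suitable sites can be checked by brute-force enumeration. The sketch below assumes a random template with equal base frequencies and asks what fraction of all five-base stretches can be copied with some pair of dNTPs; the function name is ours. Because a stretch made of a single repeated base fits several subsets at once, the exact value comes out slightly below 6/32.

```python
from fractions import Fraction
from itertools import product


def fraction_extendable(n_bases: int = 5) -> Fraction:
    """Fraction of n-base stretches that use at most two distinct bases."""
    hits = sum(
        1
        for stretch in product("ACGT", repeat=n_bases)
        if len(set(stretch)) <= 2
    )
    return Fraction(hits, 4 ** n_bases)


exact = fraction_extendable(5)
print(exact, float(exact))   # 23/128 0.1796875
print(Fraction(6, 32))       # 3/16, the estimate used in the text
print(float(1 / exact))      # average spacing between suitable sites, in bases
```

The exact figure corresponds to one suitable site roughly every five to six bases, so the conclusion drawn in the text is unchanged.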
Originally, with short front modules such as 5mers and 6mers, we found that the
combination of modular primers with DENS provided a remarkable increase in the
site-specificity of priming, as compared to using modular primers alone. The
increased specificity made it possible to use longer front modules (e.g. 8mers degenerate in two positions) than were typically used in modular primers
without DENS (7-13), where unique priming could only be achieved with shorter front modules (5- or 6mers). The longer front modules make the priming much stronger and,
in contrast to the shorter ones, can work with thermostable polymerases. Thus
8mers, unlike 5mers or 6mers, made cycle sequencing possible, thereby further
increasing the priming signal intensity many-fold. No less important is the more robust and reliable performance of dye-terminators for Taq DNA polymerase as compared to those for Sequenase™
(submitted for publication). However, for longer front modules such as in this
paper (e.g. octamers with two degenerate positions each), DENS is essential
because for the most part they fail to achieve unique priming by means of the
modular primer effect only, i.e. without DENS. Furthermore, DENS essentially
obviates the need for additional modules in the case of octamers, whose priming
strength or specificity is rarely improved by the modular primer effect.
We appreciate DOE grant No. DE-FG02-94ER61831, contract No. 960892402 between Argonne National
Laboratory and the Weizmann Institute of Science, and the help of Dr Maura
Devine in editing the manuscript.
*To whom correspondence should be addressed at: Center for Mechanistic Biology
and Biotechnology, Argonne National Laboratory, 9700 South Cass Avenue,
Argonne, IL 60439-4833, USA. Tel: +1 630 252 3940; Fax: +1 630 252 3387; Email: levy@anl.gov


REFERENCES

