A simulation of subtractive hybridization
Tae-Ju Cho* and Sang-Soo Park
Department of Biochemistry, College of Natural Sciences, Chungbuk National University, Cheongju 361-763, Korea
Received November 21, 1997; Revised and Accepted January 6, 1998
ABSTRACT
Various strategies employed in genomic DNA cloning by subtractive hybridization have been examined by computer simulation, and the predictions have been compared with published results. The results show that the efficiency of target sequence enrichment and the sensitivity to experimental conditions depend strongly on the enrichment strategy employed. The strategy selecting only tester/tester after hybridization can enrich targets very efficiently. For successful target enrichment, however, the strategy requires a highly efficient subtraction method and proper hybridization conditions. The strategy also requires that the selected DNA be amplified by polymerase chain reaction (PCR) after each or each alternate subtraction. By contrast, the strategy selecting tester/tester plus single-stranded tester is less sensitive to various experimental factors than the strategy selecting only tester/tester, but it is not as efficient. With this strategy, the tester DNA selected may or may not be amplified by PCR before the next round. In the case of the strategy selecting single-stranded tester, the target DNA can be successfully enriched only when the selected DNA is used directly, without PCR amplification, in the next round. The strong features of existing methods can be combined to develop a protocol that is more efficient and more reliable.
Subtractive hybridization has been successfully applied to clone mRNA sequences that are more abundant in one mRNA population than in another, or to clone a DNA that is deleted in a mutant genome (1). In subtractive hybridization, the target sequence in one nucleic acid population ('tester') is enriched by hybridizing with an excess amount of another population ('driver') containing either less or no target sequence, and then by removing the common sequences from the tester. Milner et al. (2) used a kinetic model for subtractive hybridization to simulate the outcome of cDNA or genomic subtraction. They tested the model by comparing its predictions with the published results, and reported that the predictions were generally in good agreement with the published data. However, they also noted that there are discrepancies between the predictions and the experimental results in some cases, particularly when the target sequence is of low abundance. The apparent discrepancies might have arisen because the efficiency of the subtractive enrichment procedure was not taken into consideration in the kinetic model.
Removal of common sequence is achieved by various strategies. Commonly used protocols employ biotinylated driver DNA. In these protocols, the tester/driver hybrids are removed using streptavidin or avidin (3-6). The hybrids are also removed by various other means, including hydroxyapatite chromatography (7-9), chemical cross-linking (10) or enzymatic degradation (11). In some cDNA subtractions, cDNA covalently linked to a latex particle is used as driver, and the hybrid is removed by simple centrifugation (12,13). All of the enrichment strategies are designed to remove the tester hybridized with driver; however, the removal is often incomplete for various reasons (1). If the tester/driver hybrid is not completely removed, target enrichment can be seriously affected. The efficiency of subtractive hybridization is also expected to be influenced by other key factors, such as driver:tester ratio, DNA concentration and salt concentration, and by the enrichment strategy employed. Although it is evident that the efficiency of subtractive hybridization varies with a variety of factors, how much each factor influences the efficiency of target enrichment has not been thoroughly examined.
In this study, simulations have been carried out using a model genome system, in an attempt to examine various hybridization conditions and enrichment strategies employed in subtractive enrichment. Simulations were also performed for the genomic subtraction experiments done by Lisitsyn et al. (14), Wieland et al. (7), Straus and Ausubel (5) and Sun et al. (6) to compare the predictions with the published results. In the simulations, the efficiencies of subtraction process and biotinylation reaction were incorporated as input variables to accommodate various experimental situations. Simulation of genomic subtraction has the advantage that more thorough examination can be done since the information on a genome, such as genomic size, abundance class data and the size of target DNA, is in general available.
When a DNA sample is composed of several abundance classes, the absolute concentration of each class depends on the fraction of the class among total DNA. Assuming that DNA reassociation follows the ideal second-order kinetics (15), the t1/2, the time when reassociation is half complete, of a particular class i is determined by the equation:
$$t_{1/2}^{i} = \frac{1}{k_i C_{oi}} \qquad (1)$$
where Coi is the concentration of a particular class i in moles of nucleotides per liter. The rate constant ki can be calculated by the following equation formulated by Milner et al. (2):
$$k_i \propto \frac{Z\,L^{1/2}}{G_i} \qquad (2)$$
where Z is the enhancement factor that depends on salt concentration (15), Gi is the complexity of a specific class i in bp and L is the length of the shorter of the two reacting complementary strands in bp. The rate constant is proportional to the square root of L (16).
The fraction of single-stranded DNA (Fss) for a class after hybridization time t can be calculated by the equation:
$$F_{ss} = \frac{1}{1 + k_i C_{oi}\, t} \qquad (3)$$
Since $k_i C_{oi} = 1/t_{1/2}^{i}$ (from equation 1),
$$F_{ss} = \frac{1}{1 + t/t_{1/2}^{i}} \qquad (4)$$
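As a minimal sketch of the kinetics described above, the following Python functions compute the half-time and the single-stranded fraction of one abundance class under ideal second-order reassociation. The function names and the numerical values of `k` and `c0` are illustrative assumptions and are not taken from the simulation program.

```python
import math

def half_time(k, c0):
    """Reassociation half-time of one class: t_half = 1 / (k * c0)."""
    return 1.0 / (k * c0)

def fraction_single_stranded(t, k, c0):
    """Single-stranded fraction after time t, written with the rate constant."""
    return 1.0 / (1.0 + k * c0 * t)

def fraction_single_stranded_from_half_time(t, t_half):
    """The same fraction, written with the half-time instead of k * c0."""
    return 1.0 / (1.0 + t / t_half)

# Hypothetical values, chosen only to exercise the functions.
k = 50.0     # rate constant, in 1 / (mol nucleotides per liter * h)
c0 = 0.004   # concentration of the class, in mol nucleotides per liter
t_half = half_time(k, c0)

# Both forms give the same single-stranded fraction after 20 h.
f_rate = fraction_single_stranded(20.0, k, c0)
f_half = fraction_single_stranded_from_half_time(20.0, t_half)
```

By construction the single-stranded fraction is one half when t equals the half-time, which is a quick check on any implementation.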
The amount of single-stranded tester DNA of class i (ssTi) after time t can then be calculated by the equation:
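$$\mathrm{ssT}_i = \mathrm{Tester}_i \times F_{ss} = \frac{\mathrm{Tester}_i}{1 + t/t_{1/2}^{i}} \qquad (5)$$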
where Testeri is the absolute amount of class i tester DNA.
The fraction of double-stranded DNA is (1 - Fss). The double-stranded tester after hybridization would be in one of two forms: tester/tester (TT) and tester/driver (TD). The amount of TT of class i is calculated by the equation:
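$$\mathrm{TT}_i = \mathrm{Tester}_i\,(1 - F_{ss})\,\frac{\mathrm{Tester}_i}{\mathrm{Tester}_i + \mathrm{Driver}_i} \qquad (6)$$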
where Driveri is the absolute amount of class i driver DNA. Likewise, the amount of tester DNA of class i in TD form is calculated by the equation:
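$$\mathrm{TD}_i = \mathrm{Tester}_i\,(1 - F_{ss})\,\frac{\mathrm{Driver}_i}{\mathrm{Tester}_i + \mathrm{Driver}_i} \qquad (7)$$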
The amount of single-stranded driver or double-stranded driver/driver can be calculated in a similar way. The amount of driver in TD form is assumed to be equal to the amount of tester in the TD.
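The partition of tester DNA among its three forms can be sketched as follows. This is an illustration under a stated assumption: a reassociated tester strand is taken to pair with a tester or a driver partner in proportion to the amounts of each in the class. The function name `partition_tester` and the example amounts are hypothetical.

```python
def partition_tester(tester, driver, f_ss):
    """Split the tester DNA of one class into (ssT, TT, TD) amounts.

    tester, driver: absolute amounts of the class in tester and driver.
    f_ss: single-stranded fraction of the class after hybridization.
    Assumes random pairing, so the tester share of partners is
    tester / (tester + driver).
    """
    ss_t = tester * f_ss
    double_stranded = tester * (1.0 - f_ss)
    total = tester + driver
    tester_share = tester / total if total > 0 else 0.0
    tt = double_stranded * tester_share
    td = double_stranded * (1.0 - tester_share)
    return ss_t, tt, td

# Hypothetical common class: 100-fold driver excess, 5% still single-stranded.
ss_t, tt, td = partition_tester(tester=1.0, driver=100.0, f_ss=0.05)

# A class absent from driver (such as a tester-only target) forms no TD.
_, _, td_target = partition_tester(tester=1.0, driver=0.0, f_ss=0.9)
```

Under this assumption the three forms add up to the tester input, and the driver found in TD, computed the same way from the driver side, equals the tester found in TD.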
Using the information (L, hybridization time t, fraction of each abundance class, number of different DNA species in a class, DNA concentration, and salt concentration) provided by a user, the computer program calculates t1/2 of each abundance class using equations 1 and 2. The computer also calculates the amount of single-stranded or double-stranded tester of the particular class after time t of hybridization using equations 4-7. Then, depending on the selection strategy and the efficiencies of subtraction and tester DNA recovery, the computer calculates the amount of selected DNA for a given class. After each round of subtractive hybridization, the fraction of a specific class among total DNA is again calculated, and used in the next round. The calculations are reiterated as long as the user wants to continue. The simulations were run using QBasic (MS DOS).
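The iteration described above can be outlined in Python. This is a sketch under several assumptions, not a port of the QBasic program: the rate constant `k` of each class is supplied directly rather than computed, tester is restored to its initial concentration after every round (as with PCR amplification after each subtraction), and forms that are not selected carry over at a rate of one minus the removal efficiency. All names and the toy two-class genome are hypothetical.

```python
from dataclasses import dataclass, replace

# Tester forms kept by each selection strategy.
STRATEGIES = {"ssT+TT": ("ssT", "TT"), "TT": ("TT",), "ssT": ("ssT",)}

@dataclass(frozen=True)
class DnaClass:
    name: str
    tester_fraction: float  # share of total tester DNA
    driver_fraction: float  # share of total driver DNA (0.0 if absent)
    k: float                # assumed rate constant, 1 / (concentration * h)

def subtract_once(classes, tester_conc, driver_conc, hours, strategy,
                  removal=0.99, recovery=0.99):
    """Return the tester fraction of each class after one subtraction."""
    keep = STRATEGIES[strategy]
    selected = []
    for c in classes:
        tester = c.tester_fraction * tester_conc
        driver = c.driver_fraction * driver_conc
        c0 = tester + driver
        if c0 == 0.0:
            selected.append(0.0)
            continue
        f_ss = 1.0 / (1.0 + c.k * c0 * hours)
        duplex = tester * (1.0 - f_ss)
        forms = {"ssT": tester * f_ss,
                 "TT": duplex * tester / c0,
                 "TD": duplex * driver / c0}
        # Selected forms are recovered; the rest leak through at (1 - removal).
        selected.append(sum(
            amount * (recovery if form in keep else 1.0 - removal)
            for form, amount in forms.items()))
    total = sum(selected)
    return [amount / total for amount in selected]

def simulate_paas(classes, rounds, **conditions):
    """Repeat subtraction, feeding each round's fractions into the next."""
    history = [[c.tester_fraction for c in classes]]
    for _ in range(rounds):
        fractions = subtract_once(classes, **conditions)
        classes = [replace(c, tester_fraction=f)
                   for c, f in zip(classes, fractions)]
        history.append(fractions)
    return history

# Toy genome with made-up values, chosen only to exercise the code.
genome = [DnaClass("common", 0.99, 1.0, k=1.0),
          DnaClass("target", 0.01, 0.0, k=1.0)]
history = simulate_paas(genome, rounds=3, tester_conc=0.1, driver_conc=10.0,
                        hours=20.0, strategy="ssT+TT")
target_by_round = [fractions[1] for fractions in history]
```

Passing a different `strategy` key, or lower `removal` and `recovery` values, is how the sensitivity of each selection strategy could be explored in such a sketch.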
The model genome used for simulation is shown in Table 1. It was assumed that there are three abundance classes, since there are typically three reassociating components (slow, intermediate and fast components) in the eukaryote genome (17). Class 1 corresponds to single-copy sequences. Classes 2 and 3 represent repetitive DNAs whose copy numbers are 100 and 5000, respectively. Class 4 represents the target sequence, which is a single-copy sequence of 5 kb and present only in tester. The genomic size of the model system is 2.5 × 10⁹ bp. The abundance class data on human and Arabidopsis thaliana genomes are the same as used by Milner et al. (2), except that the snapback/highly repetitive sequences in the human genome are included in the fast reassociating component. It was assumed that the genome of yeast is 2 × 10⁷ bp in size with no repetitive sequences. In the simulation, the DNA concentrations of tester and driver DNA were 0.1 and 10 mg/ml, respectively, unless stated otherwise. The standard Z value was 7, which is equivalent to 1.0 M Na+, and the standard hybridization time was 20 h.
a Classes 1, 2 and 3 represent common sequences. The target sequence represented by class 4, which is 5 kb in size, is single-copy and present only in tester.
b Average length of the DNA is 250 bp.
The subtractive hybridization procedure for modeling is depicted in Figure 1. After hybridization, the sample DNA would be in one of five different forms: single-stranded tester (ssT), double-stranded tester/tester (TT), double-stranded tester/driver (TD), single-stranded driver (ssD), and double-stranded driver/driver (DD). Based on the types of the selected tester DNA after subtraction, existing methods can be categorized into three selection strategies. One of the frequently used methods to remove common sequence in tester is that the driver DNA is biotin-labeled and the hybrid tester (TD) is removed using streptavidin (3,4). In this case, both ssT and TT are selected: therefore, this procedure is categorized as 'ssT+TT selection' strategy. In many cDNA subtractions, single-stranded tester is hybridized with single-stranded driver. In these protocols, first strand cDNA tester is usually hybridized with excess driver mRNA or cDNA. Alternatively, single-stranded phagemids with directional inserts are used to provide driver and tester (18,19). These methods using single-stranded tester and driver are basically the same as 'ssT+TT selection' in that kinetically unhybridized common sequences are not removed.
In this strategy, the tester DNA is amplified by PCR after each subtraction, and then used as a new tester in the subsequent round. This enrichment strategy is here termed PCR amplification after each subtraction (PAAS). The simulation of the subtractive enrichment employing PAAS was performed with the model genome. The results are shown in Figure 2. In this simulation, the efficiency of TD removal was assumed to be 99%. It was also assumed that 99% of TT and/or ssT was recovered in each subtraction round. The experimental conditions are kept the same throughout the subtraction rounds as in the initial conditions. The target sequence is enriched to >90% of total DNA after round 9 when ssT+TT selection strategy was employed (Fig. 2A). When TT selection strategy was employed, similar enrichment level can be achieved after round 6 (Fig. 2B). In the case of ssT selection, the target sequence is enriched ~100 times after round 5, then little enrichment is achieved afterwards (Fig. 2C). The maximum enrichment level achieved with the ssT selection strategy is largely determined by driver:tester (D:T) ratio: when D:T ratio is 1000:1, then the target sequence is enriched up to 1000 times (data not shown). The simulation shows that TT selection strategy is far better than the other two selection strategies. However, as will be described shortly, the TT selection strategy is very sensitive to subtraction efficiency. Furthermore, presence of unbiotinylated driver DNA, where applicable, can greatly affect the target enrichment.
The outcome of subtractive hybridization can be very different depending on various experimental factors, as exemplified by the experiment of Lisitsyn et al. (14). Simulation was performed to further examine the effects of TD removal efficiency, D:T ratio, DNA concentration and salt concentration on target enrichment. Since biotinylated DNA is frequently used, the impact of unbiotinylated DNA was also investigated. The simulation was carried out using the model genome for the subtractive hybridization employing the PAAS strategy. The ssT selection strategy is not considered here, since it is inappropriate for use with PAAS.
Figure 3 shows the result of the simulation where percent TD removal efficiency was varied. The efficiencies of ssT removal and TT recovery were all assumed to be 99%. In TT selection strategy, it is shown that the target sequence enrichment is greatly affected by the efficiency of TD removal. It also shows that there is a situation where enrichment is not achieved. In the model genomic subtraction with TT selection strategy, the target sequence is not enriched but decreases as subtraction rounds proceed when the efficiency of TD removal is 94% or less. In the case of ssT+TT selection strategy, the efficiency of target sequence enrichment is not considerably affected by the low TD removal efficiency. The impact of subtraction efficiency also depends on the abundance of target sequence. Suppose that the target sequence of the model genome constitutes 0.002% of total tester (10 times more abundant compared with the model genomic system). Then the target sequence is enriched by the TT selection strategy even if the TD removal is less efficient: the target sequence is enriched to >90% after seven rounds even when TD removal efficiency is as low as 90% (data not shown). Previously, genomic sampling was shown to be a key to the success of target sequence enrichment in the experiment of Lisitsyn et al. The genomic sampling procedure in effect resulted in increase of the abundance of a particular target sequence in accordance with the reduction of complexity.
The tester DNA can be amplified after each alternate subtraction. Although this strategy has been employed in cDNA subtractions (11,22), it can also be applied to genomic subtraction. This strategy is termed PCR amplification after consecutive subtraction (PAACS). Figure 6 shows the result of the simulation of the subtractive enrichment employing this strategy. Under standard conditions of subtractive hybridization with PCR amplification of tester DNA after every two cycles of subtraction, the target sequences are enriched to >90% after five rounds (10 subtraction cycles) when ssT+TT selection strategy is employed (Fig. 6A). This result is similar to that obtained with PAAS. The effects of TD removal efficiency, D:T ratio, DNA and salt concentrations, and unbiotinylated driver are also similar to those in the case of PAAS. One major difference could be that the PAACS strategy is relatively insensitive to the D:T ratio. Similar levels of enrichment (>90%) can be obtained after five rounds at D:T ratios ranging from 10:1 to 10 000:1 (data not shown).
Figure 6. Simulation of the subtractive hybridization with PCR amplification after each alternate subtraction. In this simulation, the tester DNA selected is amplified by PCR after two cycles of consecutive subtraction, and then used in the next round of subtractive hybridization after adjustment to the initial concentration. Thus, one round involves two cycles of subtraction. The other experimental conditions and assumptions are the same as described in Figure 2. The outcomes of simulation employing ssT+TT (A) and TT (B) selection strategies are represented.
In PAACS strategy, target sequence can also be enriched by TT selection, as shown in Figure 6B. In the TT selection strategy, an optimum D:T ratio exists as in the case of PAAS. However, the dependency is even greater. Furthermore, this strategy is highly sensitive to TD removal efficiency and presence of unbiotinylated driver (data not shown). Previously, the TT selection strategy in PAAS was shown to be more sensitive to TD removal efficiency and presence of unbiotinylated DNA at higher D:T ratios. In PAACS, the D:T ratio becomes higher in the second subtraction cycle, since a significant amount of tester DNA is removed in the first cycle. Thus, it is not surprising that the PAACS procedure with TT selection is more sensitive to TD removal efficiency and presence of unbiotinylated driver, compared with the PAAS procedure. It is then predicted that the TT selection in the PAACS strategy would be more efficient and more reliable at higher tester concentration. Indeed, when the tester concentration is increased to 1 mg/ml (10 times higher than under standard conditions), the target DNA is more efficiently enriched: only three rounds are needed to enrich the targets to >90%. Moreover, at this higher tester DNA concentration, the enrichment procedure is less sensitive to the subtraction efficiency and presence of unbiotinylated driver (data not shown).
In the ssT selection strategy, the target sequences are enriched to 0.074% (~370-fold increase) after the third round, and little enrichment is achieved afterwards. This result is similar to that obtained with PAAS, although the efficiency of target sequence enrichment is a little higher. Again, the outcome depends strongly on the D:T ratio.
In a strategy where subtractions are performed without intermittent PCR amplification, the remaining DNA after a subtraction round is directly used in the next round. This strategy is known as consecutive subtraction (CS). In this case, PCR amplification of tester DNA is performed only after a final subtraction round. In genomic subtraction, this CS strategy was employed by Wieland et al. (7), Straus and Ausubel (5) and Sun et al. (6).
In PAAS, the initial D:T ratio can be maintained throughout the subtractive hybridization rounds. In the CS strategy, the D:T ratio inevitably changes. Simulation was performed using the model genome for subtractive hybridization employing the CS strategy. The results are shown in Figure 7. The simulation was carried out with the assumption that the subtraction efficiency was 99% as in the case of PAAS. When ssT+TT selection strategy is employed, the target DNA can be enriched to >90% after nine rounds of subtraction (see Fig. 7A), which is comparable with the result obtained by the PAAS strategy shown in Figure 2. Different results were obtained in the case of TT or ssT selection strategy. In TT selection strategy, the target sequences are not enriched. This is because the target DNA is in low concentration and only a small fraction of the target DNA can self-anneal to form TT. In subsequent rounds, the target DNA will have less and less chance to form TT; consequently, it fades away as subtraction rounds proceed. The results with ssT selection strategy also show a sharp contrast to the results obtained with PAAS strategy. The target DNA can be enriched to >90% after eight rounds of subtraction, as shown in Figure 7B.
Figure 7. Simulation of the subtractive hybridization without PCR amplification after subtraction. In this simulation, the remaining DNA from a subtraction round is directly used in the next round of subtractive hybridization without PCR amplification. Thus, the driver:tester ratio increases as subtraction rounds proceed. The other experimental conditions and assumptions are the same as described in Figure 2. The outcomes of simulation employing ssT+TT (A) and ssT (B) selection strategies are represented.
In CS strategy, the unbiotinylated driver, if any, is added in each subtraction cycle. Should the unbiotinylated driver accumulate, this will lead to a decrease in enrichment efficiency. Simulation shows, however, that the additive accumulation of unbiotinylated DNA does not occur. Suppose that 10% of driver is unbiotinylated at the single-stranded DNA level. The situation was simulated for the model genome with the assumption of complete removal of biotinylated DNA, that is, 100% subtraction efficiency and 100% recovery. The result shows that, after the first cycle of subtractive hybridization and addition of fresh driver, the fraction of unbiotinylated DNA increases from 10% to 12.2% (class 1), 11.0% (class 2) and 11.0% (class 3). After the second cycle, the fraction increases to 12.8% (class 1), 11.1% (class 2) and 11.1% (class 3). However, the fraction of unbiotinylated DNA does not increase very much afterwards, that is, the system has reached a steady state where input and output of unbiotinylated DNA are nearly equal. Similar results were obtained when it was assumed that biotinylated driver DNA was not completely removed (data not shown). Nevertheless, the increased proportion of unbiotinylated DNA reduces the efficiency of target enrichment.
That 10% of driver is unbiotinylated does not mean that 10% of the driver accumulates in each subtraction round. If the unbiotinylated DNAs were in single-stranded form, they would not be removed by streptavidin. If the unbiotinylated DNA forms driver/driver double-stranded DNA with biotinylated DNA, however, streptavidin can capture the DNA. Under this circumstance, only 1% of total DD is unbiotinylated on both strands. Therefore, only a fraction of unbiotinylated driver DNA remains in the subtracted DNA. Even for the unremoved unbiotinylated DNA, there is a chance of forming 'biotinylated' double-stranded DNA in the next round of hybridization.
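The 1% figure follows from simple probability, as the short sketch below illustrates. It assumes that driver strands pair with one another independently of their biotinylation status; the function name is hypothetical.

```python
def dd_fraction_escaping_capture(unbiotinylated):
    """Fraction of driver/driver duplexes with no biotin on either strand.

    unbiotinylated: fraction of single driver strands lacking biotin.
    Assumes strands pair at random, so both strands are unlabeled with
    probability unbiotinylated ** 2.
    """
    return unbiotinylated ** 2

# With 10% of driver strands unbiotinylated, streptavidin can still
# capture every duplex that carries biotin on at least one strand.
escaping = dd_fraction_escaping_capture(0.10)
capturable = 1.0 - escaping
```

The quadratic dependence is why the escaping duplex fraction is much smaller than the unbiotinylated strand fraction.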
Wieland et al. (7) employed an ssT selection strategy using biotinylated tester DNA to enrich target DNA in the human genome. In the experiment, the double-stranded DNA including TD was removed by hydroxyapatite chromatography. After three rounds of subtraction, they obtained between 100- and 700-fold enrichment of target sequences. Assuming 99% subtraction and recovery efficiencies, the simulation of the experiment using bacteriophage λ DNA as a target shows that the target DNA is enriched by ~400-fold after three rounds, generally consistent with the published result. The simulation also shows that the target DNA can be enriched to >90% if seven rounds of subtraction are performed. In this case, the presence of unbiotinylated DNA has negligible effect, since the TD removal procedure using hydroxyapatite chromatography has nothing to do with biotinylated DNA. The only effect of unbiotinylated tester is that the yield of the target DNA after the final selection step using avidin can be lower. However, the low recovery would not be a big problem, considering that PCR is performed after the avidin selection. Thus, the enrichment efficiency is mainly determined by the efficiency of the method separating single-stranded and double-stranded DNA. If the efficiency of the subtraction using hydroxyapatite is 90%, nine subtraction rounds are required to achieve target DNA enrichment to >90%. If the subtraction efficiency drops to 80%, 11 rounds are needed (data not shown). Therefore, it is important to remove as much double-stranded DNA as possible, even if some single-stranded DNA is lost during the process.
Straus and Ausubel (5) employed a CS strategy to clone a DNA corresponding to a deletion mutation in yeast. After repeated rounds of subtraction using biotinylated driver (ssT+TT selection), only the tester/tester DNA was selected in a final step. When their experiment was simulated, the target DNA was shown to be enriched to 46% and 99% after the second and the third rounds, respectively, assuming that the overall efficiency of the procedure is 99%, i.e., 99% efficiency of subtraction, recovery and biotinylation. The prediction based on the assumption of 99% efficiency is not consistent with the experimental results, indicating some inefficiency in the subtractive enrichment procedure. When assuming that TD removal efficiency is 90% and that unbiotinylated single-stranded DNA was 10% of total driver DNA, the target sequences would be enriched to 8.6% after the second and 70.8% after the third round, which might reflect the experimental results. It should be mentioned here that the target DNA was significantly enriched by the final TT selection procedure. Without the TT selection, the target sequences constitute only 0.5% and 2.5%, respectively, after the second and third rounds.
Figure 8. Target DNA enrichment at each subtraction round. The values on the y-axis represent the fold-increase of the proportion of the target DNA after each subtraction round. The fold-increase was calculated from the simulation of the model genomic subtraction with 99% subtraction efficiency and 1% unbiotinylated DNA (at single-stranded level). The target sequences are enriched to >90% after seven rounds of subtractive hybridization in the case of TT selection, and after nine rounds in the case of ssT+TT selection. Therefore, little enrichment of target DNA is achieved after the seventh or the ninth round.
Sun et al. (6) extended the method of Straus and Ausubel to clone a deleted DNA in Arabidopsis. In the subtractive cloning of a 5 kb target in Arabidopsis, they found that the target sequences were enriched to only ~5% after five cycles. Moreover, little enrichment occurred after two additional subtraction cycles. When the experiment was simulated with 99% subtraction efficiency, the target sequences were found to be enriched to >90% after three cycles. The discrepancy can be explained by assuming that the overall efficiency is far lower than 99%. However, even in this case, the target fraction would increase considerably with the two additional subtraction cycles. This rather perplexing result can be partly explained by a contaminant in the tester, whose presence was examined and confirmed by the authors. The situation was simulated with the assumption that contaminating DNA of 5 × 10⁶ bp in complexity constitutes 0.5% of tester DNA. Under this circumstance, the target DNA was shown to be enriched to 4.7% with the contaminating DNA being enriched to 95.3% after five cycles, if the subtraction efficiency was 90% and if 10% of single-stranded driver DNA were unbiotinylated. In this case, the target DNA fraction does not change significantly after two additional subtraction cycles.
The outcome of subtractive hybridization is greatly influenced by the genomic size or the abundance of target sequence. Reduction of complexity, hence increase of target sequence abundance, can be accomplished by genomic sampling, as manifested by Lisitsyn et al. (14). Once the abundance of target sequence is increased, the enrichment procedure becomes more efficient and less sensitive to subtraction efficiency and presence of unbiotinylated DNA. Another strategy to make the enrichment procedure more efficient and more reliable might be employing a combination of selection strategies, as used by Straus and Ausubel (5) and Sun et al. (6). Figure 8 shows the enrichment achieved either by ssT+TT or by TT selection strategy at each subtraction round in the model genomic subtraction employing PAAS, with 99% subtraction efficiency and 1% unbiotinylated driver DNA. It shows that ssT+TT selection is more efficient at the first two subtraction rounds. From the third round, however, TT selection is far better than the ssT+TT selection. This is because the target DNA is sufficiently enriched after two rounds of subtraction and PCR, so that kinetic enrichment can be greatly enhanced.
Considering these data, it would be better to perform initial subtractions by ssT+TT selection strategy, and then perform subtractions by TT selection strategy. Assuming that the subtraction efficiency is 99%, and unbiotinylated driver is 5% (at single-stranded DNA level), the target sequence in the model genome is enriched to >90% after 10 rounds by ssT+TT selection alone in PAAS. When TT selection strategy is employed, the target DNA is not enriched. However, if ssT+TT selection strategy is employed in the first two rounds of subtraction, the enrichment of target DNA to >90% can be achieved by five rounds of TT selection (seven rounds in total).
The same strategy can be applied to PAACS, resulting in enhancement of target sequence enrichment. In PAACS with PCR amplification after every two cycles of subtraction, the target sequence in the model genome is enriched to >90% after five rounds (10 subtraction cycles) of ssT+TT selection, assuming that the subtraction efficiency is 99%, and unbiotinylated single-stranded DNA is 5%. TT selection gives no enrichment. When TT selection strategy is employed after one round of ssT+TT selection and PCR amplification, however, the target sequence is enriched to >90% after three rounds (four rounds in total). The strategy can also be applied to CS protocols. However, in this case, the strategy may result in initial increase and then decrease of target DNA fraction. For example, when the subtraction is performed with TT selection after two rounds of ssT+TT selection, then the target sequence is enriched to 35% after five cycles of TT selection, and decreases thereafter.
Successful genomic subtraction requires effective removal of tester/driver hybrids and, where applicable, efficient biotinylation. DNA and salt concentrations must also be high enough to ensure good hybridization. However, precautions should be taken since hybridization may be hampered at high salt concentration due to high Tm (3), and hybridization rate may be reduced by increased viscosity at high DNA concentration (15).
Successful target sequence enrichment also depends on the enrichment strategy employed. The simulation showed that the ssT+TT selection strategy can be widely used in the experiments employing PAAS, PAACS and CS strategies and that it is relatively less sensitive to various experimental factors, though it is not as efficient as the TT selection strategy. The TT selection strategy can be highly efficient; however, it is very sensitive to various experimental conditions. With TT selection strategy, only a fraction of target DNA would be saved. In contrast, almost all of the target DNA can be saved in the ssT+TT selection protocol. Therefore, with the same subtraction efficiency (same amount of unremoved TD), the proportion of the non-target DNA in the subtracted DNA is far greater when only TT is selected. This explains why TT selection strategy is so sensitive to subtraction efficiency. In TT selection, the efficiency of ssT removal is also an important factor. Previously, it was shown that the target enrichment does not occur with 94% TD removal efficiency in the model genomic subtraction employing TT selection (Fig. 3). In that simulation, the efficiency of ssT removal was assumed to be 99%. Interestingly, the target can be enriched to >90% after nine rounds if the subtraction efficiencies for TD and ssT are both 94%. This shows that an effort to remove as much ssT as possible without ensuring high TD removal efficiency can have a detrimental effect on target enrichment.
The protocol employing ssT+TT selection strategy is subtractive in nature, whereas the protocol employing TT selection strategy has a kinetic enrichment component as well. Incorporating kinetic enrichment in subtractive hybridization can greatly accelerate target enrichment, though it is difficult to achieve. To maximize the kinetic enrichment component, the tester DNA concentration must be deliberately lowered to an appropriate level, as in the experiment of Lisitsyn et al. (14). However, the consequent high D:T ratio can reduce the overall efficiency of target DNA enrichment (Fig. 4), as long as the subtraction efficiency is below 100%. The risk of a high D:T ratio is also manifested in the TT selection protocol employing CS strategy, where the target sequence is not enriched at all.
The TT selection strategy employed by Lisitsyn et al. is elaborately designed to select TT with high efficiency. The only disadvantage would be that the procedure is very complicated. An alternative method for TT selection could be the EDS method developed by Zeng et al. (11). In this protocol, however, removal of unhybridized single-stranded DNA may not be efficient due to the intramolecular or intermolecular annealing between linkers. To remove single-stranded DNA effectively, excess of a primer which has a sequence complementary to the linker at the 5' end of tester DNA can be added just after hybridization. Alternatively, S1 nuclease or mung bean nuclease treatment may be included to ensure that single-stranded DNA is removed to completion. However, the very fact that single-stranded tester DNA is not efficiently removed without appropriate measures can be exploited to develop a more efficient protocol. If necessary, the DNA mixture after hybridization may be slowly cooled to room temperature so that the linkers of single-stranded testers can be annealed with each other. This protocol may be applied in the early rounds of subtractive hybridization to save as much target DNA as possible. In the later rounds, a strict protocol selecting only tester/tester DNA can be used to enhance target sequence enrichment. As mentioned before, this strategy can greatly improve the efficiency and reliability of a subtractive enrichment procedure.
TT selection strategy is effective in eliminating contaminating sequences, provided that the frequency of the contaminating sequence is lower than that of the target sequence. As discussed in the case of the experiment of Sun et al., the contaminating sequence present in tester can limit the target sequence enrichment. If the tester DNA sample of the model genome is contaminated with bacteria of 5 × 10⁶ bp in complexity in the proportion of one microorganism per five sample cells, the 5 kb target and the contaminating DNA constitute 0.0002% and 0.02% of tester, respectively. In ssT+TT selection procedure, both the target and the contaminating sequence are equally saved and amplified. Thus, the 1:100 ratio is maintained throughout the enrichment rounds and the target DNA will never be enriched to >1%. When TT selection strategy is employed, however, the target sequence is successfully enriched to >90% without any noticeable effect from the contaminating sequence. This is because self-reassociation is strongly in favor of the target DNA under the circumstance that the frequency of target sequence is 10 times that of the contaminating sequence.
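The figures in this example can be checked with a few lines of arithmetic. The sketch below takes 'frequency' to mean the mass fraction of a sequence divided by its complexity, which is one interpretation, and it treats the ceiling as the best case in which all common sequences are removed completely. The variable names are hypothetical.

```python
# Mass fractions of tester DNA quoted in the text.
target_fraction = 0.0002 / 100       # 0.0002% of tester
contaminant_fraction = 0.02 / 100    # 0.02% of tester

# Sequence complexities in bp.
target_size_bp = 5e3
contaminant_complexity_bp = 5e6

# Mass ratio of contaminant to target (the 1:100 ratio in the text).
mass_ratio = contaminant_fraction / target_fraction

# Best-case target share under ssT+TT selection, if only target and
# contaminant remain and their mass ratio never changes.
ceiling = target_fraction / (target_fraction + contaminant_fraction)

# Relative frequency of each distinct sequence (fraction per bp of complexity).
frequency_ratio = ((target_fraction / target_size_bp)
                   / (contaminant_fraction / contaminant_complexity_bp))
```

The ceiling works out to 1/101, just under 1%, and the frequency ratio to 10, matching the two statements made in the text.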
If tester is contaminated, the above-mentioned strategy using a combination of ssT+TT and TT selection strategies can pose a problem, since the initial ssT+TT selection can also enrich the contaminating sequence to a point where target enrichment may no longer be favored in the TT selection strategy. Yet, it is still possible to eliminate the contaminating sequences after a subtraction experiment, provided that the contaminating organism(s) are identified: another subtraction experiment using the DNA of the contaminating organism (or closely related organism) as driver can eliminate the contaminating DNA from the enriched DNA. If the frequency of contaminating sequence is higher than that of target sequence, the target sequence diminishes as subtraction proceeds since enrichment of the contaminating sequence is more favored. Therefore, it is important that the sample used for preparation of tester be germ-free. One way to alleviate the problem with contaminating sequences could be for the DNA of potential pathogens or common inhabitants to be added to driver DNA in advance.
We would like to thank Dr Nam-Jeong Cho for his helpful comments. This work was partly supported by grants from the Korea Science and Engineering Foundation.