
© 1996 Oxford University Press 4495-4500


Mapping genes within a YAC by computer-assisted interpretation of partial restriction digestions

Denis C. Shields*, Aileen Butler, Kris R. Mosurski 1, Marie T. Walsh and Alexander S. Whitehead

Department of Genetics and 1 Department of Statistics, Trinity College, Dublin 2, Ireland

Received July 30, 1996; Revised and Accepted October 8, 1996

ABSTRACT

Partial restriction digestion is used to map restriction sites and the location of genes within yeast artificial chromosomes (YACs). Locus-specific probes are hybridised to the partially digested YAC DNA and the fragments to which they hybridise are compared with the pattern of partial digestion products that include each map region. A least squares criterion is presented which allows for error in fragment length determination. This rapidly defines the most likely location of a marker within the restriction map and permits the combination of results from digestions with different restriction enzymes. Approximate confidence intervals may be assigned to gene locations, and tests of goodness-of-fit of the data may be performed. Since the number of erroneously matched fragments increases in proportion to the square of the number of sites, denser maps are not necessarily more informative. Simulations indicate that the optimal number of internal restriction sites given typical experimental error (1% of YAC length) is about five sites; the associated broad support interval (on average one third of YAC length) may be reduced by combining results from different enzyme digestions. Application of a computer implementation of this model to experimental data showed that the model fitted well, and estimates of location were found to be consistent with other evidence.

INTRODUCTION

The physical mapping of large segments of genomic DNA has been greatly facilitated by their manipulation in yeast artificial chromosomes, or YACs ( 1 ). While the long-range and medium-range connectivity is provided by linkage and radiation hybrid maps, and the highest mapping resolution is obtained by sequence determination, it is often of interest to map markers more precisely within a 1000 kb interval. A variety of approaches are possible, including fibre-FISH ( 2 ), analysis of overlapping YACs and restriction mapping within YACs.

Restriction mapping of YACs presents a number of difficulties. Large DNA fragments are much less prone to shearing when protected in agarose, but this inhibits restriction enzyme activity, so that it is rarely possible to obtain a complete digestion without partially digested products. An established method ( 3 - 5 ) which is appropriate for the analysis of an individual YAC is partial restriction mapping using pulsed field gel electrophoresis (PFGE). A given restriction map is constructed by probing partially digested YAC DNA with DNA markers from the ends of the YAC (which contain yeast marker genes from the original YAC vector), thus identifying ladders of fragments whose sizes correspond to the distance of each restriction site from the respective ends of the YAC ( 1 , 6 ). A gene contained within the YAC can then be hybridised to the partial digestion, yielding a characteristic pattern of signals. The position of the gene within the partial restriction map may then be determined by relating the probe hybridisation pattern to the fragments mandated by the partial restriction map. However, the number of potential partial restriction products increases in proportion to the square of the number of restriction sites, making manual comparison of the observed fragments with expectations arduous, and open to subjective interpretation. For this reason, the approach has been mainly used with rare cutting enzymes which only cut a few times in a given YAC. Here, we show how computer-assisted comparison of observed and expected fragment sizes can greatly speed up analysis of restriction data by allowing rapid interpretation of digestions with a number of sites, by combining information from a number of different enzymes, and by assigning approximate confidence intervals.

THEORY AND METHODS

A number of groups have presented elaborate computational approaches to the analysis of partial restriction data. However, they have in common the goal of assigning the marker locations from the fragment sizes generated by hybridisation of the unknown marker on its own, without consideration of the restriction map data provided by the end-markers. This is computationally intense since, if there are r restriction sites, there are (r - 1)!/2 possible orders (assuming that all completely digested fragments are identified). While simulation has suggested that it may be possible to construct maps in the presence of realistic levels of error and with missing fragments when there are <10 restriction sites ( 7 ), this approach has yet to be applied in practice. Other groups have presented simulations ( 8 ) with 11 sites where the number of orders is reduced to ~r^2/4 by hybridising with a probe which is located within the region ( 9 ). Their results indicate that this approach is quite sensitive to error, e.g. a 1% error rate resulted in only 60% correct recovery of the true restriction map ( 8 ). The method discussed here depends on the availability of end-markers to the YAC, which are usually readily available in the form of YAC vector specific sequences contained in the YAC construct. The practical computational problem is then reduced to choosing among the regions within a known restriction map; this has, up to now, been solved manually by investigators.

YAC DNA is partially digested by a restriction enzyme. The fragments are separated by PFGE, transferred to membranes and hybridised with a DNA probe for one end (e.g. the left end) of the YAC. The sizes of the observed fragments are estimated by direct comparison with a size standard. For each enzyme site, a fragment is observed whose length (L) corresponds to the length from the left end of the YAC to the position of the restriction site. The same procedure is usually repeated using a right hand end probe. The location of the site determined using the right end probe is placed on a similar scale by simply calculating R as the difference between the total YAC length and the observed fragment size. For each of the r restriction sites there are then two individually generated estimates (L and R) of the location of the site within the YAC. PFGE experimental conditions are usually chosen such that distance travelled on the gel is approximately linearly proportional to fragment length. Thus, it is assumed that large and small fragments have a similar error, unlike analysis of standard polyacrylamide gel electrophoresis, where log-transformation of fragment size is typically performed to allow for the dependence of error on fragment size. Assuming that error in measuring fragment size follows a normal distribution, the variance of fragment size determination may be estimated as

$$\sigma^2 = \frac{\sum (L - R)^2}{2r} \qquad (1)$$

The restriction map is established from the mean locations based on the left and right marker information.
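As an illustration of formula 1 and of the averaging step, the following is a minimal Python sketch. The function names and the fragment sizes are hypothetical and are not taken from the authors' program; `None` marks a site that could not be measured with one of the end probes.

```python
import math

def right_to_left_scale(right_fragments, yac_length):
    """Convert right-end fragment sizes to positions measured from the left end."""
    return [None if f is None else yac_length - f for f in right_fragments]

def estimate_sigma(left, right):
    """Formula 1: sigma^2 = sum((L - R)^2) / (2r), over sites seen with both probes."""
    pairs = [(l, r) for l, r in zip(left, right) if l is not None and r is not None]
    if not pairs:
        raise ValueError("no site was measured with both end probes")
    ss = sum((l - r) ** 2 for l, r in pairs)
    return math.sqrt(ss / (2 * len(pairs)))

def mean_map(left, right):
    """Site location: mean of L and R, or the single estimate when one is missing."""
    out = []
    for l, r in zip(left, right):
        if l is not None and r is not None:
            out.append((l + r) / 2)
        else:
            out.append(l if l is not None else r)
    return out

# Hypothetical sizes (kb) for a 360 kb YAC; the third site is missing on the left.
left = [25, 69, None, 154]
right_raw = [330, 287, 250, 203]
right = right_to_left_scale(right_raw, 360)
sigma = estimate_sigma(left, right)
sites = mean_map(left, right)
```

Only the three duplicated sites contribute to `sigma`, mirroring the use of duplicate data described for the mouse YAC later in the text.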

DNA specific for a gene whose location within the established restriction map is unknown is probed against the partial digestion, and hybridises only to the fragments containing that gene. A set of n observed fragments are thus identified by experimentation: it is not necessary, and in some cases may be unlikely, that all fragments are observed.

For each interval within the established map, there is a set of expected fragments which would be associated with a probe which hybridised to that interval: essentially, these are the possible fragments between all possible sites which overlap this interval. A full likelihood comparison between each interval is not desirable for two reasons: first, the number of expected fragments differs between intervals, so that the likelihood is conditioned on a different set of data in each case. Second, if one expected fragment encompassing a particular map region matches the observed fragment well, it is not biologically meaningful to consider the matches of the other expected fragments, since only one is expected to match. For these reasons, an approximate likelihood is calculated by only comparing the observed fragments with the closest expected fragments in each interval. If experimental conditions were such that all expected fragments were observed, it would be of use to take into account the number of observed fragments, which would be greater for central regions. However, as the number of observed fragments is dominated by experimental conditions rather than by the expectations, this information is ignored (leaving aside the question of how this information might best be utilised). The objective is then to determine which set of predicted fragments is closest to the set of experimentally observed fragments, and how well they match each other. The variance of a single observed fragment is taken to be $\sigma^2$. Since the location of a restriction site is typically derived as the mean of two measurements from left and right end probes, its variance is $\sigma^2/2$. Each predicted fragment within a map is calculated as the difference of two restriction sites, and has the variance $\sigma^2$. For each interval, the closest predicted fragment to the observed fragment is taken, and the difference in their sizes ($d_i$) is calculated, which has a variance of $2\sigma^2$. The approximate likelihood of this interval is

$$L = \prod_{i=1}^{n} \exp\left( -\frac{d_i^2}{4\sigma^2} \right)$$
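The construction of expected fragments and the closest-match likelihood can be sketched in Python as follows. This is an illustrative reading of the description above, not the authors' program: the function names are hypothetical, the map is assumed to be a sorted list of site positions that includes both YAC ends, and the example values are the Xho I map (means of the left and right rows) and CRP fragments of Table 1 with the reported standard error of 2.5 kb.

```python
import math

def expected_fragments(sites, k):
    """Lengths of all partial-digestion fragments spanning interval k.

    `sites` is the sorted map including both YAC ends; interval k lies
    between sites[k] and sites[k + 1].
    """
    return [sites[j] - sites[i]
            for i in range(k + 1)
            for j in range(k + 1, len(sites))]

def closest_differences(observed, expected):
    """d_i: distance from each observed fragment to its nearest expected fragment."""
    return [min(abs(o - e) for e in expected) for o in observed]

def approx_likelihood(observed, sites, k, sigma):
    """Product over fragments of exp(-d_i^2 / (4 sigma^2))."""
    d = closest_differences(observed, expected_fragments(sites, k))
    return math.exp(-sum(x * x for x in d) / (4 * sigma ** 2))

# Xho I map and CRP fragments from Table 1 (kb).
xho_sites = [0, 27.5, 71, 155.5, 180, 216.5, 241, 264.5, 283, 360]
crp_xho = [73, 97, 121, 360]

d_last = closest_differences(crp_xho, expected_fragments(xho_sites, 8))
lik_last = approx_likelihood(crp_xho, xho_sites, 8, sigma=2.5)
```

For the right-most interval the nearest expected fragments are 77, 95.5, 119 and 360 kb, which agree with the values given in parentheses in Table 1 after rounding.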

The log10 likelihood ratio comparing region a to the region with the highest likelihood (b) is:

$$\mathrm{LOGLR} = 0.25\,(\log_{10} e)\,\frac{\sum_{i=1}^{n} d_{ib}^2 - \sum_{i=1}^{n} d_{ia}^2}{\sigma^2} \qquad (2)$$

The choice of log10 scale facilitates easy understanding of the likelihood differences (e.g. a difference of 2 corresponds to 100:1 odds), and follows the convention established for human linkage maps ( 10 ). Where more than one enzyme is used, LOGLR is summed across various enzymes for each map interval. The output for analysis with one enzyme constitutes a LOGLR value of 0 for the best fitting region, and negative values for the other regions.
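A minimal Python sketch of formula 2 and of the summation across enzymes is given below. It is an assumption-laden illustration rather than the authors' implementation: the function names are hypothetical, each map is a sorted list of sites including both YAC ends, and the combined score is evaluated at a map position (strictly inside the YAC) by looking up the interval of each enzyme that contains it. The data are the Xho I and Sfi I maps and CRP fragments of Table 1.

```python
import bisect
import math

def sum_sq(observed, sites, k):
    """Sum of squared distances to the nearest expected fragment for interval k."""
    expected = [sites[j] - sites[i]
                for i in range(k + 1)
                for j in range(k + 1, len(sites))]
    return sum(min(abs(o - e) for e in expected) ** 2 for o in observed)

def loglr_by_interval(observed, sites, sigma):
    """Formula 2 for every interval: 0 for the best interval, negative elsewhere."""
    ss = [sum_sq(observed, sites, k) for k in range(len(sites) - 1)]
    best = min(ss)
    return [0.25 * math.log10(math.e) * (best - s) / sigma ** 2 for s in ss]

def summed_loglr(position, datasets, sigma):
    """Sum LOGLR over enzymes at one position with 0 < position < YAC length."""
    total = 0.0
    for observed, sites in datasets:
        k = bisect.bisect_right(sites, position) - 1  # interval containing position
        total += loglr_by_interval(observed, sites, sigma)[k]
    return total

xho_sites = [0, 27.5, 71, 155.5, 180, 216.5, 241, 264.5, 283, 360]
sfi_sites = [0, 47, 65, 326, 360]
crp = [([73, 97, 121, 360], xho_sites), ([260, 295, 329, 360], sfi_sites)]

xho = loglr_by_interval(crp[0][0], xho_sites, 2.5)
sfi = loglr_by_interval(crp[1][0], sfi_sites, 2.5)
combined_300 = summed_loglr(300, crp, 2.5)  # inside both best intervals
combined_340 = summed_loglr(340, crp, 2.5)  # right of the last Sfi I site
```

With these inputs the Xho I data favour the right-most interval (283-360 kb) and the Sfi I data favour 65-326 kb, so the summed score is highest where the two overlap.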

The confidence interval (CI) of gene location may be approximated as the map regions whose log 10 likelihood is within 1.3 of that of the best fitting region, corresponding to 20:1 odds against the gene being found outside this interval. This is a heuristic cut-off established during the analysis of data (see below). Frequently, the support interval may include a number of non-continuous regions within the map, but for clarity a single interval encompassing such intervals is reported. The confidence interval is sensitive to the estimation of measurement error: if the error is over-estimated, the interval will be too large, if it is under-estimated, it will be more narrow than the experimental data can support. For this reason, it is important to consider the goodness-of-fit.
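The 1.3 cut-off and the reporting of a single encompassing interval can be sketched as below. The function name and the LOGLR values are hypothetical and serve only to show the rule.

```python
def support_interval(loglr, sites, cutoff=1.3):
    """Single interval spanning every region within `cutoff` of the best LOGLR.

    `sites` includes both YAC ends, so region k lies between
    sites[k] and sites[k + 1].
    """
    best = max(loglr)
    keep = [k for k, v in enumerate(loglr) if best - v <= cutoff]
    return sites[keep[0]], sites[keep[-1] + 1]

sites = [0, 100, 250, 400, 600]  # hypothetical map (kb)

# Two adjacent regions are supported.
contiguous = support_interval([-3.0, -0.4, 0.0, -2.1], sites)

# Supported regions are separated by an unsupported one; one span is still reported.
split = support_interval([-1.0, -5.0, 0.0, -4.0], sites)

# Odds corresponding to a log10 likelihood difference of 1.3.
odds = 10 ** 1.3
```

Reporting one span over non-contiguous regions is conservative: it can include regions whose own LOGLR falls below the cut-off.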

The goodness-of-fit of the observed data to the predicted interval is calculated as

$$\chi^2_n = \sum_{i=1}^{n} \frac{d_i^2}{2\sigma^2} \qquad (3)$$

which may be interpreted as a chi-square with n degrees of freedom. If the fit of the best region is significantly poor, then the measurement error may be under specified, or there may be specific errors in the data.
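As a worked instance of formula 3, the sketch below uses the Xho I CRP fragments of Table 1 and the expected sizes given there in parentheses, with the reported standard error of 2.5 kb. The function name is hypothetical; the 5% critical value for 4 degrees of freedom (9.488) is the standard tabulated value.

```python
def goodness_of_fit(differences, sigma):
    """Formula 3: chi-square statistic with len(differences) degrees of freedom."""
    return sum(d * d for d in differences) / (2 * sigma ** 2)

observed = [73, 97, 121, 360]   # Xho I, CRP probe (Table 1)
expected = [77, 96, 119, 360]   # nearest expected fragments (Table 1, in parentheses)
d = [o - e for o, e in zip(observed, expected)]

chi2 = goodness_of_fit(d, sigma=2.5)
CRITICAL_5_PERCENT_4DF = 9.488
poor_fit = chi2 > CRITICAL_5_PERCENT_4DF
```

The statistic is well below the critical value, which is in line with the absence of significant deviation reported for these data.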

Where more than one enzyme is used, LOGLR is summed across various enzymes for each map interval, and the confidence interval for the location of a marker is again estimated as the map regions whose summed LOGLR is within 1.3 of the best fitting region. A test of heterogeneity among enzymes may be carried out to detect whether they differ significantly in their favoured location of the marker, as

$$\chi^2_{e-1} = \sum_{j=1}^{e} \sum_{i=1}^{n} \left( \frac{d_{ijy}^2}{2\sigma^2} - \frac{d_{ijx}^2}{2\sigma^2} \right) \qquad (4)$$

where e is the number of enzymes, x is the location favoured by the particular enzyme j , y is the location favoured by the combined enzymes and n is the number of observed fragments for the marker for enzyme j . There are e - 1 degrees of freedom. Again, significant heterogeneity may indicate that the measurement error is underestimated, or that there are specific errors or omissions in the data. When there is a poor fit, the possible source of the error can be investigated by seeing which combinations of markers and enzymes make the largest contribution to the chi-square.
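Formula 4 reduces to a difference of sums of squares per enzyme, which the following sketch makes explicit. The function name and the numbers are hypothetical: for each enzyme, one value is the sum of squared differences at the location favoured by all enzymes combined (y) and the other is the sum at the location favoured by that enzyme alone (x).

```python
def heterogeneity_chi2(ss_at_combined, ss_at_own_best, sigma):
    """Formula 4: returns (chi-square, degrees of freedom = e - 1)."""
    if len(ss_at_combined) != len(ss_at_own_best):
        raise ValueError("one pair of sums of squares is needed per enzyme")
    excess = sum(y - x for y, x in zip(ss_at_combined, ss_at_own_best))
    return excess / (2 * sigma ** 2), len(ss_at_combined) - 1

# Hypothetical sums of squared differences (kb^2) for three enzymes.
ss_combined = [30.0, 18.0, 41.0]
ss_own = [22.0, 18.0, 35.0]

chi2, df = heterogeneity_chi2(ss_combined, ss_own, sigma=8.0)
per_enzyme = [(y - x) / (2 * 8.0 ** 2) for y, x in zip(ss_combined, ss_own)]
```

Listing `per_enzyme` is one way to see which enzyme contributes most to the chi-square when the fit is poor.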

If the number of observed fragments is large, then bootstrap analysis ( 11 ) can provide more robust estimates of the confidence interval, as the difference between observed and predicted fragments may be considered to be approximately independent (bootstrapping establishes the confidence interval by analysing many re-sampled subsets of the data). However, when there are only a few fragments which are critically informative, then the bootstrap results in excessively large confidence intervals. For this reason, we chose to calculate the CI using the likelihood approach, when interpreted in conjunction with the information on the goodness-of-fit of the data.

As the number of predicted fragments is greater in the centre of a restriction map than at the two ends, the likelihood of fragments matching by chance is greater in the centre. Assuming that expected fragments have a uniform size distribution (in fact there is an excess of intermediate sized fragments expected for a region in the centre of the map), the probability that one or more expected fragments in a region is by chance closer to the observed fragment than the expected value of the error of the observed fragment ($\sigma$) is

$$1 - \left( 1 - \frac{2\sigma}{L} \right)^{p} \qquad (5)$$

where L is the total YAC length and p is the number of predicted fragments. When there are twenty sites, there are 19 expected fragments for an end interval, and 100 for the central interval. If the standard error is 1% of YAC length, according to formula 5 the central region is, by chance, 2.7 times more likely to match an observed fragment than the end interval.
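The arithmetic behind the 2.7-fold figure can be checked with a few lines of Python. The function name is hypothetical; the inputs (a standard error of 1% of YAC length, 19 and 100 predicted fragments) are those quoted above.

```python
def chance_match_probability(sigma_over_length, p):
    """Formula 5: probability that at least one of p fragments matches by chance."""
    return 1 - (1 - 2 * sigma_over_length) ** p

end = chance_match_probability(0.01, 19)       # end interval
centre = chance_match_probability(0.01, 100)   # central interval
ratio = centre / end
```

Here `end` is about 0.32 and `centre` about 0.87, giving a ratio of about 2.7.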

While the confidence interval around each marker location is probably sufficient information for most purposes, it is possible to infer the significance of the relative ordering of two genes mapped to the same YAC which lie within different intervals. The probability that they both lie between the same two restriction sites may be calculated by finding the region with the minimum summed log-likelihood: the magnitude of this log-likelihood represents the evidence in favour of the gene order suggested by their most likely positions.

The statistical approach should allow certain data which would be too difficult to analyse manually to be used in producing reliable maps. However, there is a limit to the number of restriction sites that may cut in a particular experiment, as the number of predicted fragments in the centre of the map is proportional to the square of the number of sites, so that even when measurement error is minimised, the chance matching of predicted fragments will increasingly obscure true matches. Simulations were performed to determine the density of restriction maps that are most likely to be informative in defining marker location. The number of restriction sites ranged between 2 and 30 internal sites, and the experimental error between 0.1 and 3% of total YAC length. A map of 1000 kb was simulated 500 times for each condition, and a separate set of simulations was carried out for each internal interval within the map. Restriction sites were randomly selected, and 10 observed fragments were randomly drawn from each interval (with replacement, so that there were in some cases fewer observed fragments, depending on the number of predicted fragments). Error was then drawn from a normal distribution, and added to the observed fragments, and also to the restriction sites.
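One replicate of such a simulation might be sketched as follows. Several details are assumptions made for illustration and are not stated in the text: sites are drawn uniformly, the YAC ends are treated as known without error, site error has standard deviation sigma divided by the square root of 2 (the mean of two measurements), noisy sites are clamped to the YAC and re-sorted, and all function names are hypothetical.

```python
import math
import random

def expected(sites, k):
    """Fragments spanning interval k of a sorted map that includes both ends."""
    return [sites[j] - sites[i] for i in range(k + 1) for j in range(k + 1, len(sites))]

def loglr(observed, sites, sigma):
    """Formula 2 for every interval of the map."""
    ss = []
    for k in range(len(sites) - 1):
        exp_k = expected(sites, k)
        ss.append(sum(min(abs(o - e) for e in exp_k) ** 2 for o in observed))
    best = min(ss)
    return [0.25 * math.log10(math.e) * (best - s) / sigma ** 2 for s in ss]

def simulate_once(n_internal, sigma, true_k, rng, length=1000.0, n_draws=10):
    """Return (support interval width as a fraction of YAC length, true interval covered)."""
    inner = sorted(rng.uniform(0.0, length) for _ in range(n_internal))
    true_sites = [0.0] + inner + [length]
    # Draw with replacement, so fewer than n_draws distinct fragments may remain.
    drawn = sorted({rng.choice(expected(true_sites, true_k)) for _ in range(n_draws)})
    observed = [f + rng.gauss(0.0, sigma) for f in drawn]
    # Error on the map itself (assumed sd sigma / sqrt(2)), clamped and re-sorted.
    noisy_inner = sorted(
        min(max(s + rng.gauss(0.0, sigma / math.sqrt(2)), 0.0), length) for s in inner
    )
    noisy = [0.0] + noisy_inner + [length]
    scores = loglr(observed, noisy, sigma)
    keep = [k for k, v in enumerate(scores) if v >= -1.3]
    width = (noisy[keep[-1] + 1] - noisy[keep[0]]) / length
    return width, keep[0] <= true_k <= keep[-1]

rng = random.Random(1)
runs = [simulate_once(5, 10.0, 2, rng) for _ in range(200)]  # 5 sites, 1% error
mean_width = sum(w for w, _ in runs) / len(runs)
coverage = sum(c for _, c in runs) / len(runs)
```

Looping `true_k` over every interval and varying `n_internal` and `sigma` would give the kind of grid summarised in Figure 1, although this sketch is not expected to reproduce the published values exactly.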

The experimental data analysed comprised previously published data mapping the CRP , H4F2 and IFI-16 genes within the human YAC 28A,B5 ( 5 ), as well as newly generated data using frequent cutting enzymes which could only be effectively analysed with computer assistance. This data was generated for the mouse YAC KB8 , which was isolated from the combined ICRF and St Mary's YAC libraries ( 12 ), and determined by polymerase chain reaction (PCR) to contain all five members of the mouse Saa gene cluster ( 12 , 13 ). Probes specific for the mouse Saa1 , Saa2 , Saa3 , Saa4 and Saa5 genes were generated by PCR across sequence-specific regions of these genes ( 14 , 15 ), and partial restriction mapping was carried out following the previously described experimental approach ( 5 , 16 ).

Statistics were calculated with the assistance of a computer program, which is available free of charge for non-profit use (see world-wide-web site http://biotech.bio.tcd.ie/partial.html).


Figure 1 . Mean 95% confidence interval over all potential map intervals (taking the mean of 500 simulations per interval), with varying numbers of internal restriction sites and experimental error (see text). A lower support interval indicates a higher resolution.

RESULTS AND DISCUSSION

Simulations were performed to assess the performance of the statistical model, and its sensitivity to the numbers of restriction sites and the measurement error. The mean 95% confidence interval (CI) over all intervals and over all 500 simulations for each interval is presented for each point in Figure 1 . A lower CI indicates a higher resolution in defining the location of the marker. While an increased number of sites reduces the CI at low error, at typical experimental error (1% of YAC length) the optimum number of internal sites is around four or five. If that experimental error is halved (to 0.5%), four sites are still approximately as efficient as 10. Thus, optimum results are obtained with rare cutters in the presence of realistic experimental error, where a single enzyme will define the marker's location within one fifth of YAC length. In spite of this, useful information is still provided by more frequently cutting enzymes. Even 20 internal sites will, on average, limit the CI to three-quarters of the YAC length (assuming 1% error). This information, in combination with results from other enzymes, may be of some value in defining gene location.

The value of the 95% CI as a measure of location support was also assessed by these simulations. The number of times a marker was located outside its CI increased as the number of sites and the measurement error increased: these erroneously placed markers were mainly those whose correct location was at the ends of the YAC. However, the number of incorrect assignments was still relatively few: when the marker was simulated at the end of the YAC with 1% measurement error, the program correctly recovered the marker within the CI 88% of the time when there were 10 internal restriction sites, and 78% when there were 20 sites. However, only 6% of markers fell outside the CI when they were simulated from the intervals next to those at the end of the map, in a map with 20 internal sites. On average, when the true location was simulated as being derived from each of the 21 intervals in turn, only 3% of simulated markers were placed outside the CI. Thus, the 95% criterion for the confidence interval is generally well justified by these simulations.

A straightforward example of the experimental application of this analytical technique is illustrated with previously published data ( 5 ). The positions of the restriction sites for the enzymes Xho I and Sfi I (Table 1 ; Fig. 2 ) were inferred by probing with left and right end probes to the human YAC 28A,B5 which was known to contain the genes for CRP , H3F2 and IFI-16 . The standard error of fragment size was estimated according to formula 1 as 2.5 kb, which is ~1% of the YAC length (360 kb). The fragments for the CRP gene (Table 1 ) are illustrated in Figure 2 in alignment with the nearest predicted fragments for the best fitting interval. The best fitting interval and the associated 95% support interval for CRP are illustrated along with those for the H3F2 and IFI-16 markers. The fragment sizes were consistent with the predicted intervals by the goodness-of-fit test, which found no significant deviation between the observed fragment sizes and those predicted for the supported interval, indicating that the measurement error is not underestimated, and that there are unlikely to be erroneous fragment sizes in the dataset (Fig. 2 ). The generation of these results was rapid and did not require any special manipulation of the data, and the results are in good agreement with the manual interpretation ( 5 ). The reporting of confidence intervals is especially useful, as it communicates clearly to what extent the results may be relied upon. Mapping databases such as the Genome database ( 17 ) can include confidence intervals in assignments of gene location.


Figure 2 . Assignment of location and 95% confidence intervals for genes within the human YAC containing the CRP gene. ( a ) Xho I digested fragments identified by the CRP probe, aligned beside predicted fragments for the most likely interval. The fragment length is given along with the predicted fragment length in parentheses. ( b ) Sfi I digested fragments identified by the CRP probe. ( c ) Restriction map derived from data generated by probing with end markers (Table 1). ( d ) Most likely interval (continuous line) with confidence interval (dashed line).

Table 1 . Mapping data for the human YAC 28A,B5 containing the CRP gene
Enzyme | Probe  | Observed fragment sizes (kb)
Xho I  | left   |  (-)   25   69  154  178  215  239  263  287  360
Xho I  | right  |    0   30   73  157  182  218  243  266  279  (-)
Xho I  | CRP    |   73   97  121  360
       |        |  (77,  96, 119, 360)
Xho I  | IFI-16 |   85  360
Xho I  | H4F2   |   30   73  157  182  218  243  266  279  360
Sfi I  | left   |  (-)   49   65  329  360
Sfi I  | right  |    0   45   65  323  (-)
Sfi I  | CRP    |  260  295  329  360
       |        | (261, 295, 326, 360)
Sfi I  | IFI-16 |  260  295  329  360
Sfi I  | H4F2   |   49   65  329  360

Fragment lengths observed from probing with the right-end marker have been subtracted from the YAC length (360 kb) to make them comparable with the left-end fragment lengths. For illustration, the expected fragment lengths nearest to the observed for the best fitting region (Fig. 2) are given in parentheses for CRP .

A more complex example is illustrated by the mapping of mouse Saa genes within a 600 kb YAC using more frequent cutting enzymes. Rare cutting enzymes failed to cut this YAC, and the three enzymes for which data was generated cut the YAC between 15 and 19 times. The number of predicted partial products is far too great for manual interpretation, and so the analysis is entirely dependent on the computer. The end-labelled fragments used in constructing the map (Table 2 ) were not in every case clearly measurable for both the left and right markers, due to excessive signal from certain nearby lanes. The standard error of measurement was estimated as 8.0 kb according to formula 1 , using only those fragments for which duplicate data were available. This is roughly equivalent to 1% of the total YAC length (600 kb). The results are presented in Table 3 . It can be seen that there is no heterogeneity among enzymes when the error specified from duplicate data is used. Similarly, the goodness-of-fit was not significantly poor for any marker, considering enzymes together or separately (i.e. the differences between observed fragment sizes and the predicted fragments in the best locations were no greater than that expected by chance). This suggests that the assumptions of the model, such as the approximation that error is constant for all sizes of fragment, are valid. In addition, it indicates that there are no gross errors in the experimental data. Given that the data used to assign restriction sites could not be completely duplicated for both the left and right markers (Table 2 ), we cannot be absolutely sure that the restriction maps include every actual site, but the goodness-of-fit indicates that it is unlikely that there are any additional sites which were missed in both the left and right arm marker experiments, which subsequently flanked fragments detected in the individual hybridisations. It is likely that different sites for the same enzyme have varying degrees of efficiency of cutting, and it is possible that some sites are cut too rarely to contribute visible fragments to the observed data.

Confidence intervals for the mouse Saa genes indicate that the genes are all broadly assigned to an overlapping region, within which their orders cannot be clearly defined (Table 3 ). The most likely locations indicate that the genes are probably clustered within an 80 kb region, and that Saa1 and Saa2 are more likely to group at the left hand side of this cluster relative to Saa3 , Saa4 and Saa5 . Modest improvements in defining gene order and location may be achieved by adding further information from digestions with other enzymes. However, the strong suggestion that the genes are tightly clustered motivated the use of an alternative method to define gene order. Long range PCR ( 18 ) confirmed that the genes lie within a 45 kb interval, with Saa1 and Saa2 at the left hand end of the YAC (the order is Saa2-Saa1-Saa4-Saa5-Saa3 ).

Table 2 . Mapping data for the mouse YAC KB8 containing the Saa gene cluster
Enzyme | Probe | Observed fragment sizes (kb)
Sfi I  | left  |  (-)    -   36   61  106  140  154  183  214  234  267  340  370  387  426  475  510  560  579  600
Sfi I  | right |    0   14   29   67   97  137  160  187  217    -  260  335  357  382  413  472  509  560  574  (-)
Sfi I  | Saa1  |   20   45   63  156  200  226  235  256  277  313  347  378  445  600
Sfi I  | Saa2  |   38   63  117  156  195  218  250  600
Sfi I  | Saa3  |   45   73   92  118  137  165  171  195  228  378  445  600
Sfi I  | Saa4  |   30   66   97  128  158  198  214  243  275  305  385  428  600
Sfi I  | Saa5  |   29   61  122  158  194  210  227  243  370  428  510  600
Cla I  | left  |  (-)   39   64  133  183  218    -  312  343  375  400  425  497  530    -  600
Cla I  | right |    0    -   85  142  185  217  265  300  333  375  400  425  491  527  564  (-)
Cla I  | Saa1  |   70   90  105  145  181  261  341  370  400  445  508  521  550  600
Cla I  | Saa2  |   51   73  103  133  179  218  265  288  322  330  360  385  391  409  520  600
Cla I  | Saa3  |   64   80  103  170  215  225  265  318  355  370  408  600
Cla I  | Saa4  |   60  125  179  231  286  333  365  396  600
Cla I  | Saa5  |   36   70   97  121  129  179  194  213  225  242  268  295  308  341  358  396  421  456  518  600
Eco RI | left  |  (-)   25   46   73   97  125  146  189  218  236  267  325    -  413  465  495  538    -    -  600
Eco RI | right |    0    -    -    -   85  122  154  197    -  230  270  313  357  406  479  503  531  556  565  (-)
Eco RI | Saa1  |   23   33   89  117  151  600
Eco RI | Saa2  |   20   33  600
Eco RI | Saa3  |   20   41   69   94  141  600
Eco RI | Saa4  |   24   82  138  159  206  238  265  300  326  370  484  568  600
Eco RI | Saa5  |   41   70  134  165  213  231  253  259  305  325  355  413  466  491  530  600

Fragment lengths observed from probing with the right-end marker have been subtracted from the YAC length (600 kb) to make them comparable with the left-end fragment lengths. - Indicates a fragment which is likely to be present, but which was not observable.

Table 3 . PFGE mapping results for the mouse YAC KB8 , assuming a standard error of fragment size of 8.0 kb (estimated from comparison of left and right marker data)
Marker | Likely interval (kb) | Confidence interval (kb) | Heterogeneity chi-square | d.f. | Significance
Saa1   | 139-150              | 39-357                   | 2.1                      | 2    | n.s.
Saa2   | 138-139              | 39-564                   | 0.5                      | 2    | n.s.
Saa3   | 217-218              | 39-560                   | 1.4                      | 2    | n.s.
Saa4   | 217-218              | 46-499                   | 1.9                      | 2    | n.s.
Saa5   | 193-216              | 150-400                  | 1.8                      | 2    | n.s.

Likely interval: region with the highest LOGLR, summed over three enzymes. 95% confidence interval: regions excluded to the left and right of this interval had a lower likelihood, with a LOGLR difference of >= 1.3 compared with that of the best interval. Chi-square: the heterogeneity chi-square given in formula 4 .

The 19 restriction sites in the Saa data probably represent the practical limit to the number of sites that are likely to yield useful information given 1% experimental error, as indicated by the broad range of the confidence intervals for the experimental data and for simulated data with similar numbers of sites. Using fewer sites has the clear advantage that the computer derived results may be, in part, checked by hand. However, it is not always easy to predict in advance the frequency of enzyme cutting prior to carrying out the experiments, and even when there are only a few enzyme sites, computer analysis should be quicker and more reliable than manual analysis in combining the results of restriction digestions with different enzymes.

ACKNOWLEDGEMENTS

This work was supported by grants from the Wellcome Trust 039618 and 034345. A.B. was supported by a project grant from the Health Research Board of Ireland; M.T.W. was supported by a Department of Education (Northern Ireland) research studentship and a FORBAIRT (Ireland) studentship. We thank an anonymous referee for suggestions which improved the method.

REFERENCES

1 Burke,D.T., Carle,G.F. and Olson,M.V. (1987) Science, 236, 806-812.

2 Rosenberg,C., Florijn,R.J., Van de Rijke,F.M., Blonden,L.A., Raap,T.K., Van Ommen,G.J. and Den Dunnen,J.T. (1995) Nature Genet., 10, 477-479.

3 Nicklin,M.J., Weith,A. and Duff,G.W. (1994) Genomics, 19, 382-384.

4 Sellar,G.C., Oghene,K., Boyle,S., Bickmore,W.A. and Whitehead,A.S. (1994) Genomics, 23, 492-495.

5 Walsh,M.-T., Divane,A. and Whitehead,A.S. (1996) Immunogenetics, 44, 62-69.

6 Smith,H.O. and Birnstiel,M.L. (1976) Nucleic Acids Res., 3, 2387-2398.

7 Skiena,S.S. and Sundaram,G. (1994) Bull. Math. Biol., 56, 275-294.

8 Karp,R.M. and Newberg,L.A. (1995) Comput. Applic. BioSci., 11, 229-235.

9 Newberg,L.A. and Naor,D. (1993) Adv. Appl. Math., 14, 172-183.

10 Morton,N.E. (1955) Am. J. Hum. Genet., 8, 80-96.

11 Efron,B. (1979) Ann. Stat., 7, 1-10.

12 Brown,S.D.M. (1992) Genomics, 13, 490-492.

13 Butler,A. and Whitehead,A.S. (1995) Immunogenetics, 42, 153-155.

14 Lowell,C.A., Stearman,R.S. and Morrow,F. (1986) J. Biol. Chem., 261, 8442-8452.

15 de Beer,M.C., Kindy,M.S., Lane,W.S. and de Beer,F.C. (1994) J. Biol. Chem., 269, 4661-4667.

16 Mendez,M.J., Scott,P., Erickson,P.B., Drabkin,H.A. and Gemmill,R.M. (1994) In Nelson,D. and Brownstein,B. (eds), YAC Libraries: A User's Guide. W.H.Freeman, New York, pp. 57-91.

17 Fasman,K.H., Letovsky,S.I., Cottingham,R.W. and Kingsbury,D.T. (1996) Nucleic Acids Res., 24, 57-63.

18 Butler,A. and Whitehead,A.S. (1996) Immunogenetics, in press.



*To whom correspondence should be addressed. Tel: +353 1 608 2390; Fax: +353 1 679 8558; Email: dshields@biotech.bio.tcd.ie