Nucleic Acids Research, 2002, Vol. 30, No. 2 e6
© 2002 Oxford University Press
A new approach to genome mapping and sequencing: slalom libraries
1Microbiology and Tumor Biology Center and 2Center for Genomics Research, Karolinska Institute, 171 77 Stockholm, Sweden, 3Swedish Institute for Infectious Disease Control, 171 82 Solna, Sweden and 4Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 117 984 Moscow, Russia
Received September 11, 2001; Revised and Accepted November 9, 2001.
ABSTRACT
We describe here an efficient strategy for simultaneous genome mapping and sequencing. The approach is based on physically oriented, overlapping restriction fragment libraries called slalom libraries. Slalom libraries combine features of general genomic, jumping and linking libraries. Slalom libraries can be adapted to different applications and two main types of slalom libraries are described in detail. This approach was used to map and sequence (with ~46% coverage) two human P1-derived artificial chromosome (PAC) clones, each of ~100 kb. This model experiment demonstrates the feasibility of the approach and shows that the efficiency (cost-effectiveness and speed) of existing mapping/sequencing methods could be improved at least 5–10-fold. Furthermore, since the efficiency of contig assembly in the slalom approach is virtually independent of length of sequence reads, even short sequences produced by rapid, high throughput sequencing techniques would suffice to complete a physical map and a sequence scan of a small genome.
INTRODUCTION
In the past two years, impressive progress has been made in mapping and sequencing whole genomes of various organisms (1–5). A draft sequence of the human genome has recently been published (6,7). Two basic strategies have so far been employed for genome sequencing.
According to one scheme, the whole genome is mapped using different types of markers and a minimal tiling set of large insert clones, such as cosmids, P1 (PAC) or bacterial (BAC) artificial chromosome clones, is established. Subsequently, these large insert clones are sequenced using a shotgun sequencing strategy: small insert libraries, containing randomly sheared fragments of the large insert clones, are constructed and clones are sequenced from the ends.
The second approach, the whole genome shotgun sequencing strategy (WGS), was recently developed by Venter and colleagues (3), and has proved valuable. This method involves end sequencing of large (PACs, BACs or plasmids with a 50 kb insert) and small insert (2 and 10 kb) clones. PAC and BAC clones covering the whole genome should be carefully mapped. DNA fragments in the small insert clones are generated by physical shearing of whole genomic DNA. Some of the most recent achievements using this strategy have been determination of the nucleotide sequence of nearly the entire 120 Mb euchromatic portion of the Drosophila melanogaster genome (3,4) and generation of the draft sequence of the human genome (7). The WGS method requires the generation of sequences covering the whole genome 10–15 times (4). If sequence coverage is lower, the contig assembly process cannot be completed and the assembled sequences will mainly represent disconnected, unordered islands. Therefore, despite impressive technological progress, mapping and sequencing of even small bacterial genomes is still expensive and laborious.
After completion of the genomic sequence from one organism there will be a great demand, in many cases, for comparisons with the genomes of other individuals, related species, etc. The growing field of comparative genomics is highly relevant for our understanding of human and animal health, epidemiology, evolution and ecology.
It should be possible to apply such comparative techniques rapidly and effectively to related bacterial strains and species, in order to identify the genomic basis for their pathogenic and biological differences, e.g. in the challenging task of studying the human intestinal flora or in identifying pathogenicity islands.
The strategies available for mapping and sequencing are not equal to the challenge of high throughput comparative genomics. Alternative and complementary strategies need to be developed and the imperative now is to find cost-effective and convenient methods that allow comparative genomics projects to be undertaken by a wide range of laboratories.
We have previously suggested an approach for large scale mapping of the human genome based on shotgun sequencing of NotI jumping/linking clones (Fig. 1). We have demonstrated the validity of this method and established that jumps over 1.5 Mb can be achieved (8–13). A significant advantage of this procedure is that it can be automated and used to construct accurate physical and genetic maps at 100–300 kb resolution. For such fine mapping other restriction enzymes, which contain a CpG pair in a shorter recognition site, such as XmaIII, XhoI or SalI, would obviously be more relevant.
The approach to sequence sampling that we propose in the present work will allow the establishment of a physical map with minimal sets of overlapping clones, which will pinpoint differences in genome organization between organisms. At the same time, a considerable sequence coverage of the genome (~50%) will be achieved. This will make it possible to locate virtually every gene in a genome, for more detailed study. The concept is based on alternative approaches to the construction of linking and jumping libraries (8,14–16) and involves the construction of slalom libraries.
The main purpose of this work is to demonstrate the feasibility and efficiency of this technology and to show that it can be exploited in a cost-efficient manner in combination with new high throughput sequencing techniques, such as pyrosequencing or massively parallel signature sequencing (MPSS) (17,18).
MATERIALS AND METHODS
General molecular biology methods
Common molecular biology and microbiology procedures were performed according to standard methods.
DNA from PAC clones was isolated using a Qiagen Large-Construct Kit. Plasmid isolation was done using Biorobot 9600 and R.E.A.L. Prep 96 Biorobot kits (Qiagen) according to the manufacturer's protocols. Prior to sequencing, the quality of DNA and insert size were evaluated by agarose gel electrophoresis.
Sequencing was performed using an ABI377 sequencer (Perkin Elmer, Foster City, CA) according to the manufacturer's instructions.
Construction of slalom libraries
A BamHI library (B slalom library) was constructed using the pBluescriptII KS(+) (Stratagene) vector digested with BamHI and dephosphorylated with calf intestinal phosphatase (CIP). PAC DNA (3 µg) was digested with 30 U BamHI for 1 h. Upon completion of digestion, the enzyme was heat-inactivated for 20 min at 65°C. Approximately 0.5 µg digested DNA was ligated overnight in the presence of the cloning vector (0.1 µg) at room temperature. The ligation mixture was transformed into XL2-Blue cells by electroporation and DNA from white colonies was isolated and sequenced, using reverse and universal primers.
An EcoRI library (R slalom library) was constructed in the same way, but the SL B vector was used for cloning. This vector was constructed from BluescriptII KS(+), in which the BamHI site was removed without destroying the open reading frame. Therefore, the colonies with non-recombinant SL B plasmids are blue and those with recombinant plasmids are white.
To construct the connecting library (RBR slalom library), plasmid DNA was isolated from ~2 × 10^4 pooled clones of the R slalom library and completely digested with BamHI (R-jumping DNA). The kanamycin resistance gene was obtained by PCR amplification from the pUC4K plasmid (Amersham Pharmacia Biotech). PCR primers used were: LinkB-for, 5'-GAA GGG ATC CGC TGA GGT CTG CC-3'; LinkB-rev, 5'-GAA GGG ATC CGG GGA AAG CCA CG-3'. PCR was performed in a 100 µl solution containing 67 mM Tris–HCl, pH 9.1, 16.6 mM (NH4)2SO4, 1.0 mM MgCl2, 0.1% Tween-20, 200 µM dNTPs, 5 ng pUC4K DNA, 400 nM each primer and 5 U Taq DNA polymerase. The PCR cycling conditions were 95°C for 2.5 min, followed by 15 cycles of 95°C for 0.5 min, 55°C for 1 min and 72°C for 1.5 min, with a final extension at 72°C for 3 min. PCR amplified DNA was concentrated with ethanol, digested with BamHI and treated with CIP. The sample was then purified using a JETquick PCR Purification Spin Kit (Genomed Inc.) and dissolved in 100 µl of H2O (Kan-B DNA).
Ligation of 0.5 µg R-jumping DNA with 0.5 µg Kan-B DNA was performed overnight as described above. The ligate was transformed into XL2-Blue cells by electroporation and kanamycin-resistant colonies were selected and sequenced, using kanamycin-specific reverse and universal primers.
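The BamHI digestion of the PCR product relies on the LinkB primers carrying a BamHI recognition site near their 5' ends. The short Python check below is an illustrative sketch, not part of the original protocol: it takes the two primer sequences exactly as printed above and locates the standard BamHI site (GGATCC) in each. The function name `site_offset` is hypothetical.

```python
# Standard BamHI recognition sequence (G^GATCC).
BAMHI_SITE = "GGATCC"

# Primer sequences as printed in the text; spaces are removed before searching.
PRIMERS = {
    "LinkB-for": "GAA GGG ATC CGC TGA GGT CTG CC",
    "LinkB-rev": "GAA GGG ATC CGG GGA AAG CCA CG",
}

def site_offset(primer, site=BAMHI_SITE):
    """Return the 0-based offset of `site` in the primer, or -1 if absent."""
    return primer.replace(" ", "").upper().find(site)

offsets = {name: site_offset(seq) for name, seq in PRIMERS.items()}
for name, off in offsets.items():
    print(f"{name}: BamHI site starts at offset {off}")
```

Both primers contain the site after a four-base 5' extension, so BamHI digestion of the amplified fragment leaves BamHI ends on both sides of the kanamycin resistance gene.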
Mapping and sequencing of PAC clones 36b12 and 55a10
PAC clone 36b12 was isolated from high density gridded filters with the human genomic PAC library RPCI1 (UK HGMP Resource Centre) using as a probe a cDNA (accession no. AA429319) containing part of the SEMA IV gene.
PAC clone 55a10 was isolated using PCR primers to the 5'- and 3'-ends of the SEMA IV gene and PCR pools from the human PAC library RPCI1 (UK HGMP Resource Centre).
PCR primers used were: SEMA3F, 5'-AGT AGG GAA GCC CAG AGA AGA A-3'; SEMA3R, 5'-GGG GCC TAT TGG TAC TAT CTC C-3'; SEMA5F, 5'-ATT AAA AGG GAC AAG GGC TAG G-3'; SEMA5R, 5'-AAC AAC TTT AAG CAC GTC GTC A-3'.
PCR was performed in a 40 µl solution containing 67 mM Tris–HCl, pH 9.1, 16.6 mM (NH4)2SO4, 2.0 mM MgCl2, 0.1% Tween-20, 200 µM dNTPs, 3 µl PCR pool, 400 nM each primer and 5 U Taq DNA polymerase. The PCR cycling conditions were 95°C for 1.5 min, followed by 25 cycles of 95°C for 1 min, 60°C for 1 min and 72°C for 0.5 min, with a final extension at 72°C for 3 min.
Sequencing gels were run in an ABI377 automated sequencer (Perkin Elmer), according to the manufacturer's protocols using standard primers. When sequencing from the marker (KanR) fragment, the following primers were used: linkseq-for, 5'-GCT CAT AAC ACC CCT TGT-3'; linkseq-rev, 5'-CAA CCG TGG CTC CCT CAC-3'.
The slalom clones were ordered and arranged based on sequence comparisons between the three libraries.
Sequence analysis
DNA homology searches were performed in a non-redundant (nr) database using the BLASTN (19) program on the NCBI server (http://www.ncbi.nlm.nih.gov:80/BLAST). Sequence assembly was done using DNASIS v.7.00 (Hitachi Pharmacia). In all cases default parameters were used. Repeat sequences would complicate sequence analysis and clones containing repeats at the ends would lead to multiple branches in the map. To evaluate the significance of this problem, we did not use RepeatMasker.
RESULTS AND DISCUSSION
The idea of the slalom library approach and type I slalom libraries
The principle of the slalom libraries is depicted in Figure 2. In our jumping-linking scheme (Fig. 1) we used NotI–BamHI fragments from two neighboring NotI sites as jumping clones and BamHI–NotI fragments surrounding the same NotI site as linking clones.
Let us assume that EcoRI and BamHI sites are alternating in a genome. Using the same principle that underlies NotI linking/jumping libraries, we construct EcoRI jumping and linking libraries. It is obvious that these jumping and linking libraries will be nearly identical to the two standard genomic EcoRI (R) and BamHI (B) digested libraries. For instance, a BamHI fragment from B1 to B2 will contain the EcoRI site R2 and will be equivalent to the corresponding R2 linking clone (Fig. 2, bottom). The same is true for the R library: the EcoRI fragment from R2 to R3 will be equivalent to the jumping clone from R2 to R3 (Fig. 2, top). Therefore, instead of making the more complicated jumping and linking libraries, we can simply replace them with standard R and B libraries. The only problem is how to put EcoRI and BamHI fragments in the correct order. If we sequence with standard primers, we will obtain sequences near EcoRI sites in the first case and sequences near BamHI sites in the second case (Fig. 2, R and B libraries). The ordering problem can be solved in different ways. One way is to create a real EcoRI linking library using the enzyme BamHI, instead of the simple BamHI digested library. The EcoRI linking library (BR library) is constructed in exactly the same way as the NotI linking library: genomic DNA is digested with BamHI, circularized, opened with EcoRI and cloned (10; Figs 2 and 3, left).
Sequences from the BR and R libraries are produced using standard reverse and forward sequencing primers and overlapping clones can be identified using a computer program (Fig. 2). The identified sequence matches will correlate the ends of EcoRI fragments in the BR library with the ends of EcoRI fragments in the R library, to yield an ordered set of BamHI–EcoRI clones or sequence tagged sites (STS) distributed along the genome. A set of minimally overlapping clones covering the whole genome can be created using this information.
Figure 2 shows the scheme when EcoRI and BamHI sites alternate. In reality, EcoRI and BamHI sites do not always alternate throughout genomes, and this will lead to gaps in this set of overlapping clones if several EcoRI or BamHI sites are located together. Therefore, the genome will not be completely covered with clones, because information between some EcoRI and BamHI sites will be lost. However, this strategy will result in a set of clones that will cover the whole genome. Only two libraries are employed in this variant of the slalom approach (type I).
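Where EcoRI and BamHI sites fail to alternate can be checked in silico when a reference sequence is available. The following Python sketch is an illustration only: the toy sequence and the function names `site_map` and `non_alternating` are hypothetical, and only the two recognition sequences (GAATTC for EcoRI, GGATCC for BamHI) are standard facts. It lists all sites in positional order and reports neighbouring sites cut by the same enzyme, i.e. the situation described above as the source of gaps.

```python
import re

# Standard recognition sequences of the two enzymes used in the text.
SITES = {"EcoRI": "GAATTC", "BamHI": "GGATCC"}

def site_map(seq):
    """Return a position-sorted list of (position, enzyme) for every site."""
    hits = []
    for enzyme, motif in SITES.items():
        for m in re.finditer(motif, seq.upper()):
            hits.append((m.start(), enzyme))
    return sorted(hits)

def non_alternating(hits):
    """Return pairs of neighbouring sites that belong to the same enzyme."""
    return [(a, b) for a, b in zip(hits, hits[1:]) if a[1] == b[1]]

# Hypothetical toy sequence with the site order EcoRI, BamHI, BamHI, EcoRI.
toy = ("AAAA" + "GAATTC" + "CCCC" + "GGATCC" + "TTTT"
       + "GGATCC" + "AAAA" + "GAATTC" + "CC")

hits = site_map(toy)
print(hits)
print(non_alternating(hits))
```

In this toy case the two adjacent BamHI sites are reported, marking a region that a two-library (type I) scheme would not be expected to recover.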
Type II slalom libraries
The problem with the gaps introduced in the type I slalom libraries scheme can be solved using the type II slalom libraries approach, in which three libraries are utilized (Figs 3 and 4): an EcoRI jumping (slalom RBR or connecting) library is constructed in addition to the R and B libraries.
To construct R and B slalom libraries, genomic DNA is digested with a restriction enzyme (EcoRI or BamHI, respectively) and cloned in the appropriate vector. The RBR library can be constructed in different ways. For example, DNA isolated en masse from a slalom R library is digested with BamHI, circularized in the presence of the KanR gene and plated on agar with kanamycin (Fig. 3, right). The clones isolated in this manner will be practically identical in structure (only EcoRI–EcoRI fragments will be missing) to the clones from an EcoRI jumping library prepared in the classical way (8,16).
By comparing end sequences of the B library clones with the internal BamHI sequences (from the marker fragments) of the slalom RBR library clones, the BamHI clones can be positioned relative to each other. After comparing end sequences in the R and RBR libraries, EcoRI clones can be positioned relative to each other. Finally, EcoRI and BamHI clones can be assembled into a contig representing their physical positions in the complete genome.
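The two comparison steps described above can be sketched as a small linking procedure. The Python example below is a simplified illustration under stated assumptions, not the program used in this work: every end read is reduced to an abstract tag, each connecting (RBR) clone is assumed to carry two EcoRI end tags and two BamHI marker tags, the contig is assumed to be linear and unbranched, and all clone names, tag names and the function `order_clones` are hypothetical.

```python
from collections import defaultdict

# Toy data (hypothetical). A tag stands for the short sequence read next to a site,
# e.g. "b1L" is the sequence to the left of BamHI site 1.
B_CLONES = {"B_c": ("b2R", "b3L"), "B_a": ("b0R", "b1L"), "B_b": ("b1R", "b2L")}
R_CLONES = {"R_y": ("e2R", "e3L"), "R_x": ("e1R", "e2L")}
RBR_CLONES = {
    "RBR_y": {"eco": ("e2R", "e3L"), "bam": ("b2L", "b2R")},
    "RBR_x": {"eco": ("e1R", "e2L"), "bam": ("b1L", "b1R")},
}

def order_clones(b_clones, rbr_clones, r_clones):
    """Order B and R clones along a linear contig using connecting clones."""
    # Index B clones by end tag and R clones by their pair of EcoRI end tags.
    tag_to_b = {tag: name for name, tags in b_clones.items() for tag in tags}
    ends_to_r = {frozenset(tags): name for name, tags in r_clones.items()}

    # Each connecting clone bridges two B clones and identifies one R clone.
    neighbours = defaultdict(dict)  # B clone -> {adjacent B clone: bridging R clone}
    for rbr in rbr_clones.values():
        left, right = (tag_to_b.get(t) for t in rbr["bam"])
        r_name = ends_to_r.get(frozenset(rbr["eco"]))
        if left and right and left != right:
            neighbours[left][right] = r_name
            neighbours[right][left] = r_name

    # Start at a B clone with a single neighbour (one end of the contig).
    start = min(b for b, n in neighbours.items() if len(n) == 1)
    contig, seen, cur = [start], {start}, start
    while True:
        nxt = [b for b in neighbours[cur] if b not in seen]
        if not nxt:
            return contig
        contig += [neighbours[cur][nxt[0]], nxt[0]]
        seen.add(nxt[0])
        cur = nxt[0]

contig = order_clones(B_CLONES, RBR_CLONES, R_CLONES)
print(contig)
```

The result interleaves BamHI and EcoRI clones in their physical order. Real data would additionally need handling of unmatched tags and of branches caused by repeats, which this sketch leaves out.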
It is clear that a type II slalom library can be used in an express version, like a type I slalom library. Only two libraries are used: a B library that will be sequenced with the standard reverse and forward primers and a slalom RBR library that will be sequenced using marker-specific primers. This scheme is, in reality, identical to the scheme shown in Figure 2, the only difference being that in the original scheme R and slalom BR libraries are used.
Mapping and sequencing of PAC clones 36b12 and 55a10
To demonstrate the feasibility of the method, two PAC clones (36b12 and 55a10), each containing an ~100 kb insert of human DNA, were mapped and partially sequenced using the type II slalom library. In total, 180 subclones of PAC clone 36b12 and 204 subclones of PAC clone 55a10 were sequenced (Table 1). The average length of the sequence reads was 680 bp and the accuracy was 99.5% (compared to the human genome sequence available in the EMBL database). A single read sequence from one of several identical clones/sequences was used for the alignment.
Initially, all clones were sequenced with the reverse primer and only unique clones were sequenced with the forward primer and, if necessary, with the kanamycin-specific primer.
Altogether, 342 kb of sequence was generated, corresponding to a 1.5-fold coverage of the PAC clones. Five clones were present only once and the others 2–15 times. These sequences were sufficient to order clones and to cover the two PACs completely. A minimal tiling set of the clones was also established (Figs 5 and 6). Because plasmid DNA was always isolated in batches of 96 clones using a Biorobot 9600 and sequenced on an ABI377 analyzer, it is quite possible that the same results could have been achieved with less sequencing.
A BLAST (19) search of the sequences showed that these regions of the human genome had already been sequenced. PAC clone 36b12 represents sequences from 37 455 to 131 808 (GenBank accession no. AC008064) and PAC clone 55a10 represents sequences 12 936 to 110 500 (GenBank accession no. NT_000067). The alignment of our map with the sequenced human genomic fragments showed that some small BamHI and EcoRI fragments were missing in our scheme. Alternatively, they may represent restriction fragment polymorphisms. However, this does not constitute a problem for the final tiling paths that completely cover the entire PACs.
If we used only our sequence information (i.e. as in sequencing a new genome) then a complete contig of overlapping clones could be established and this approach would generate 19.7 kb of PAC 36b12 (20.9% of all insert sequence) and 22.4 kb (23.0%) of PAC 55a10. These will be ordered sequences, i.e. the distance between sequences will be known because the insert size in each clone was established before sequencing (see Materials and Methods). The largest sequence contig was 4.7 kb (average size ~1 kb) and BamHI and EcoRI sequences overlapped only eight times. These overlaps were together less than 3.7 kb. If we added mapping information from accession nos AC008064 and NT_000067 (i.e. as in comparing two related genomes) it would generate a total of 47.3 kb of sequence for PAC 36b12 (50.1%) and 41.5 kb for PAC 55a10 (42.5%). It is worthwhile mentioning that the SEMA IV gene was successfully detected with these slalom libraries.
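The percentages quoted above follow directly from the GenBank coordinates given for the two PAC inserts. The short Python calculation below reproduces them as a consistency check; it assumes that the coordinates are inclusive 1-based positions, and the variable and function names are illustrative.

```python
# Coordinates of each PAC insert within its GenBank record (from the text).
PAC_SPANS = {"36b12": (37_455, 131_808), "55a10": (12_936, 110_500)}

# Ordered sequence in kb: (own data only, with added mapping information).
ORDERED_KB = {"36b12": (19.7, 47.3), "55a10": (22.4, 41.5)}

def insert_kb(start, end):
    """Insert length in kb, assuming inclusive 1-based coordinates."""
    return (end - start + 1) / 1000

fractions = {
    pac: tuple(round(100 * kb / insert_kb(*PAC_SPANS[pac]), 1)
               for kb in ORDERED_KB[pac])
    for pac in PAC_SPANS
}
print(fractions)
```

With these inputs the inserts come out at about 94.4 kb and 97.6 kb, and the computed fractions agree with the 20.9%, 50.1%, 23.0% and 42.5% stated in the text.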
Interestingly, large BamHI and EcoRI fragments (>20 kb) were successfully cloned, but the choice of enzymes was not optimal to obtain maximum sequencing information, because some fragments were too large (the largest fragment was almost 27 kb in size). The optimal size of the fragments for slalom libraries is 3–4 kb, and the optimal combination of enzymes should be established for each genome before construction of slalom libraries.
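When a related sequence is already available, candidate enzymes can be compared by a simple in silico digest before any library is made. The Python sketch below is one possible way to do this and is not a procedure from this work: the function names, the toy sequence and the size window are hypothetical, and the cut offset of 1 reflects the known top-strand cut positions of EcoRI (G^AATTC) and BamHI (G^GATCC).

```python
import re

def fragment_lengths(seq, motif, cut_offset=1):
    """Fragment lengths from a complete digest of a linear sequence.

    cut_offset is the cut position inside the recognition site; 1 fits
    both EcoRI (G^AATTC) and BamHI (G^GATCC) on the top strand.
    """
    cuts = [m.start() + cut_offset for m in re.finditer(motif, seq.upper())]
    bounds = [0] + cuts + [len(seq)]
    return [b - a for a, b in zip(bounds, bounds[1:])]

def share_in_window(lengths, low, high):
    """Fraction of fragments whose length lies within [low, high]."""
    return sum(low <= n <= high for n in lengths) / len(lengths)

# Hypothetical toy sequence; a real test would load a genome or clone sequence.
toy = "A" * 10 + "GAATTC" + "C" * 20 + "GAATTC" + "T" * 5
eco_lengths = fragment_lengths(toy, "GAATTC")
print(eco_lengths, share_in_window(eco_lengths, 10, 20))
```

For a real genome the window would be set to the 3–4 kb range mentioned above (3000 to 4000 bp), and the enzyme pair giving the largest share of fragments in that range would be a reasonable first candidate.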
Perspectives
The major difference between the slalom library mapping/sequencing approach and the WGS strategy is that the clones are generated according to a specific scheme and using complete digestion. As a result, the number of variants required to cover the whole genome decreases significantly. The preparation of libraries for the slalom approach is remarkably simple: only complete digestion with EcoRI or BamHI is used. There is no need for size separation, agarose gel purification or establishing conditions for partial shearing/digestion. It is important to mention that there is no need to keep all slalom clones because sequencing information can be used to design PCR primers and even large fragments (up to 40–50 kb) can be amplified by long-range PCR.
The slalom library approach differs fundamentally from the shotgun sequencing approach with respect to the efficiency of assembly (EOA). The EOA for the latter method is strictly dependent on the length of sequencing reads. The longer the reads, the higher the EOA. As the slalom approach uses non-random fractionation of the DNA and each start site is tightly linked with the recognition site for the restriction enzyme, even very short sequences will, in principle, be enough to create a contig of the overlapping clones. The EOA of the slalom library approach is, therefore, essentially independent of read length. Even the short sequences generated by pyrosequencing or MPSS (17,18) should suffice, in principle. As one person can generate thousands of sequences a day using a pyrosequencer, the minimal set of overlapping clones covering a 4 Mb genome can be completed in a couple of days.
We decided to test whether smaller flanking sequences can be successfully used for the ordering of clones. After shortening all sequences to only 20 bp from the restriction enzyme recognition sequences, the EcoRI and BamHI sites, it was possible to successfully reconstruct the same clone contigs. Therefore, pyrosequencing can be combined with this technique to generate a minimal tiling path and only unique clones need be selected to generate sequence information. The use of this high throughput technique (which is not compatible with the shotgun sequencing approach) highlights the advantage of our slalom approach: 4% coverage is enough to construct a complete contig of overlapping clones covering both PACs. As was mentioned before, sequencing of unique clones produced >20% of ordered sequences and for comparative analysis ~50% of all sequences.
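The 20 bp test described above amounts to matching reads by a short prefix, since every read starts at a restriction site. The Python sketch below illustrates the idea under simplifying assumptions: reads that cover the same site are assumed to start at the same base and run in the same direction, so no reverse-complement handling is included. The read names, the toy sequences and the function `match_by_prefix` are hypothetical.

```python
from collections import defaultdict

def match_by_prefix(reads, k=20):
    """Group read names whose first k bases are identical."""
    groups = defaultdict(list)
    for name, seq in reads.items():
        if len(seq) >= k:
            groups[seq[:k].upper()].append(name)
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)

# Hypothetical 20-base flanks next to two different BamHI sites.
flank1 = "ACGTTAGCATCGGATTACGA"
flank2 = "TTGACCGATAGCTAGGCTAA"

# Toy reads of different lengths; reads from the same site share the flank.
reads = {
    "B_a.rev": flank1 + "GGGTTT",
    "RBR_x.kan": flank1 + "GGG",
    "B_b.fwd": flank2 + "AAAC",
    "RBR_y.kan": flank2 + "A",
    "R_z.fwd": "C" * 22,
}

pairs = match_by_prefix(reads, k=20)
print(pairs)
```

Each matched group links a clone from one library to a clone from another, which is the information needed for contig construction. Longer reads add sequence but not additional links in this toy case.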
Before sequencing we usually check the quality of plasmid DNA by agarose gel electrophoresis. Therefore, the size of inserts is known and the distance between sequences can be easily determined. This means that even without sequencing a given genome, its size can be precisely established.
We specifically selected human DNA for testing the slalom scheme because human DNA contains a number of repetitive sequences that create serious problems in establishing the complete human genome sequence. As was mentioned in Materials and Methods, to evaluate the significance of this problem, we did not use RepeatMasker. The content of repeat sequences in the PAC clones is shown in Table 2. One PAC clone (36b12) contained more repeats than the human genome on average and another (55a10) slightly less. This difference may result from different GC contents: high for 55a10 (54%) and low for 36b12 (40%). Altogether, these two PACs contained a rather large fraction of repeat sequences, constituting a good representation of the human genome, and we did not have any problems with repeats, even when using only 20 bp end sequences. Repeat sequences may in fact be less of a problem for the slalom approach than for the shotgun sequencing approach for a number of reasons. First, restriction enzymes can be selected that do not cut inside the major repeats, i.e. Alu repeats, LINE repeats, etc. Secondly, long repeats will have unique positions in slalom clones as the second recognition site will be outside the repeat. Thirdly, short repeated elements can be sequenced without serious problems because they will be located within one particular clone (one insert), in contrast to the shotgun sequencing approach where the particular repeat is represented in many different clones and the main problem is to understand if it is really the same repeat or closely related.
Based on these experiments, we believe that the slalom library approach is suitable for at least two major applications. (i) For mapping and sequencing large genomes (e.g. of mammals). In this case, the method should be applied to the sequencing of clones with large inserts, e.g. BACs or PACs. (ii) For mapping and sequencing small genomes (e.g. bacterial). Here, the method can be applied directly to the whole genome.
It is important to stress that the benefits of the slalom approach are most obvious in comparative sequencing experiments in combination with high throughput techniques like pyrosequencing or MPSS.
For partial genome sequencing (e.g. for comparing bacterial strains), the minimal set of clones covering the genome is established using pyrosequencing and then sequences covering 50% of the whole genome are generated. The efficiency of this approach will be close to 100% (unique sequence information compared to all generated sequences). Sequenced islands will be separated by gaps, but clones covering these gaps will be available and the order of islands/gaps will be known. Therefore, it will be easy to identify interesting genomic regions, e.g. a pathogenic island, and sequence it.
If the aim is to completely sequence a given genome, then sequence gaps can be filled using any of the standard methods, such as primer walking or transposon-mediated sequencing (20). Of course, closing of the gaps will be done with significantly lower efficiency. However, the finishing stages of the shotgun sequencing approach are also the most expensive and time consuming part of the process. Lander et al. (6) distinguished three types of gaps: gaps within unfinished sequenced clones; gaps between sequenced clone contigs, but within fingerprint clone contigs; gaps between fingerprint clone contigs. The first type is the simplest and the third is the most complicated to close because constructing a contig of overlapping clones is the most difficult procedure. With the slalom approach we already have a contig of overlapping clones and thus we will only suffer from the first, simplest type of gap. It is important to mention another difference in the finishing stages of these two approaches. With the shotgun sequencing approach sequences from different clones must be connected and here highly related repeats, gene families and polymorphisms will represent a major problem. These problems are non-existent in the slalom approach, where a single insert should be sequenced.
In summary, in this study we applied a novel strategy for the mapping and partial sequencing of two human PACs (230 kb) which contained ~38% different repeats. A 1.5-fold sequence coverage was achieved (342 kb). Without added mapping information and at the same sequence coverage (1.5-fold) the shotgun sequencing approach can generate practically no ordered sequence information/contigs of overlapping clones. The slalom approach generated a complete contig of overlapping clones and >20% of sequences were ordered (in addition, 25–30% of sequences were not ordered, but can be ordered using additional mapping information). Moreover, as calculations showed, in our particular experiment ~4% coverage would be enough to construct the contig of the overlapping clones and subsequently generate >20% ordered sequences with almost 100% efficiency, i.e. with 0.2-fold coverage. The benefits of the slalom approach will be most obvious in comparative sequencing experiments in combination with new high throughput sequencing technologies which cannot be used in the shotgun sequencing approach.
ACKNOWLEDGEMENTS
The authors are grateful to Dr Michael Lerman for fruitful discussions and valuable advice. This work was supported by research grants from the Swedish Cancer Society, the Swedish Research Council for Engineering Sciences, Ingabritt och Arne Lundbergs Forskningsstiftelse, the Royal Swedish Academy of Sciences, Pharmacia Corporation Center for Genomics Research and the Karolinska Institute.
FOOTNOTES
* To whom correspondence should be addressed at: Microbiology and Tumor Biology Center, Karolinska Institute, 171 77 Stockholm, Sweden. Tel: +46 8 728 67 50; Fax: +46 8 31 94 70; Email: eugzab{at}ki.se
REFERENCES
1 Dunham,I., Hunt,A.R., Collins,J.E., Bruskiewich,R., Beare,D.M., Clamp,M., Smink,L.J., Ainscough,R., Almeida,J.P., Babbage,A. et al. (1999) The DNA sequence of human chromosome 22. Nature, 402, 489–495.
2 Hattori,M., Fujiyama,A., Taylor,T.D., Watanabe,H., Yada,T., Park,H.-S., Toyoda,A., Ishii,K., Totoki,Y., Choi,D.-K. et al. (2000) The DNA sequence of human chromosome 21. Nature, 405, 311–319.
3 Adams,M.D., Celniker,S.E., Holt,R.A., Evans,C.A., Gocayne,J.D., Amanatides,P.G., Scherer,S.E., Li,P.W., Hoskins,R.A., Galle,R.F. et al. (2000) The genome sequence of Drosophila melanogaster. Science, 287, 2185–2195.
4 Myers,E.W., Sutton,G.G., Delcher,A.L., Dew,I.M., Fasulo,D.P., Flanigan,M.J., Kravitz,S.A., Mobarry,C.M., Reinert,K.H., Remington,K.A. et al. (2000) A whole-genome assembly of Drosophila. Science, 287, 2196–2204.
5 Broder,S. and Venter,J.C. (2000) Sequencing the entire genomes of free-living organisms: the foundation of pharmacology in the new millennium. Annu. Rev. Pharmacol. Toxicol., 40, 97–132.
6 Lander,E.S., Linton,L.M., Birren,B., Nusbaum,C., Zody,M.C., Baldwin,J., Devon,K., Dewar,K., Doyle,M., FitzHugh,W. et al. (2001) Initial sequencing and analysis of the human genome. International Human Genome Sequencing Consortium. Nature, 409, 860–921.
7 Venter,J.C., Adams,M.D., Myers,E.W., Li,P.W., Mural,R.J., Sutton,G.G., Smith,H.O., Yandell,M., Evans,C.A., Holt,R.A. et al. (2001) The sequence of the human genome. Science, 291, 1304–1351.
8 Zabarovsky,E.R., Boldog,F., Erlandsson,R., Allikmets,R.L., Kashuba,V.I., Marcsek,Z., Stanbridge,E., Sumegi,J., Klein,G. and Winberg,G. (1991) New strategy for mapping the human genome based on a novel procedure for construction of jumping libraries. Genomics, 11, 1030–1039.
9 Zabarovsky,E.R., Kashuba,V.I., Zakharyev,V.M., Petrov,N., Pettersson,B., Lebedeva,T., Gizatullin,R., Pokrovskaya,E.S., Bannikov,V.M., Zabarovska,V.I. et al. (1994) Shot-gun sequencing strategy for long-range genome mapping: a pilot study. Genomics, 21, 495–500.
10 Zabarovsky,E.R., Kashuba,V.I., Gizatullin,R.Z., Winberg,G., Zabarovska,V.I., Erlandsson,R., Domninsky,D.A., Bannikov,V.M., Pokrovskaya,E., Kholodnyuk,I. et al. (1996) NotI jumping and linking clones as a tool for genome mapping and analysis of chromosome rearrangements in different tumors. Cancer Detect. Prev., 20, 1–10.
11 Kashuba,V.I., Szeles,A., Allikmets,R., Nilsson,A.S., Bergerheim,U.S., Modi,W., Grafodatsky,A., Dean,M., Stanbridge,E.J., Winberg,G. et al. (1995) A group of NotI jumping and linking clones cover 2.5 Mb in the 3p21-p22 region suspected to contain a tumor suppressor gene. Cancer Genet. Cytogenet., 81, 144–150.
12 Kashuba,V.I., Gizatullin,R.G., Protopopov,A.I., Allikmets,R., Korolev,S., Li,J., Boldog,F., Tory,K., Zabarovska,V.I., Marcsek,Z. et al. (1997) NotI linking/jumping clones of human chromosome 3: mapping of the TFRC, RAB7 and HAUSP genes to regions rearranged in leukemia and deleted in solid tumors. FEBS Lett., 419, 181–185.
13 Kashuba,V.I., Gizatullin,R.Z., Protopopov,A.I., Li,J., Vorobieva,N.V., Fedorova,L., Zabarovska,V.I., Muravenko,O.V., Kost-Alimova,M., Domninsky,D.A. et al. (1999) Analysis of NotI linking clones isolated from human chromosome 3 specific libraries. Gene, 239, 259–271.
14 Collins,F.S. and Weissman,S.M. (1984) Directional cloning of DNA fragments at a large distance from an initial probe: a circularisation method. Proc. Natl Acad. Sci. USA, 81, 6812–6816.
15 Poustka,A., Pohl,T.M., Barlow,D.P., Frischauf,A.M. and Lehrach,H. (1987) Construction and use of chromosome jumping libraries from NotI-digested DNA. Nature, 325, 353–355.
16 Zabarovsky,E.R., Winberg,G. and Klein,G. (1993) The SK-diphasmids – vectors for genomic, jumping and cDNA libraries. Gene, 127, 1–14.
17 Ronaghi,M., Uhlen,M. and Nyren,P. (1998) A sequencing method based on real-time pyrophosphate. Science, 281, 363–365.
18 Brenner,S., Johnson,M., Bridgham,J., Golda,G., Lloyd,D.H., Johnson,D., Luo,S., McCurdy,S., Foy,M., Ewan,M. et al. (2000) Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nat. Biotechnol., 18, 630–634.
19 Altschul,S.F., Gish,W., Miller,W., Myers,E.W. and Lipman,D.J. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403–410.
20 Haapa,S., Taira,S., Heikkinen,E. and Savilahti,H. (1999) An efficient and accurate integration of mini-Mu transposons in vitro: a general methodology for functional genetic analysis and molecular biology applications. Nucleic Acids Res., 27, 2777–2784.