Nucleic Acids Research, 2002, Vol. 30, No. 14 3163-3170
© 2002 Oxford University Press
NotI flanking sequences: a tool for gene discovery and verification of the human genome
1 Center for Genomics and Bioinformatics and 2 Microbiology and Tumor Biology Center, Karolinska Institute, 171 77 Stockholm, Sweden, 3 Institute of Cytology and Genetics, Russian Academy of Science, 630 090 Novosibirsk, Russia and 4 Engelhardt Institute of Molecular Biology, Russian Academy of Sciences, 119 991 Moscow, Russia
*To whom correspondence should be addressed at: Microbiology and Tumor Biology Center, Karolinska Institute, Box 280, 171 77 Stockholm, Sweden. Tel: +46 8 728 6750; Fax: +46 8 319 470; Email: eugzab{at}ki.se
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors
Accession nos AQ936570–AQ939834, AJ322533–AJ343893
A set of 22 551 unique human NotI flanking sequences (16.2 Mb) was generated. More than 40% of the set had regions with significant similarity to known proteins and expressed sequences. The data demonstrate that regions flanking NotI sites are less likely to form nucleosomes efficiently and resemble promoter regions. The draft human genome sequence contained 55.7% of the NotI flanking sequences, Celera's database contained matches to 57.2% of the clones and all public databases (including non-human and previously sequenced NotI flanks) matched 89.2% of the NotI flanking sequences (identity ≥90% over at least 50 bp, data from December 2001). The data suggest that the shotgun sequencing approach used to generate the draft human genome sequence resulted in a bias against cloning and sequencing of NotI flanks. A rough estimation (based primarily on chromosomes 21 and 22) is that the human genome contains 15 000–20 000 NotI sites, of which 6000–9000 are unmethylated in any particular cell. The results of the study suggest that the existing tools for computational determination of CpG islands fail to identify a significant fraction of functional CpG islands, and unmethylated DNA stretches with a high frequency of CpG dinucleotides can be found even in regions with low CG content.
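To make the kind of computation discussed above concrete, the following is a minimal Python sketch, not the authors' pipeline. It locates NotI recognition sites (GCGGCCGC) in a sequence and applies a simple CpG island test to the flanking window. The function names, the 100 bp flank size and the thresholds (GC fraction above 0.5, observed/expected CpG ratio above 0.6, in the spirit of commonly used CpG island criteria) are all illustrative assumptions and do not come from the article.

```python
import re

NOTI_SITE = "GCGGCCGC"  # NotI recognition sequence


def find_noti_sites(seq):
    """Return 0-based start positions of every NotI site in seq."""
    seq = seq.upper()
    # A lookahead is used so that overlapping sites are also reported.
    return [m.start() for m in re.finditer(f"(?={NOTI_SITE})", seq)]


def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for seq."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")
    gc_fraction = (c + g) / n
    expected = c * g / n  # expected CpG count if C and G were independent
    ratio = cpg / expected if expected else 0.0
    return gc_fraction, ratio


def flank_looks_like_cpg_island(seq, site_pos, flank=100,
                                min_gc=0.5, min_ratio=0.6):
    """Apply assumed (hypothetical) thresholds to the window around a site."""
    start = max(0, site_pos - flank)
    end = min(len(seq), site_pos + len(NOTI_SITE) + flank)
    gc, ratio = cpg_stats(seq[start:end])
    return gc > min_gc and ratio > min_ratio
```

A fixed-threshold test of this kind illustrates the limitation the abstract points to: a short CpG-rich stretch embedded in a region with low overall CG content can fail the window-wide GC threshold even though the stretch itself is CpG dense.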

