Nucleic Acids Research, Vol 24, Issue 21 4263-4272, Copyright © 1996 by Oxford University Press
S Karlin, J Mrazek and AM Campbell
The complete Haemophilus influenzae genome (1.83 Mb, Rd strain) provides
opportunities for characterizing global genomic inhomogeneities and for
detecting important sequence signals. Along these lines, new methods for
identifying frequent words (oligonucleotides and/or peptides) and their
distributions are applied to the H.influenzae genome with some comparisons
and contrasts made with frequent words of other bacterial genomes. Three
major classes of frequent oligonucleotides stand out: (i) oligos related to
the familiar uptake signal sequences (USSs), AAGTGCGGT (USS+) and its
inverted complement (USS-), (ii) multiple tetranucleotide iterations and
(iii) intergenic dyad sequences (IDSs) found as AAGCCCACCCTAC and its dyad
form. The USS+ and USS- occur in almost equal counts, are remarkably evenly
spaced around the genome, and appear predominantly in the same reading
frame of protein coding domains (USS+ translated to Ser-Ala-Val, USS-
translated to Thr-Ala-Leu). These observations suggest that USSs contribute
to global genomic functions, for example, in replication and/or repair
processes, or as membrane attachment sites, or as sequences helping to pack
DNA. The long tetranucleotide iterations, virtually unique to H.influenzae
(i.e., unknown in other prokaryotes), may produce subpopulations
expressing alternative proteins through polymerase slippage during
replication and/or homologous recombination. The 13 bp frequent IDS words, invariably
intergenic, occur mostly in clusters and provide potential for complex
secondary structures suggesting that these sequences may be important
signals for regulating the activity of their flanking genes. The frequent
oligopeptides of H.influenzae are principally of two kinds: those induced
by oligonucleotide frequent words (USSs, tetranucleotide iterations), and
those associated with ATP or GTP binding sites that are generally composed
of three motifs: the A-box which contributes to delineating the binding
pocket; the B-box which functions in hydrolysis; and the C-box whose
function is unknown. The A-box occurs fairly universally in prokaryotes and
eukaryotes. The B- and C-motifs appear to be specialized to various
functional groups (e.g., transport, recombination, chaperone activity).
Other putative motifs correspond to homologs of Escherichia coli motifs,
for example, those associated with proteins of transcriptional processing,
aminoacyl-tRNA synthetases and proteins functioning in electron transfer.
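
As an illustration of the USS statements above, the following minimal Python sketch derives the inverted complement of USS+ and translates both words in the reading frames implied by the abstract. It is not the authors' method: the function names, the reading-frame offsets (1 for USS+, 0 for USS-), the use of the standard genetic code, and the toy sequence are all assumptions made for this example.

```python
from itertools import product

# Standard genetic code (assumed), codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def inverted_complement(seq):
    """Return the reverse complement of a DNA word."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq, offset=0):
    """Translate codons starting at `offset`.

    A trailing partial codon is reported only when every possible
    completion encodes the same amino acid (e.g. GT? is always Val).
    """
    peptide = []
    for i in range(offset, len(seq), 3):
        codon = seq[i:i + 3]
        if len(codon) == 3:
            peptide.append(CODON_TABLE[codon])
        else:
            missing = 3 - len(codon)
            options = {CODON_TABLE[codon + "".join(tail)]
                       for tail in product(BASES, repeat=missing)}
            if len(options) == 1:
                peptide.append(options.pop())
    return "".join(peptide)


def count_word(sequence, word):
    """Count occurrences of `word` in `sequence`, overlaps included."""
    count, start = 0, sequence.find(word)
    while start != -1:
        count += 1
        start = sequence.find(word, start + 1)
    return count


USS_PLUS = "AAGTGCGGT"
USS_MINUS = inverted_complement(USS_PLUS)

# Hypothetical toy sequence, not taken from the H.influenzae genome.
TOY = "TT" + USS_PLUS + "CC" + USS_MINUS + "AA"

print(USS_MINUS)                        # inverted complement of USS+
print(translate(USS_PLUS, offset=1))    # one-letter code for USS+
print(translate(USS_MINUS, offset=0))   # one-letter code for USS-
print(count_word(TOY, USS_PLUS), count_word(TOY, USS_MINUS))
```

In one-letter code, S-A-V corresponds to Ser-Ala-Val and T-A-L to Thr-Ala-Leu. The same `count_word` helper could be run over a full genome sequence to tally both orientations, though the paper's own frequent-word statistics are not reproduced here.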
ARTICLES
Frequent oligonucleotides and peptides of the Haemophilus influenzae genome
Department of Mathematics, Stanford University, CA 94305-2125, USA.