Nucleic Acids Research, Vol 24, Issue 21 4263-4272, Copyright © 1996 by Oxford University Press


ARTICLES

Frequent oligonucleotides and peptides of the Haemophilus influenzae genome

S Karlin, J Mrazek and AM Campbell
Department of Mathematics, Stanford University, CA 94305-2125, USA.

The complete Haemophilus influenzae genome (1.83 Mb, Rd strain) provides opportunities for characterizing global genomic inhomogeneities and for detecting important sequence signals. Along these lines, new methods for identifying frequent words (oligonucleotides and/or peptides) and their distributions are applied to the H.influenzae genome, with some comparisons and contrasts made with frequent words of other bacterial genomes. Three major classes of frequent oligonucleotides stand out: (i) oligos related to the familiar uptake signal sequences (USSs), AAGTGCGGT (USS+) and its inverted complement (USS-), (ii) multiple tetranucleotide iterations and (iii) intergenic dyad sequences (IDSs) found as AAGCCCACCCTAC and its dyad form. The USS+ and USS- occur in almost equal counts, are remarkably evenly spaced around the genome, and appear predominantly in the same reading frame of protein coding domains (USS+ translated to Ser-Ala-Val, USS- translated to Thr-Ala-Leu). These observations suggest that USSs contribute to global genomic functions, for example, in replication and/or repair processes, or as membrane attachment sites, or as sequences helping to pack DNA. The long tetranucleotide iterations, virtually unique to H.influenzae (i.e., unknown in other prokaryotes), may, through polymerase slippage during replication and/or homologous recombination, produce subpopulations expressing alternative proteins. The 13 bp frequent IDS words, invariably intergenic, occur mostly in clusters and provide potential for complex secondary structures, suggesting that these sequences may be important signals for regulating the activity of their flanking genes.
The frequent oligopeptides of H.influenzae are principally of two kinds: those induced by oligonucleotide frequent words (USSs, tetranucleotide iterations), and those associated with ATP or GTP binding sites that are generally composed of three motifs: the A-box, which contributes to delineating the binding pocket; the B-box, which functions in hydrolysis; and the C-box, whose function is unknown. The A-box occurs fairly universally in prokaryotes and eukaryotes. The B- and C-motifs appear to be specialized to various functional groups (e.g., transport, recombination, chaperone activity). Other putative motifs correspond to homologs of Escherichia coli motifs, for example, those associated with proteins of transcriptional processing, aminoacyl-tRNA synthetases and proteins functioning in electron transfer.
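
The USS observations in the abstract can be illustrated with a short Python sketch. It is not the authors' method, which the abstract does not describe in detail. It only shows, under simple assumptions, how one might form the inverted complement of USS+, count overlapping occurrences of a word, tabulate frequent words of a fixed length, and check the reading-frame translations quoted above. The function names (reverse_complement, count_overlapping, frequent_words, translate) and the toy sequence are hypothetical and chosen for illustration; the codon table is the standard genetic code.

```python
from collections import Counter

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the inverted complement of a DNA string (A, C, G, T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def count_overlapping(genome, word):
    """Count occurrences of word in genome, allowing overlaps."""
    count = 0
    start = genome.find(word)
    while start != -1:
        count += 1
        start = genome.find(word, start + 1)
    return count


def frequent_words(genome, k, min_count):
    """Return the k-letter words that occur at least min_count times."""
    counts = Counter(genome[i:i + k] for i in range(len(genome) - k + 1))
    return {word: n for word, n in counts.items() if n >= min_count}


def translate(seq):
    """Translate complete codons from position 0; a trailing partial codon is ignored."""
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


USS_PLUS = "AAGTGCGGT"                    # from the abstract
USS_MINUS = reverse_complement(USS_PLUS)  # inverted complement of USS+

# Hypothetical toy sequence containing one copy of each word.
toy = "TT" + USS_PLUS + "CC" + USS_MINUS + "GG"
print(USS_MINUS, count_overlapping(toy, USS_PLUS), count_overlapping(toy, USS_MINUS))

# USS- read from its first base gives Thr-Ala-Leu (one-letter codes TAL).
print(translate(USS_MINUS))
# USS+ read from its second base gives Ser-Ala, then GT plus any base (Val).
print(translate(USS_PLUS[1:] + "A"))
```

In this sketch the Val residue of USS+ is completed by one flanking base, since the codon beginning GT spans the end of the 9 bp word; any fourth base gives valine. A real analysis would also need to handle both strands of the genome and ambiguous bases, which are left out here.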


