Nucleic Acids Research, 2002, Vol. 30, No. 6 1418-1426
© 2002 Oxford University Press
Mining Bacillus subtilis chromosome heterogeneities using hidden Markov models
1 Laboratoire de Mathématique, Informatique et Génome, INRA, Route de Saint-Cyr, F-78026 Versailles cedex, France, 2 Laboratoire de Statistique et Génome, CNRS, Tour Évry2, 523 place des terrasses de l'Agora, F-91034 Évry, France and 3 Laboratoire de Génétique Microbienne, INRA, F-78352 Jouy-en-Josas cedex, France
We apply a new statistical segmentation method to the Bacillus subtilis chromosome sequence. Maximum likelihood estimation of the parameters of a hidden Markov model, based on the expectation-maximization algorithm, makes it possible to segment the DNA sequence according to its local composition. This approach is not based on sliding windows; it separates different compositional classes without prior knowledge of their content, size or location. We compared the compositional classes obtained from the sequence with the annotated DNA physical map, sequence homologies and repeat regions. The first heterogeneity revealed discriminates between the two coding strands and the non-coding regions. Other major heterogeneities then emerge: some are related to horizontal gene transfer, some to the T-enriched composition of the coding strands of hydrophobic proteins, and others to the codon usage fitness of highly expressed genes. Concerning potential and established gene transfers, we found 9 of the 10 known prophages, plus 14 new regions of atypical composition. Some of these regions are flanked by repeats, and most of their genes either have unknown function or show homology to genes involved in secondary catabolism or in metal and antibiotic resistance. Surprisingly, all of the detected regions are richer in A+T than the host genome, which raises the question of their remote sources.
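The segmentation idea summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify the model details, so everything concrete here is an assumption made for illustration: two hidden states, independent (order-0) nucleotide emissions, the starting parameter values, and the function names `forward_backward`, `baum_welch` and `segment`. It shows expectation-maximization (Baum-Welch) fitting followed by labelling each position with its most probable hidden state; it is not the authors' implementation.

```python
import math
import random

ALPHABET = "acgt"

def forward_backward(obs, init, trans, emit):
    """Scaled forward-backward pass (the E-step).

    Returns per-position state posteriors, expected transition counts,
    and the log-likelihood of the observations.
    """
    n, k = len(obs), len(init)
    alpha, scale = [], []
    for t in range(n):
        row = [(init[j] if t == 0 else sum(alpha[t - 1][i] * trans[i][j] for i in range(k)))
               * emit[j][obs[t]] for j in range(k)]
        scale.append(sum(row))
        alpha.append([a / scale[t] for a in row])
    beta = [[1.0] * k for _ in range(n)]
    for t in range(n - 2, -1, -1):
        for i in range(k):
            beta[t][i] = sum(trans[i][j] * emit[j][obs[t + 1]] * beta[t + 1][j]
                             for j in range(k)) / scale[t + 1]
    gamma = [[alpha[t][i] * beta[t][i] for i in range(k)] for t in range(n)]
    xi_sum = [[0.0] * k for _ in range(k)]
    for t in range(n - 1):
        for i in range(k):
            for j in range(k):
                xi_sum[i][j] += (alpha[t][i] * trans[i][j] * emit[j][obs[t + 1]]
                                 * beta[t + 1][j] / scale[t + 1])
    return gamma, xi_sum, sum(math.log(c) for c in scale)

def baum_welch(obs, init, trans, emit, n_iter=20):
    """Run n_iter EM iterations; history holds the log-likelihood before each update."""
    k, m = len(init), len(emit[0])
    history = []
    for _ in range(n_iter):
        gamma, xi_sum, loglik = forward_backward(obs, init, trans, emit)
        history.append(loglik)
        # M-step: re-estimate parameters from expected counts.
        init = gamma[0][:]
        trans = [[xi_sum[i][j] / sum(xi_sum[i]) for j in range(k)] for i in range(k)]
        emit = [[sum(g[j] for g, o in zip(gamma, obs) if o == s) / sum(g[j] for g in gamma)
                 for s in range(m)] for j in range(k)]
    return init, trans, emit, history

def segment(seq, n_iter=20):
    """Fit a 2-state HMM to seq and label each position with its most probable state."""
    obs = [ALPHABET.index(ch) for ch in seq.lower()]
    # Assumed starting values: sticky transitions, slightly different compositions.
    init = [0.5, 0.5]
    trans = [[0.95, 0.05], [0.05, 0.95]]
    emit = [[0.3, 0.2, 0.2, 0.3], [0.2, 0.3, 0.3, 0.2]]
    init, trans, emit, history = baum_welch(obs, init, trans, emit, n_iter)
    gamma, _, _ = forward_backward(obs, init, trans, emit)
    labels = [max(range(len(g)), key=g.__getitem__) for g in gamma]
    return labels, history

if __name__ == "__main__":
    # Synthetic toy sequence (not real B. subtilis data): A+T-rich, then G+C-rich.
    random.seed(0)
    toy = ("".join(random.choices(ALPHABET, weights=[4, 1, 1, 4], k=300))
           + "".join(random.choices(ALPHABET, weights=[1, 4, 4, 1], k=300)))
    labels, history = segment(toy)
    print("positions per state:", labels.count(0), labels.count(1))
    print("log-likelihood, first and last iteration:", history[0], history[-1])
```

No window size appears anywhere in this sketch: segment boundaries come from the fitted transition probabilities, which is the contrast with sliding-window approaches drawn in the abstract. A model closer to the one described would presumably use more hidden states and context-dependent emissions, which this toy version omits.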
* To whom correspondence should be addressed at: Laboratoire de Mathématique, Informatique et Génome, INRA, Route de Saint-Cyr, F-78026 Versailles cedex, France. Tel: +33 1 30 83 33 52; Fax: +33 1 30 83 33 59; Email: nicolas{at}versailles.inra.fr
Present address: Laurent Bize, Laboratoire de Biométrie, INRA, chemin de Borde-Rouge, Auzeville, BP 27, F-31326 Castanet-Tolosan cedex, France




