Nucleic Acids Research, 1994, Vol. 22, No. 22 4756-4767
© 1994


COMPUTATIONAL BIOLOGY

Intrinsic and extrinsic approaches for detecting genes in a bacterial genome

Mark Borodovsky, Kenneth E. Rudd¹ and Eugene V. Koonin¹

School of Biology, Georgia Institute of Technology, Atlanta, GA 30332-0230 and ¹National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA

Received June 21, 1994. Revised September 28, 1994. Accepted September 28, 1994.

The unannotated regions of the Escherichia coli genome DNA sequence from the EcoSeq6 database, comprising 1,278 ‘intergenic’ sequences with a combined length of 359,279 base pairs, were analyzed using computer-assisted methods with the aim of identifying putative unknown genes. The proposed strategy for finding new genes includes two key elements: i) prediction of expressed open reading frames (ORFs) using the GeneMark method, which is based on Markov chain models for coding and non-coding regions of Escherichia coli DNA, and ii) a search for protein sequence similarities using programs based on the BLAST algorithm and programs for motif identification. A total of 354 putative expressed ORFs were predicted by GeneMark. Using the BLASTX and TBLASTN programs, it was shown that 208 ORFs located in the unannotated regions of the E. coli chromosome are significantly similar to other protein sequences. Identification of 182 ORFs as probable genes was supported by both GeneMark and BLAST, comprising 51.4% of the GeneMark ‘hits’ and 87.5% of the BLAST ‘hits’. Seventy-three putative new genes, comprising 20.6% of the GeneMark predictions, belong to ancient conserved protein families that include both eubacterial and eukaryotic members. This value is close to the overall proportion of highly conserved sequences among eubacterial proteins, indicating that the majority of the putative expressed ORFs that are predicted by GeneMark, but have no significant BLAST hits, are nevertheless likely to be real genes. The majority of the putative genes identified by BLAST search have been described since the release of the EcoSeq6 database, but about 70 genes have not been detected so far. Among these new identifications are genes encoding proteins with a variety of predicted functions, including dehydrogenases, kinases, several other metabolic enzymes, ATPases, rRNA methyltransferases, membrane proteins, and different types of regulatory proteins.
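The percentages quoted in the abstract follow directly from the three ORF counts it reports. The short Python sketch below recomputes them as a consistency check. The function name `overlap_summary` and the derived quantities (ORFs supported by only one method, the size of the union, the mean region length) are illustrative additions that do not appear in the paper; they assume that the 182 ORFs are exactly the intersection of the 354 GeneMark predictions and the 208 BLAST hits.

```python
def overlap_summary(genemark_hits: int, blast_hits: int, both: int) -> dict:
    """Summarize agreement between two gene-finding methods.

    Assumes `both` is the number of ORFs found by both methods,
    i.e. the intersection of the two hit sets (an interpretation of
    the abstract, not a statement taken from the paper's methods).
    """
    return {
        # share of each method's hits that the other method confirms
        "pct_of_genemark": 100.0 * both / genemark_hits,
        "pct_of_blast": 100.0 * both / blast_hits,
        # hits supported by one method only (derived, not reported)
        "genemark_only": genemark_hits - both,
        "blast_only": blast_hits - both,
        # ORFs supported by at least one method (derived, not reported)
        "union": genemark_hits + blast_hits - both,
    }


# Counts taken from the abstract.
summary = overlap_summary(genemark_hits=354, blast_hits=208, both=182)

# 73 GeneMark-predicted ORFs fall in ancient conserved families.
pct_conserved = 100.0 * 73 / 354

# Mean length of an unannotated region (derived, not reported).
mean_region_bp = 359_279 / 1_278

print(f"Confirmed by BLAST:    {summary['pct_of_genemark']:.1f}% of GeneMark hits")
print(f"Confirmed by GeneMark: {summary['pct_of_blast']:.1f}% of BLAST hits")
print(f"Conserved families:    {pct_conserved:.1f}% of GeneMark hits")
print(f"Mean region length:    {mean_region_bp:.1f} bp")
```

Under the stated assumption, the recomputed shares match the 51.4%, 87.5% and 20.6% given in the abstract, and the two methods together would support 380 distinct ORFs.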





