Nucleic Acids Research
Genes and Proteins of Escherichia coli K-12 (GenProtEC)
ABSTRACT
GenProtEC (Genes and Proteins of E.coli) is a database available on the World Wide Web that centers on the products of Escherichia coli K-12 chromosomal genes. The database lists 4403 genes: 2086 gene products whose physiological function is known empirically to some degree (some better understood than others), 1244 open reading frames (ORFs) whose function can be predicted by sequence similarity with known proteins, and 1073 ORFs of unknown function. For each gene the database contains the gene name and its synonyms, the SWISS-PROT (1) mnemonic for the protein when one has been assigned and its synonyms, the full gene product name, and the Enzyme Commission (EC) number for enzymatic reactions. Up to three literature references are supplied for each entry.
Data on physiological function and sequence similarity are also given. Gene products have been classified by type as an enzyme, a regulator, an RNA, a part of the membrane, a member of a transport system, a protein factor, a carrier, or a part of the structure of the cell other than membrane. Each gene product has been assigned to at least one and at most four of 118 hierarchically arranged categories of physiological function (2,3).
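To make the fields described above concrete, a single entry could be modelled roughly as follows. This is a minimal sketch and not the GenProtEC schema: the class name, the field names, the short type labels and the placeholder values are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# The eight product types named in the abstract (short labels are our own).
PRODUCT_TYPES = {
    "enzyme", "regulator", "RNA", "membrane", "transport",
    "factor", "carrier", "structure",
}


@dataclass
class GeneEntry:
    """Hypothetical record for one gene; not the actual GenProtEC schema."""
    gene_name: str
    product_name: str
    product_type: str
    categories: List[str]                       # one to four physiological categories
    gene_synonyms: List[str] = field(default_factory=list)
    swissprot_name: Optional[str] = None        # only when one has been assigned
    ec_number: Optional[str] = None             # only for enzymatic reactions
    references: List[str] = field(default_factory=list)  # up to three

    def __post_init__(self) -> None:
        if self.product_type not in PRODUCT_TYPES:
            raise ValueError(f"unknown product type: {self.product_type}")
        if not 1 <= len(self.categories) <= 4:
            raise ValueError("an entry needs one to four categories")
        if len(self.references) > 3:
            raise ValueError("at most three literature references per entry")


# The three classes of genes quoted in the abstract add up to the total.
assert 2086 + 1244 + 1073 == 4403

# Placeholder values only; "abcD" is not a real GenProtEC entry.
entry = GeneEntry(
    gene_name="abcD",
    product_name="example product",
    product_type="enzyme",
    categories=["example category"],
)
```

Validating in `__post_init__` keeps the limits stated in the text (one to four categories, at most three references) next to the data they constrain.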
Sequence similarity of each protein to any other E.coli protein is given, permitting the grouping together of E.coli proteins of similar amino acid sequence. The database contains the results of similarity analyses carried out in collaboration with Bernard Labedan (4,5) using the AllAllDB of the Darwin suite at Zurich (6), requiring an alignment of at least 100 amino acids and a PAM (accepted point mutations) score (7) of <200. Almost half of the E.coli K-12 chromosomally encoded proteins had at least one E.coli protein partner with sequence similarity as defined above. Some proteins were essentially fusions of two or more independent proteins that we term `modules'. To avoid artifact and error, proteins consisting of two or more modules of >100 amino acids each were divided, so that each module was treated separately. The resulting 2149 proteins/domains formed 7161 sequence-related pairs. The pairs were linked by chains of similarities into sequence-related groups. There are 602 sequence-related groups of E.coli proteins, ranging in size from 2 to 129 members, and most or all members of each group are related by function as well as by sequence.
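Linking pairs into groups by chains of similarity amounts to finding connected components (single-linkage clustering). The sketch below shows one way to do this with a union-find structure. It is not the Darwin/AllAllDB procedure; the `Pair` type, the function name and the identifiers `P1` to `P7` are hypothetical, and only the two thresholds come from the text.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class Pair(NamedTuple):
    a: str            # first protein or module
    b: str            # second protein or module
    aligned_len: int  # alignment length in amino acids
    pam: float        # PAM score of the alignment


def related_groups(pairs: Iterable[Pair],
                   min_len: int = 100,
                   max_pam: float = 200.0) -> List[List[str]]:
    """Link accepted pairs into groups by chains of similarity."""
    parent: Dict[str, str] = {}

    def find(x: str) -> str:
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for p in pairs:
        # Thresholds from the text: at least 100 residues, PAM below 200.
        if p.aligned_len >= min_len and p.pam < max_pam:
            parent[find(p.a)] = find(p.b)

    groups = defaultdict(list)
    for name in parent:
        groups[find(name)].append(name)
    # Largest groups first, members in alphabetical order.
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


example = [
    Pair("P1", "P2", 150, 120.0),
    Pair("P2", "P3", 210, 180.0),  # chains P3 to P1 through P2
    Pair("P4", "P5", 90, 50.0),    # rejected: alignment too short
    Pair("P6", "P7", 300, 250.0),  # rejected: PAM score too high
]
print(related_groups(example))  # [['P1', 'P2', 'P3']]
```

Note that `P1` and `P3` land in the same group although they were never compared directly, which is what "linked by chains of similarities" implies.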
One can query GenProtEC with a gene name or a synonym, with a SWISS-PROT name or a synonym, with a string describing the gene product, or with a key for a physiological category. Complete pick lists are available for each of these. Information on the gene product and its function is returned, as well as sequence similarities among E.coli proteins. For any protein that has at least one sequence-related partner, the name(s) of all other E.coli proteins in the related group are returned. For any sequence-related pair, the position and length of the alignment for each of the two proteins is given, as well as the percent of the protein aligned, the percent identical amino acids and the PAM score.
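A lookup by name or synonym of the kind described above can be sketched with a simple index. This is an illustration only and says nothing about how GenProtEC is implemented: the dictionary layout, the function names and the placeholder entry are assumptions, as is the choice to match names case-insensitively.

```python
from typing import Dict, List, Optional


def build_name_index(entries: List[dict]) -> Dict[str, dict]:
    """Map every gene name and synonym to its entry (hypothetical layout)."""
    index: Dict[str, dict] = {}
    for entry in entries:
        for name in [entry["gene"], *entry.get("synonyms", [])]:
            index[name.lower()] = entry  # case-insensitive match: an assumption
    return index


def lookup(index: Dict[str, dict], name: str) -> Optional[dict]:
    """Return the entry for a gene name or synonym, or None if absent."""
    return index.get(name.lower())


def percent_aligned(alignment_len: int, protein_len: int) -> float:
    """Percent of a protein covered by an alignment of the given length."""
    return 100.0 * alignment_len / protein_len


# Placeholder data; "abcD" and "xyzQ" are not real GenProtEC names.
entries = [{"gene": "abcD", "synonyms": ["xyzQ"], "product": "example product"}]
index = build_name_index(entries)

print(lookup(index, "xyzQ")["gene"])  # abcD
print(percent_aligned(150, 300))      # 50.0
```

The percent aligned is computed separately for each protein of a pair, since the same alignment can cover different fractions of two proteins of different length.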
The database can be queried directly on the World Wide Web at the URL http://www.mbl.edu/html/ecoli.html. Feedback and corrections will be gratefully received. Users are kindly requested to cite this article.
ACKNOWLEDGEMENTS
Grateful thanks to David Space and David Remsen, Information Services Division, Marine Biological Laboratory, for invaluable programming and site design.
REFERENCES
Last modification: 16 Dec 1997
Copyright© Oxford University Press, 1998.