
© 1997 Oxford University Press 51-52


Genes and proteins of Escherichia coli K-12 (GenProtEC)

Monica Riley

Marine Biological Laboratory, Woods Hole, MA 02540, USA

Received October 16, 1996; Accepted October 22, 1996

ABSTRACT

GenProtEC is a database of Escherichia coli genes and their gene products, classified by type of function and physiological role and with citations to the literature for each. Also present are data on sequence similarities among E.coli proteins with PAM values, percent identity of amino acids, length of alignment and percent aligned. GenProtEC can also be accessed through the World Wide Web at URL http://mbl.edu/html/ecoli.html.

GenProtEC (genes and proteins of Escherichia coli) is a database available on the World Wide Web that centers on the products of E.coli K-12 chromosomal genes. As of October 1996, the database contained 2193 gene products whose physiological function was known to some degree, some better understood than others. As additional genes are sequenced and additional biological characterization is reported, additions and corrections to the database will continue to be made.

GenProtEC contains the gene name, its synonyms, the SwissProt (1) mnemonic for proteins when one has been assigned, its synonyms, the full gene product name, and the Enzyme Commission (EC) number for enzymatic reactions. Up to three literature references are supplied for each entry.

Data on physiological function and sequence similarity are also given. The 2193 gene products have been classified by type, as either an enzyme, a regulator, RNA, part of the membrane, a member of the transport system, a protein factor, a carrier, or a part of the structure of the cell other than membrane. Each gene product has been assigned to at least one and up to four of 118 hierarchically arranged categories of physiological function. Both an early version of the classification of genes and gene products by function (2) and a more recent version (3) are further refined in the more up-to-date electronic database.
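The fields and limits described above can be pictured as a single record type. The following is only a minimal sketch: the class name, field names and short type labels are assumptions made for illustration, not the actual GenProtEC schema, and the entry shown is a made-up placeholder.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Short labels for the eight product types named in the text.
# The labels themselves are illustrative, not GenProtEC's own codes.
PRODUCT_TYPES = {
    "enzyme", "regulator", "RNA", "membrane",
    "transport", "factor", "carrier", "structure",
}


@dataclass
class GeneProductEntry:
    """Hypothetical record mirroring the fields listed in the text."""
    gene: str
    product_name: str
    product_type: str
    categories: List[str]                       # one to four functional categories
    gene_synonyms: List[str] = field(default_factory=list)
    swissprot: Optional[str] = None             # mnemonic, when one is assigned
    ec_number: Optional[str] = None             # enzymes only
    references: List[str] = field(default_factory=list)  # up to three

    def __post_init__(self):
        # Enforce the limits stated in the text.
        if self.product_type not in PRODUCT_TYPES:
            raise ValueError(f"unknown product type: {self.product_type}")
        if not 1 <= len(self.categories) <= 4:
            raise ValueError("an entry needs one to four categories")
        if len(self.references) > 3:
            raise ValueError("at most three references per entry")


# Placeholder entry; the gene and category names are invented.
example = GeneProductEntry(
    gene="abcD",
    product_name="hypothetical enzyme",
    product_type="enzyme",
    categories=["placeholder category"],
)
```

Putting the checks in `__post_init__` means that an entry violating the stated limits cannot be constructed at all.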

In addition, the sequence similarity of each protein to any other E.coli protein is given, permitting the grouping together of E.coli proteins of similar amino acid sequence. The database contains the results of similarity analyses (4,5) that used the AllAllDB of the Darwin suite at Zurich (6), requiring an alignment of at least 100 amino acids and a PAM score (accepted point mutations) (7) of <200. Altogether, 1347 E.coli K-12 chromosomally encoded proteins had at least one E.coli protein partner with sequence similarity as defined above. Proteins with more than one domain of >100 amino acids were divided and each domain was treated separately. The resulting 1430 proteins/domains formed 3685 sequence-related pairs. The pairs were linked by chains of similarities into sequence-related groups. As of October 1996 there are 350 sequence-related groups of E.coli proteins, ranging in size from two to 63 members, and most or all members of each group are related by function as well as by sequence.
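Linking pairs by chains of similarity amounts to finding connected components in a graph whose edges are the qualifying pairs. The sketch below uses a union-find structure to do this. It is an assumption about how such grouping could be done, not the procedure used to build GenProtEC; the function name, tuple layout and protein names are hypothetical, and only the two thresholds come from the text.

```python
from collections import defaultdict

MIN_ALIGNMENT = 100  # minimum alignment length in amino acids (from the text)
MAX_PAM = 200        # a pair qualifies only if its PAM score is below this (from the text)


def group_by_similarity(pairs):
    """Group proteins linked by chains of qualifying pairs.

    pairs: iterable of (name_a, name_b, alignment_length, pam) tuples.
    Returns a list of sorted name lists, largest group first.
    """
    parent = {}

    def find(x):
        # Return the representative of x's group, shortening the path as we go.
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, length, pam in pairs:
        if length >= MIN_ALIGNMENT and pam < MAX_PAM:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b  # merge the two groups

    groups = defaultdict(set)
    for name in parent:
        groups[find(name)].add(name)
    return sorted((sorted(g) for g in groups.values()),
                  key=lambda g: (-len(g), g))


# Made-up pairs for illustration only.
pairs = [
    ("P1", "P2", 150, 120),
    ("P2", "P3", 210, 180),  # links P3 to P1 through P2
    ("P4", "P5", 120, 90),
    ("P6", "P7", 80, 50),    # alignment too short
    ("P8", "P9", 300, 240),  # PAM too high
]
print(group_by_similarity(pairs))
```

Note that P1 and P3 land in the same group without being directly paired, which is why a group can contain members whose pairwise similarity falls outside the thresholds.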

One can query GenProtEC with a gene name or a synonym, with a SwissProt name or a synonym, with a string describing the gene product, or with a key for physiological category. Complete pick lists are available for each of these. The search can be refined by adding more terms in the logical relationships AND, OR, AND/OR. Information on the gene product and the function of the gene product is returned, as well as sequence similarities among E.coli proteins. For any protein that has at least one sequence-related partner, the name(s) of all other E.coli proteins in the related group are returned. For any sequence-related pair, the position and length of the alignment for each of the two proteins are given, as well as the percent of the protein aligned, the percent identical amino acids and the PAM score. As new E.coli protein sequences appear in the SwissProt database (1), information on their sequence relationships within the E.coli K-12 chromosome will be incorporated into GenProtEC.
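Combining search terms with AND or OR, and reporting the percent of a protein covered by an alignment, can be sketched as follows. This is not the GenProtEC query interface: the field names, the case-insensitive substring matching and the entries are all assumptions for illustration, and percent aligned is taken here to mean alignment length divided by protein length.

```python
def matches(entry, terms, mode="AND"):
    """Test an entry against search terms.

    entry: dict of field name -> text.
    terms: dict of field name -> search string (case-insensitive substring).
    mode:  "AND" requires every term to match, "OR" requires at least one.
    """
    hits = [
        value.lower() in str(entry.get(name, "")).lower()
        for name, value in terms.items()
    ]
    return all(hits) if mode == "AND" else any(hits)


def search(entries, terms, mode="AND"):
    """Return the gene names of all entries satisfying the query."""
    return [e["gene"] for e in entries if matches(e, terms, mode)]


def percent_aligned(alignment_length, protein_length):
    """Percent of a protein covered by the alignment (assumed definition)."""
    return 100.0 * alignment_length / protein_length


# Invented entries for illustration only.
entries = [
    {"gene": "abcD", "product": "hypothetical sugar kinase",
     "category": "carbon metabolism"},
    {"gene": "efgH", "product": "hypothetical sugar transporter",
     "category": "transport"},
]

query = {"product": "sugar", "category": "transport"}
print(search(entries, query, "AND"))  # only the entry matching both terms
print(search(entries, query, "OR"))   # any entry matching either term
print(percent_aligned(150, 300))
```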

The coupling of sequence similarity and similarity of function may continue to be useful as a guide to interpreting the physiological role of protein sequences from other organisms whose biology is less well known than that of E.coli (e.g. ref. 8). In using the information for E.coli to determine what functions another organism possesses, it is important to keep in mind that protein sequence and function are not always correlated. Among 103 pairs or triplets of E.coli enzymes that catalyse the same biochemical reactions, 60% are similar in amino acid sequence, as one might expect (PAM value <250), but the other 40% have little or no relationship of sequence (PAM value >250) (3,4). Therefore, the absence of one class of amino acid sequence from an organism does not tell whether the corresponding function is absent or whether another protein of unrelated sequence is present that might carry out the function in question.

The database can be queried directly on the World Wide Web through the URL http://mbl.edu/html/ecoli.html. Feedback and corrections will be gratefully received, and it will soon be possible to enter comments directly at the Web site. Users are kindly asked to cite this article.

ACKNOWLEDGEMENTS

Grateful thanks to David Space and David Remsen, Information Services Division, Marine Biological Laboratory, for invaluable programming and site design.

REFERENCES

1 Bairoch,A. and Boeckmann,B. (1993) Nucleic Acids Res., 21, 3093-3096.

2 Riley,M. (1993) Microbiol. Rev., 57, 862-952.

3 Riley,M. and Labedan,B. (1996) In Curtiss,R.III, Lin,E.C.C., Ingraham,J., Low,K.B., Magasanik,B., Neidhardt,F., Reznikoff,W., Riley,M., Schaechter,M. and Umbarger,H.E., (eds), Escherichia coli and Salmonella. American Society for Microbiology, Washington, D.C., pp. 2118-2202.

4 Labedan,B. and Riley,M. (1995) Mol. Biol. Evol., 12, 980-987.

5 Labedan,B. and Riley,M. (1996) manuscript in preparation.

6 Gonnet,G.H., Cohen,M.A. and Benner,S.A. (1992) Science, 256, 1443-1445.

7 Dayhoff,M.O., Schwartz,R.M. and Orcutt,B.C. (1978) In Dayhoff,M.O. (ed.), Atlas of Protein Sequence and Structure. National Biomedical Research Foundation, Washington, D.C., Vol. 5, suppl. 3, pp. 345-358.

8 Fleischmann,R.D., Adams,M.D., White,O., Clayton,R.A., Kirkness,E.F., Kerlavage,A.R., Bult,C.J., Tomb,J.F., Dougherty,B.A., Merrick,J.M., et al. (1995) Science, 269, 496-512.



Tel: +1 508 289 7612; Fax: +1 508 540 6902; Email: mriley@mbl.edu



