Nucleic Acids Research, 2000, Vol. 28, No. 1 33-36
© 2000 Oxford University Press
The COG database: a tool for genome-scale analysis of protein functions and evolution
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, Bethesda, MD 20894, USA
Received September 3, 1999; Revised September 25, 1999; Accepted October 4, 1999.
| ABSTRACT |
|---|
Rational classification of proteins encoded in sequenced genomes is critical for making the genome sequences maximally useful for functional and evolutionary studies. The database of Clusters of Orthologous Groups of proteins (COGs) is an attempt at a phylogenetic classification of the proteins encoded in 21 complete genomes of bacteria, archaea and eukaryotes (http://www.ncbi.nlm.nih.gov/COG). The COGs were constructed by applying the criterion of consistency of genome-specific best hits to the results of an exhaustive comparison of all protein sequences from these genomes. The database comprises 2091 COGs that include 56-83% of the gene products from each of the complete bacterial and archaeal genomes and ~35% of those from the yeast Saccharomyces cerevisiae genome. The COG database is accompanied by the COGNITOR program that is used to fit new proteins into the COGs and can be applied to functional and phylogenetic annotation of newly sequenced genomes.
| INTRODUCTION |
|---|
The recent progress in genome sequencing has led to a rapid enrichment of protein databases with an unprecedented variety of deduced protein sequences, most of them without a documented functional role. Computational biology strives to extract the maximal possible information from these sequences by classifying them according to their homologous relationships, predicting their likely biochemical activities and/or cellular functions, three-dimensional structures and evolutionary origin. This challenge is daunting, given that even in Escherichia coli, arguably the best-studied organism (1), only ~40% of the gene products have been characterized experimentally (2). On the other hand, computer analysis of complete microbial genomes has shown that prokaryotic proteins are in general highly conserved, with ~70% of them containing ancient conserved regions (ACRs) (3). This allows one to transfer functional information from experimentally characterized proteins to their homologs from poorly studied organisms. For such functional predictions to be reliable, it is critical to infer orthologous relationships between genes from different species. Orthologs are direct evolutionary counterparts related by vertical descent, as opposed to paralogs, which are genes within the same genome related by duplication (4,5). Typically, orthologous proteins have the same domain architecture and the same function, although there are significant exceptions and complications to this generalization, particularly among multicellular eukaryotes (6).
The Clusters of Orthologous Groups of proteins (COGs) database has been designed as an attempt to classify proteins from completely sequenced genomes on the basis of the orthology concept (7). The COGs reflect one-to-many and many-to-many orthologous relationships as well as simple one-to-one relationships (hence Orthologous Groups of proteins). The original set included the proteins from five bacterial, one archaeal and one eukaryotic genomes and consisted of 720 COGs; subsequently, a sixth bacterial genome was added, with the number of COGs increasing to 860 (8). Here we report the current status of the COG database which now consists of 2091 COGs and includes proteins from 21 complete genomes.
| CONSTRUCTION OF THE COGs |
|---|
COGs have been identified on the basis of an all-against-all sequence comparison of the proteins encoded in complete genomes using the gapped BLAST program (9) after masking low-complexity and predicted coiled-coil regions (7). The COG construction procedure is based on the simple notion that any group of at least three proteins from distant genomes that are more similar to each other than they are to any other proteins from the same genomes is most likely to form an orthologous family. This prediction holds even if the absolute level of sequence similarity between the proteins in question is relatively low, and thus the COG approach accommodates both slow-evolving and fast-evolving genes. Briefly, COG construction includes the following steps.
1. Perform the all-against-all protein sequence comparison.
2. Detect and collapse obvious paralogs, that is, proteins from the same genome that are more similar to each other than to any proteins from other species.
3. Detect triangles of mutually consistent, genome-specific best hits (BeTs), taking into account the paralogous groups detected at step 2.
4. Merge triangles with a common side to form COGs.
5. A case-by-case analysis of each COG. This analysis serves to eliminate false-positives and to identify groups that contain multidomain proteins by examining the pictorial representation of the BLAST search outputs. The sequences of detected multidomain proteins are split into single-domain segments and steps 1-4 are repeated with these sequences, which results in the assignment of individual domains to COGs in accordance with their distinct evolutionary affinities.
6. Examination of large COGs that include multiple members from all or several of the genomes using phylogenetic trees, cluster analysis and visual inspection of alignments; as a result, some of these groups are split into two or more smaller ones that are included in the final set of COGs.
By the design of this procedure, a minimal COG includes three genes from distinct phylogenetic lineages (protein sets from closely related species, such as Mycoplasma genitalium and Mycoplasma pneumoniae, were merged prior to COG construction). The approach used for the construction of COGs does not supplant a comprehensive phylogenetic analysis. Nevertheless, it provides a fast and convenient short-cut to delineate a large number of families that most likely consist of orthologs.
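Steps 3 and 4 of the procedure above can be illustrated with a small Python sketch. It is not the authors' implementation. The function name `build_cogs`, the input layout and the toy protein names are hypothetical. For simplicity it treats two proteins as linked only when each is the other's best hit in the partner's genome, which may be stricter than the criterion actually used, and it omits paralog collapsing (step 2) and the manual curation of steps 5 and 6.

```python
from itertools import combinations, permutations


def build_cogs(genome_of, best_hit):
    """Group proteins into clusters from triangles of mutual best hits.

    genome_of: protein -> genome label.
    best_hit:  (protein, target genome) -> best-hit protein in that genome.
    Assumption: a side of a triangle requires a reciprocal best hit.
    """
    def linked(a, b):
        return (best_hit.get((a, genome_of[b])) == b
                and best_hit.get((b, genome_of[a])) == a)

    # Step 3: triangles of mutually consistent best hits from three genomes.
    triangles = []
    for a, b, c in combinations(sorted(genome_of), 3):
        if len({genome_of[a], genome_of[b], genome_of[c]}) < 3:
            continue
        if linked(a, b) and linked(b, c) and linked(a, c):
            triangles.append(frozenset((a, b, c)))

    # Step 4: merge triangles that share a side (two proteins), via union-find.
    parent = list(range(len(triangles)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(triangles)), 2):
        if len(triangles[i] & triangles[j]) >= 2:
            parent[find(i)] = find(j)

    clusters = {}
    for i, tri in enumerate(triangles):
        clusters.setdefault(find(i), set()).update(tri)
    return sorted(sorted(members) for members in clusters.values())


# Toy data: a1, b1, c1, d1 are mutual best hits across four genomes;
# a2 and b2 are reciprocal best hits but span only two genomes.
genome_of = {"a1": "A", "b1": "B", "c1": "C", "d1": "D", "a2": "A", "b2": "B"}
best_hit = {(p, genome_of[q]): q
            for p, q in permutations(["a1", "b1", "c1", "d1"], 2)}
best_hit[("a2", "B")] = "b2"
best_hit[("b2", "A")] = "a2"

print(build_cogs(genome_of, best_hit))
```

In this toy case the four triangles among a1, b1, c1 and d1 all share sides and merge into one cluster, whereas the a2/b2 pair forms no triangle and is left out, in line with the three-lineage minimum. The brute-force enumeration is only practical for small inputs.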
Once the COGs have been identified using the above procedure, new members can be added using the COGNITOR program that is based on the same idea of the consistency between genome-specific best hits. If a protein sequence, when compared to the COG database, gives two or more best hits into the given COG, the protein in question is a candidate member of the COG.
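The best-hit rule just described can be sketched in a few lines of Python. This is an illustration only and does not reproduce the COGNITOR program. The function name `candidate_cogs`, the dictionary layout and the COG identifiers are hypothetical, and the cut-off is exposed as a parameter (`min_bets`) so that stricter settings can be tried.

```python
from collections import Counter


def candidate_cogs(bets, cog_of, min_bets=2):
    """Return COGs that receive at least `min_bets` genome-specific best hits.

    bets:   genome -> best-hit protein of the query in that genome.
    cog_of: protein -> COG identifier (proteins outside any COG are absent).
    """
    counts = Counter(cog_of[hit] for hit in bets.values() if hit in cog_of)
    return sorted(cog for cog, n in counts.items() if n >= min_bets)


# Hypothetical example: two best hits fall into one COG, one into another.
cog_of = {"b1": "COG_X", "c1": "COG_X", "d1": "COG_Y"}
bets = {"B": "b1", "C": "c1", "D": "d1"}

print(candidate_cogs(bets, cog_of))              # candidates at a cut-off of two
print(candidate_cogs(bets, cog_of, min_bets=3))  # stricter cut-off
```

Returning every COG that passes the cut-off, rather than a single winner, leaves room for multidomain proteins whose domains belong to different COGs. Candidates would still need the case-by-case examination described in the text.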
To create the current set of COGs, the COGNITOR program was used to fit the protein sets from 12 complete bacterial and archaeal genomes into the 860 previously delineated COGs. The candidate COG members identified using the two-best-hit approach were further evaluated by a case-by-case examination of sequence alignments to verify the significance of the relationships and the conservation of salient features of the proteins in the COGs, such as domain architecture and active centers of enzymes. Proteins from the 12 new genomes that could not be included in the pre-existing COGs were analyzed using the original procedure for COG construction. The newly formed COGs were combined with the pre-existing ones to form the updated COG collection.
| STATISTICS OF THE COG DATABASE |
|---|
Of the 2091 COGs, 1252 (~60%) are simple families, with no paralogs or with paralogs from one lineage only. These are unlikely to undergo modifications as a result of further analysis and/or accumulation of new genomic data (but new genomes will most likely add to these simple COGs) and in most, if not all, cases allow a straightforward transfer of functional information from functionally characterized genes from model systems, such as E. coli and yeast, to those from poorly characterized genomes. The remaining COGs contain paralogs from more than one species and, accordingly, may include evolutionarily and functionally distinct subgroups. Some of these subgroups may become separate COGs with further accumulation of genomic data. Furthermore, for some of the largest groups included in the COG set, such as families of DNA and RNA helicases or SAM-dependent methyltransferases, establishing true orthologous relationships is extremely difficult. These COGs include experimentally characterized proteins that have similar biochemical activities (e.g., methyltransferase) but different functions (e.g., transfer of methyl groups to different substrates). Thus only very general functional predictions are possible for poorly characterized members of such COGs.
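The distinction between simple and paralog-containing COGs can be expressed as a short check. The sketch below is only an illustration of the definition given above. The function name `is_simple_cog` and the input layout are hypothetical, and it assumes that any lineage contributing more than one member to a COG counts as having paralogs.

```python
from collections import Counter


def is_simple_cog(members, lineage_of):
    """True if at most one lineage contributes more than one protein.

    members:    proteins in the COG.
    lineage_of: protein -> lineage label (closely related species merged).
    """
    per_lineage = Counter(lineage_of[p] for p in members)
    lineages_with_paralogs = sum(1 for n in per_lineage.values() if n > 1)
    return lineages_with_paralogs <= 1


# Hypothetical proteins: lineage A has two members, B and C have one each.
lineage_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}

print(is_simple_cog(["a1", "a2", "b1", "c1"], lineage_of))
print(is_simple_cog(["a1", "a2", "b1", "b2", "c1"], lineage_of))
```

The first group has paralogs in one lineage only and counts as simple. The second has paralogs in two lineages and would fall among the COGs that call for more cautious functional transfer.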
The fraction of the proteins that belong to the COGs and thus represent ancient families conserved across a wide phylogenetic range is between 56 and 83% for the bacterial and archaeal genomes, with an average of 67% (Fig. 1). Notably, this value is close to 70%, the previous estimate of the fraction of proteins encoded in each genome that contain ancient conserved regions (3). Aquifex aeolicus, which has the smallest genome among the sequenced free-living prokaryotes, is most completely represented in the COGs, which may reflect the preferential use of highly conserved proteins for house-keeping functions, whereas specialized parasitic bacteria, such as Mycobacterium or Borrelia, are relatively poorly represented (Fig. 1). The fraction of the yeast proteins currently included in the COGs is much lower than for any of the prokaryotes (Fig. 1), indicating the prevalence of eukaryote-specific families.
The COGs were classified into 17 functional categories that loosely follow those introduced by Riley (10) and also include a class for which only a general functional prediction (e.g., that of biochemical activity) was feasible as well as a class of uncharacterized COGs. A significant majority of the COGs could be assigned to one of the well-defined functional categories but the measure of our ignorance is apparent from the fact that the single largest category is the functionally uncharacterized COGs (Fig. 2).
In the original COG analysis, we introduced the notion of a phylogenetic pattern, i.e., the pattern of species that are represented or not represented in a given COG; alternatively, phylogenetic patterns can be described in terms of the sets of COGs that are represented in a given range of species. A broad diversity of phylogenetic patterns has become immediately apparent. This conclusion was reinforced by the analysis of the new data set, which includes only a small fraction of universal COGs, whereas COGs represented only in three or four species are most abundant (Fig. 3). This patchy distribution of phylogenetic patterns is likely to reflect the major role of horizontal gene transfer and lineage-specific gene loss in the evolution of prokaryotes (11) as well as rapid evolution of certain genes in specific lineages, which is probably linked to functional changes.
| APPLICATIONS OF THE COGs |
|---|
The most straightforward application of the COGs is for the prediction of functions of individual proteins or protein sets, including those from newly completed genomes. This is done by fitting proteins into the COGs using the COGNITOR program. Because the likelihood of two BeTs for a given protein falling into the same COG by chance increases with the number of genomes included in the COGs, the current cut-off for assigning proteins to COGs is set at three BeTs. The user can increase the stringency of the analysis by resetting the cut-off at a greater number of BeTs. The requirement of multiple BeTs for a protein to be assigned to a COG serves, to some extent, as a safeguard against the propagation of errors that might be present in the COG database. Indeed, if a COG contains one or even two false-positives, this will not result in a false assignment by COGNITOR under the three-BeT cut-off. It should be noted that the interpretation of COGNITOR results for COGs containing paralogs (see above) requires caution to avoid overly specific functional predictions.
The COGs also provide opportunities for more sophisticated queries. In particular, it is possible to systematically identify those conserved families (COGs) that are missing in a given genome. This information can be utilized either to detect the respective genes that might have been missed during genome annotation or to search for an alternative cognate of the given function among the gene products. The COG WWW site (see below) offers automatic means to isolate all COGs with a particular phylogenetic pattern, for example those that are found only in pathogenic bacteria. This effectively provides the functionality of differential genome display (12) and can be helpful for delineating sets of candidate proteins for a particular range of functional features, e.g., virulence or hyperthermophily. More generally, the COG system is a convenient platform for a variety of evolutionary-oriented analyses of protein families.
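The two kinds of query mentioned here, COGs missing from a genome and COGs matching a phylogenetic pattern, reduce to simple set operations once each COG is represented by the set of genomes it covers. The sketch below is a hypothetical illustration, not the interface of the COG WWW site. All function names, COG identifiers and genome labels are invented for the example.

```python
def cog_genomes(cogs, genome_of):
    """Map each COG to the set of genomes represented among its members."""
    return {cog: frozenset(genome_of[p] for p in members)
            for cog, members in cogs.items()}


def missing_in(genome, patterns):
    """COGs that have no member from the given genome."""
    return sorted(cog for cog, gs in patterns.items() if genome not in gs)


def with_pattern(patterns, required, forbidden=frozenset()):
    """COGs present in all `required` genomes and in none of the `forbidden`."""
    return sorted(cog for cog, gs in patterns.items()
                  if set(required) <= gs and not (set(forbidden) & gs))


# Hypothetical data: three COGs over four genomes A-D.
genome_of = {"a1": "A", "b1": "B", "c1": "C",
             "b2": "B", "c2": "C", "d2": "D",
             "a3": "A", "c3": "C", "d3": "D"}
cogs = {"COG_X": ["a1", "b1", "c1"],
        "COG_Y": ["b2", "c2", "d2"],
        "COG_Z": ["a3", "c3", "d3"]}

patterns = cog_genomes(cogs, genome_of)
print(missing_in("A", patterns))
print(with_pattern(patterns, required={"A"}, forbidden={"B"}))
```

A query such as "found only in pathogenic bacteria" would correspond to placing all non-pathogen genomes in `forbidden`. A COG reported by `missing_in` is a starting point for checking whether the gene was overlooked during annotation or whether the function is carried out by an unrelated protein.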
| THE COG WWW SITE, DATA PRESENTATION AND AVAILABILITY |
|---|
The COG WWW site (http://www.ncbi.nlm.nih.gov/COG) contains the following principal types of data: (i) a list of all COGs organized by (predicted) functional category and hyperlinked to (ii) individual COG pages. Each of the COG pages shows the respective phylogenetic pattern and is hyperlinked to: 1) pictorial representations of BLAST search outputs for each member of the COG, which also include links to the respective GenBank and Entrez-Genomes entries, 2) a multiple alignment of the COG members produced automatically using the ClustalW program (13), and 3) a cluster dendrogram generated using the BLAST scores as the measure of similarity between proteins; (iii) the COGNITOR page, where a protein sequence can be pasted, searched against the database of proteins from complete genomes and assigned to a COG as described above; (iv) a phylogenetic pattern search tool; (v) a matrix of co-occurrence of genomes in COGs. The COG data set and the COGNITOR program are also available by anonymous ftp at ftp://ncbi.nlm.nih.gov/pub/COG
| ACKNOWLEDGEMENTS |
|---|
We are grateful to David Lipman for his critical contribution at the initial stage of the COG project and constant support and inspiration, and to Nick Grishin, Jim Ostell, Tatiana Tatusov and Yuri Wolf for helpful discussions.
| FOOTNOTES |
|---|
* To whom correspondence should be addressed. Tel: +1 301 435 5913; Fax: +1 301 480 9241; Email: koonin@ncbi.nlm.nih.gov
| REFERENCES |
|---|
1 Neidhardt,F.C., Curtiss,R.,III, Ingraham,J.L., Lin,E.C.C., Low,K.B., Magasanik,B., Reznikoff,W.S., Riley,M., Schaechter,M. and Umbarger,H.E. (eds) (1996) Escherichia coli and Salmonella. Cellular and Molecular Biology, 2nd Edn. ASM Press, Washington, DC.
2 Koonin,E.V. (1997) Curr. Biol., 7, R656-R659.
3 Koonin,E.V., Mushegian,A.R., Galperin,M.Y. and Walker,D.R. (1997) Mol. Microbiol., 25, 619-637.
4 Fitch,W.M. (1970) System. Zool., 19, 99-106.
5 Fitch,W.M. (1995) Phil. Trans. R. Soc. Lond. B Biol. Sci., 349, 93-102.
6 Henikoff,S., Greene,E.A., Pietrokovski,S., Bork,P., Attwood,T.K. and Hood,L. (1997) Science, 278, 609-614.
7 Tatusov,R.L., Koonin,E.V. and Lipman,D.J. (1997) Science, 278, 631-637.
8 Koonin,E.V., Tatusov,R.L. and Galperin,M.Y. (1998) Curr. Opin. Struct. Biol., 8, 355-363.
9 Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Nucleic Acids Res., 25, 3389-3402.
10 Riley,M. (1993) Microbiol. Rev., 57, 862-952.
11 Doolittle,W.F. (1999) Science, 284, 2124-2129.
12 Huynen,M.A., Diaz-Lazcoz,Y. and Bork,P. (1997) Trends Genet., 13, 389-390.
13 Thompson,J.D., Higgins,D.G. and Gibson,T.J. (1994) Nucleic Acids Res., 22, 4673-4680.
||||
![]() |
X. Du, M. R.K. S. Rao, X. Q. Chen, W. Wu, S. Mahalingam, and D. Balasundaram The Homologous Putative GTPases Grn1p from Fission Yeast and the Human GNL3L Are Required for Growth and Play a Role in Processing of Nucleolar Pre-rRNA Mol. Biol. Cell, January 1, 2006; 17(1): 460 - 474. [Abstract] [Full Text] [PDF] |
||||
![]() |
N. Reynolds, B. Collier, K. Maratou, V. Bingham, R. M. Speed, M. Taggart, C. A. Semple, N. K. Gray, and H. J. Cooke Dazl binds in vivo to specific transcripts and can regulate the pre-meiotic translation of Mvh in germ cells Hum. Mol. Genet., December 15, 2005; 14(24): 3899 - 3909. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. R. Mey, E. E. Wyckoff, V. Kanukurthy, C. R. Fisher, and S. M. Payne Iron and Fur Regulation in Vibrio cholerae and the Role of Fur in Virulence Infect. Immun., December 1, 2005; 73(12): 8167 - 8178. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. V. Alsaker and E. T. Papoutsakis Transcriptional Program of Early Sporulation and Stationary-Phase Events in Clostridium acetobutylicum J. Bacteriol., October 15, 2005; 187(20): 7103 - 7118. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. S. Makarova, Y. I. Wolf, S. L. Mekhedov, B. G. Mirkin, and E. V. Koonin Ancestral paralogs and pseudoparalogs and their role in the emergence of the eukaryotic cell Nucleic Acids Res., August 16, 2005; 33(14): 4626 - 4638. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Tauch, O. Kaiser, T. Hain, A. Goesmann, B. Weisshaar, A. Albersmeier, T. Bekel, N. Bischoff, I. Brune, T. Chakraborty, et al. Complete Genome Sequence and Analysis of the Multiresistant Nosocomial Pathogen Corynebacterium jeikeium K411, a Lipid-Requiring Bacterium of the Human Skin Flora J. Bacteriol., July 1, 2005; 187(13): 4671 - 4682. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. Jaroszewski, L. Rychlewski, Z. Li, W. Li, and A. Godzik FFAS03: a server for profile-profile sequence alignments Nucleic Acids Res., July 1, 2005; 33(suppl_2): W284 - W288. [Abstract] [Full Text] [PDF] |
||||
![]() |
W. H. Majoros, M. Pertea, and S. L. Salzberg Efficient implementation of a generalized pair hidden Markov model for comparative gene finding Bioinformatics, May 1, 2005; 21(9): 1782 - 1788. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. Horan, J. Lauricha, J. Bailey-Serres, N. Raikhel, and T. Girke Genome Cluster Database. A Sequence Family Analysis Platform for Arabidopsis and Rice Plant Physiology, May 1, 2005; 138(1): 47 - 54. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Huang and E. K. O'Shea A Systematic High-Throughput Screen of a Yeast Deletion Collection for Mutants Defective in PHO5 Regulation Genetics, April 1, 2005; 169(4): 1859 - 1871. [Abstract] [Full Text] [PDF] |
||||
![]() |
S.-V. Albers and A. J. M. Driessen Analysis of ATPases of putative secretion operons in the thermoacidophilic archaeon Sulfolobus solfataricus Microbiology, March 1, 2005; 151(3): 763 - 773. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Chen and D. Xu Understanding protein dispensability through machine-learning analysis of high-throughput data Bioinformatics, March 1, 2005; 21(5): 575 - 581. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Stothard and D. S. Wishart Circular genome visualization and exploration using CGView Bioinformatics, February 15, 2005; 21(4): 537 - 539. [Abstract] [Full Text] [PDF] |
||||
![]() |
G. Huang, Q. Fan, S. Lechno-Yossef, E. Wojciuch, C. P. Wolk, T. Kaneko, and S. Tabata Clustered Genes Required for the Synthesis of Heterocyst Envelope Polysaccharide in Anabaena sp. Strain PCC 7120 J. Bacteriol., February 1, 2005; 187(3): 1114 - 1123. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Montsant, K. Jabbari, U. Maheswari, and C. Bowler Comparative Genomics of the Pennate Diatom Phaeodactylum tricornutum Plant Physiology, February 1, 2005; 137(2): 500 - 513. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. X. Tran, M. J. Karbarz, X. Wang, C. R. H. Raetz, S. C. McGrath, R. J. Cotter, and M. S. Trent Periplasmic Cleavage and Modification of the 1-Phosphate Group of Helicobacter pylori Lipid A J. Biol. Chem., December 31, 2004; 279(53): 55780 - 55791. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Nanjo, N. Futamura, M. Nishiguchi, T. Igasaki, K. Shinozaki, and K. Shinohara Characterization of Full-length Enriched Expressed Sequence Tags of Stress-treated Poplar Leaves Plant Cell Physiol., December 15, 2004; 45(12): 1738 - 1748. [Abstract] [Full Text] [PDF] |
||||
![]() |
E. Zientz, T. Dandekar, and R. Gross Metabolic Interdependence of Obligate Intracellular Bacteria and Their Insect Hosts Microbiol. Mol. Biol. Rev., December 1, 2004; 68(4): 745 - 770. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Kang, D. Tavakoli, A. Tschumi, R. A. Aras, and M. J. Blaser Effect of Host Species on RecG Phenotypes in Helicobacter pylori and Escherichia coli J. Bacteriol., November 15, 2004; 186(22): 7704 - 7713. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. A. Rodionov, M. S. Gelfand, and N. Hugouvieux-Cotte-Pattat Comparative genomics of the KdgR regulon in Erwinia chrysanthemi 3937 and other gamma-proteobacteria Microbiology, November 1, 2004; 150(11): 3571 - 3590. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. D. Jaffe, N. Stange-Thomann, C. Smith, D. DeCaprio, S. Fisher, J. Butler, S. Calvo, T. Elkins, M. G. FitzGerald, N. Hafez, et al. The Complete Genome and Proteome of Mycoplasma mobile Genome Res., August 1, 2004; 14(8): 1447 - 1461. [Abstract] [Full Text] [PDF] |
||||