Nucleic Acids Research, 2003, Vol. 31, No. 1 258-261
© 2003 Oxford University Press
STRING: a database of predicted functional associations between proteins
1 European Molecular Biology Laboratory, Meyerhofstrasse 1, 69117 Heidelberg, Germany
2 Max-Delbrück-Centre for Molecular Medicine, Robert-Rössle-Strasse 10, 13092 Berlin, Germany
3 Nijmegen Centre for Molecular Life Sciences p/a Centre of Molecular and Biomolecular Informatics, Toernooiveld 1, 6525 ED Nijmegen, The Netherlands
*To whom correspondence should be addressed. Email: bork{at}embl-heidelberg.de
Received August 14, 2002; Accepted September 11, 2002
ABSTRACT
Functional links between proteins can often be inferred from genomic associations between the genes that encode them: groups of genes that are required for the same function tend to show similar species coverage, are often located in close proximity on the genome (in prokaryotes), and tend to be involved in gene-fusion events. The database STRING is a precomputed global resource for the exploration and analysis of these associations. Since the three types of evidence differ conceptually, and the number of predicted interactions is very large, it is essential to be able to assess and compare the significance of individual predictions. Thus, STRING contains a unique scoring-framework based on benchmarks of the different types of associations against a common reference set, integrated in a single confidence score per prediction. The graphical representation of the network of inferred, weighted protein interactions provides a high-level view of functional linkage, facilitating the analysis of modularity in biological processes. STRING is updated continuously, and currently contains 261 033 orthologs in 89 fully sequenced genomes. The database predicts functional interactions at an expected level of accuracy of at least 80% for more than half of the genes; it is online at http://www.bork.embl-heidelberg.de/STRING/.
INTRODUCTION
Protein-protein interactions are not limited to direct physical binding. Proteins may also interact indirectly: by sharing a substrate in a metabolic pathway, by regulating each other transcriptionally, or by participating in larger multi-protein assemblies. For predicting such functional associations (including direct binding), the current growth in completed genomes offers unique opportunities through so-called genomic context or nonhomology-based inference methods (1–3).
These methods are based on the fact that functionally associated proteins are encoded by genes that share similar selection pressures: the genes need to be maintained together, and regulated together, such that the encoded proteins can interact at the same time and place in the cell. This leaves signals in the genome, which become detectable above the noise of random genomic events when analyzing multiple species. For example, the need for maintaining functionally associated genes together can become visible as an agreement in occurrence-patterns across several genomes (4,5): the genes tend to be either present together, or absent together; that is, they have the same phylogenetic profile. This is particularly informative when the profile is not in agreement with organismal phylogeny, as is the case when horizontal transfers or gene losses are involved (6,7). Likewise, the need for similar regulation is often reflected in a tendency of functionally associated genes to be close neighbors in prokaryotic genomes (8,9), where they generally have the same transcriptional orientation and little or no sequence between them. This suggests that they are single transcription units (operons), recurring in similar but not identical composition across several genomes (10). Finally, genes whose protein products need to interact closely in the cell have a noticeable tendency to be fused into a single gene, encoding a combined polypeptide (11,12) in which the proteins have a higher chance of interacting productively.
Optimal, user-friendly exploitation of genomic context for the prediction of functional interactions requires: (i) a benchmarked scoring scheme that integrates the three types of context and gives a confidence value for each prediction, (ii) automatic implementation and orthology assignment of the genes in newly published genomes, and (iii) easy navigation between various displays so that not only the pairwise interactions, but also the network of interactions and the presence of potential (sub)modules in the network become visible. Previous genomic context databases such as Indigo (13), the first version of STRING (14), the Clusters of Orthologous Groups (COG) database (15), Predictome (16), and SNAPper (17) typically rely on only a single form of genomic context. Where they do include multiple forms (Predictome and COG), these are not integrated; nor do any of the databases indicate the reliability of the predictions. This indication of reliability is necessary: with the ever-increasing number of genomes, the number of predictions can become quite large and, depending on the parameters, may include many false positives. We took the opportunity of a complete redesign of STRING to introduce such a scoring scheme, derived by integrating all three types of genomic context. Additionally, STRING is now continuously updated and the predictions are fully precomputed. Particular emphasis has been placed on fast and easy navigation, coupled to integrated visual outputs (see Fig. 1 for an example output of STRING).
USAGE
Users enter the database via a protein of interest, for which functional associations are to be predicted. This protein can be identified by its accession number or identifier. Alternatively, the raw amino acid sequence of the protein can be supplied (in this case, checksum lookups and similarity searches are done to identify the corresponding entry in the database). The user is then presented with a summary of the predicted functional links for the protein, ranked by estimated confidence. Further pages are accessible which summarize and explain the evidence that leads to the predictions. Additionally, a fully interactive network display is available, allowing navigation through the combined functional associations. The network display also allows iteration: zooming out of a particular module and visualizing its connections to other modules. For independent computational analysis, the entire set of predictions contained in STRING is available as computer-readable flat-files through the website.
PREDICTION ALGORITHMS
The concepts behind the individual algorithms for the prediction of functional associations have all been published and validated previously; for STRING, only minor modifications were made. The requirements for the detection of gene fusions are stricter than those published previously (11,12): fused proteins are not recognized by homology, but rather by orthology of the fused parts to other, non-fused proteins (18,19).
For neighborhood evidence, a repeatedly occurring neighborhood is required, in species that are sufficiently remote to uncover functional constraints on gene order.
For the analysis of gene co-occurrence, STRING does not require perfect agreement between the occurrence of two genes, but uses a measure from information theory, mutual information (20,21), which quantifies the information gained, from the knowledge that one gene is present, about the presence of another gene in the same genome. The specific algorithm used here corrects for biases in the number of genomes sequenced for a particular branch of phylogeny, by collapsing into a single node those taxa in which the presence or absence of a specific gene pair is in agreement in all the species.
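As an illustration of the measure named above, the following minimal sketch computes the mutual information between two presence/absence profiles. It is not the STRING implementation: the function name and the 0/1 profile encoding are assumptions made for this example, and the correction that collapses phylogenetically redundant taxa into a single node is not included.

```python
import math
from collections import Counter


def mutual_information(profile_a, profile_b):
    """Mutual information (in bits) between two presence/absence profiles.

    Each profile is a sequence of 0/1 flags, one per genome, with the
    genomes listed in the same order in both profiles (an assumed encoding).
    """
    if len(profile_a) != len(profile_b) or not profile_a:
        raise ValueError("profiles must be non-empty and of equal length")
    n = len(profile_a)
    joint = Counter(zip(profile_a, profile_b))  # counts of (a, b) states
    count_a = Counter(profile_a)                # marginal counts for gene A
    count_b = Counter(profile_b)                # marginal counts for gene B
    mi = 0.0
    for (a, b), n_ab in joint.items():
        p_ab = n_ab / n
        # p_ab / (p_a * p_b), written with raw counts
        ratio = n_ab * n / (count_a[a] * count_b[b])
        mi += p_ab * math.log2(ratio)
    return mi


# Two genes that are always present or absent together share 1 bit here.
identical = mutual_information([1, 0, 1, 0], [1, 0, 1, 0])
# Two genes whose occurrence is unrelated share no information.
unrelated = mutual_information([1, 1, 0, 0], [1, 0, 1, 0])
```

Note that mutual information is symmetric and does not by itself distinguish co-occurrence from strict anti-occurrence; a scoring scheme built on it would need to handle that case separately.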
SCORING-FRAMEWORK AND BENCHMARKING
The three types of genomic association each contain quantitative information (e.g. the number of times two genes occur together in an operon). Additionally, there is a positive correlation between the genomic associations and the likelihood and strength of interactions (9,21); this allows the derivation of a scoring system.
We benchmarked the various genomic associations separately (Fig. 2), based on the co-occurrence of proteins on metabolic maps in the KEGG database (22); proteins that occur on the same metabolic KEGG map are presumed to be functionally interacting, those that occur on different maps are not. For both fusion and conserved gene order, we find that the simple counting of events is insufficient; it is outperformed by a score that includes normalization by the number of species covered by the genes involved (Fig. 3).
The comparison of the different types of genomic association to the same benchmark helps to establish which scores in each method are equivalent. For example, at a fusion frequency of 0.04, 50% of the predicted pairs are on the same KEGG map, while this is only reached at a conserved gene order frequency of 0.10 (Fig. 2). This equivalency can be formalized by finding a function that describes the relation between the score and the observed accuracy. The correlations of the genomic association counts with the fraction of proteins on the same KEGG map are sigmoidal, and we therefore fitted them to Hill equations (Fig. 2).
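The equivalency idea can be sketched with a generic three-parameter Hill curve. The parameter values below are not the fitted values behind Fig. 2; they are hypothetical numbers chosen only so that the two curves reproduce the example in the text (50% accuracy at a fusion frequency of 0.04 and at a gene order frequency of 0.10). The function and parameter names are likewise assumptions.

```python
def hill(x, top, k, n):
    """Hill equation: expected accuracy at raw score x.

    top: plateau accuracy; k: raw score at which half of top is reached;
    n: Hill coefficient controlling the steepness of the sigmoid.
    """
    if x <= 0:
        return 0.0
    return top * x**n / (k**n + x**n)


def inverse_hill(y, top, k, n):
    """Raw score at which the Hill curve reaches accuracy y (0 < y < top)."""
    if not 0 < y < top:
        raise ValueError("y must lie strictly between 0 and top")
    return k * (y / (top - y)) ** (1.0 / n)


# Hypothetical parameters, NOT fitted values from the paper.
FUSION = {"top": 1.0, "k": 0.04, "n": 2.0}
GENE_ORDER = {"top": 1.0, "k": 0.10, "n": 2.0}

# Accuracy expected for a fusion frequency of 0.04 under these parameters...
accuracy = hill(0.04, **FUSION)
# ...and the gene order frequency that would reach the same accuracy.
equivalent_gene_order = inverse_hill(accuracy, **GENE_ORDER)
```

Mapping each raw score through its own curve puts all three evidence types on a common accuracy scale, which is what allows them to be compared directly.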
The equivalency mapping makes it possible to combine the three Hill equations into a single score. We integrate the scores by multiplying the probabilities of associations not predicting a functional interaction. In this way, multiple scores can be combined to form a single score that expresses a higher confidence (Fig. 3). Combining the separate scores leads to a higher coverage at a given accuracy, specifically for the genes that score sub-optimally for all the individual genomic associations (Fig. 3). Remarkably, gene-order conservation remains clearly the most powerful method of the three (21).
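One natural reading of this integration step is S = 1 - (1 - S1)(1 - S2)(1 - S3), where each Si is the benchmarked confidence of one evidence type. The sketch below implements that reading; the function name is an assumption, and the formula implicitly treats the evidence types as independent.

```python
def combine_scores(scores):
    """Combine per-evidence confidence scores (each in [0, 1]) into one.

    The product of (1 - s) is the probability that none of the evidence
    types indicates a functional interaction; the combined score is its
    complement.
    """
    p_none = 1.0
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError("each score must be a probability in [0, 1]")
        p_none *= 1.0 - s
    return 1.0 - p_none


# Three moderate scores (hypothetical values) yield a higher combined score.
combined = combine_scores([0.6, 0.5, 0.2])
```

With this form the combined score is never lower than the best individual score, which matches the observation that integration helps most for gene pairs that score sub-optimally in every individual method.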
DATA SOURCES, ORTHOLOGY
For information on genomes, genes, and encoded proteins, STRING relies on the annotated proteomes maintained by SWISS-PROT (23). Assignment of functional equivalence of genes across these genomes is essential for the predictions, and this information is derived from the manually curated orthology database, COGs (15). For any genomes not yet present in the COG database, orthology assignments are made by an automatic method resembling the COG procedure. This results not only in the addition of new genes to COGs, which are presently based on 43 genomes, but also in the creation of a number of additional orthologous groups (NOGs, non-supervised orthologous groups); see http://www.bork.embl-heidelberg.de/STRING/ for details on the orthology assignment procedure. Essentially, assignments are based on triangles of reciprocal best matches between species in all-against-all Smith-Waterman searches, allowing for recent duplications within the genome, and including a clean-up step to join remaining genes by simple bidirectional hits.
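The core of the triangle criterion can be sketched as follows. This is only an illustration of the idea, not the procedure used for STRING: the data layout (a dictionary of best hits keyed by gene and target species), the function names, and the species and gene labels are all assumptions, and neither the handling of recent duplications nor the clean-up step is included.

```python
from itertools import combinations


def reciprocal_best_pairs(best_hit):
    """Return the set of gene pairs that are each other's best match.

    best_hit maps (gene, target_species) -> best-matching gene in that
    species. Genes are (species, name) tuples (an assumed representation).
    """
    pairs = set()
    for (gene, _species), hit in best_hit.items():
        # Reciprocal if the hit's best match in the query's species is the query.
        if best_hit.get((hit, gene[0])) == gene:
            pairs.add(frozenset((gene, hit)))
    return pairs


def rbh_triangles(best_hit):
    """Find gene triples from three species linked by reciprocal best hits."""
    pairs = reciprocal_best_pairs(best_hit)
    genes = sorted({g for pair in pairs for g in pair})
    triangles = []
    # Brute-force enumeration; adequate for a sketch, not for whole genomes.
    for a, b, c in combinations(genes, 3):
        three_species = len({a[0], b[0], c[0]}) == 3
        if three_species and all(
            frozenset(p) in pairs for p in ((a, b), (a, c), (b, c))
        ):
            triangles.append((a, b, c))
    return triangles


# Hypothetical best-hit table for one gene family in three species.
g1, g2, g3 = ("sp1", "geneA"), ("sp2", "geneA"), ("sp3", "geneA")
example_hits = {
    (g1, "sp2"): g2, (g2, "sp1"): g1,
    (g1, "sp3"): g3, (g3, "sp1"): g1,
    (g2, "sp3"): g3, (g3, "sp2"): g2,
}
example_triangles = rbh_triangles(example_hits)
```

Requiring a closed triangle rather than a single bidirectional hit makes the seed of an orthologous group more robust against spurious pairwise matches.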
STRING uses a relational database system (PostgreSQL, http://www.postgresql.org) to store primary data, such as genes and genomic locations. Periodically, complete all-against-all runs of the prediction algorithms are performed, and the resulting functional associations are stored in the database system as well. Precomputed results are stored at several levels of detail, allowing for very fast navigation through the predictions.
ACKNOWLEDGEMENTS
This work was supported in part by grants from the Netherlands Organization for Scientific Research (NWO), from the Deutsche Forschungsgemeinschaft, and from the Bundesministerium für Forschung und Bildung, Germany, through its contribution to the Helmholtz Network for Bioinformatics.
REFERENCES
1. Galperin,M.Y. and Koonin,E.V. (2000) Who's your neighbor? New computational approaches for functional genomics. Nat. Biotechnol., 18, 609–613.
2. Marcotte,E.M. (2000) Computational genetics: finding protein function by nonhomology methods. Curr. Opin. Struct. Biol., 10, 359–365.
3. Huynen,M., Snel,B., Lathe,W. and Bork,P. (2000) Exploitation of gene context. Curr. Opin. Struct. Biol., 10, 366–370.
4. Pellegrini,M., Marcotte,E.M., Thompson,M.J., Eisenberg,D. and Yeates,T.O. (1999) Assigning protein functions by comparative genome analysis: protein phylogenetic profiles. Proc. Natl Acad. Sci. USA, 96, 4285–4288.
5. Huynen,M.A. and Bork,P. (1998) Measuring genome evolution. Proc. Natl Acad. Sci. USA, 95, 5849–5856.
6. Ettema,T., van der Oost,J. and Huynen,M. (2001) Modularity in the gain and loss of genes: applications for function prediction. Trends Genet., 17, 485–487.
7. Aravind,L., Watanabe,H., Lipman,D.J. and Koonin,E.V. (2000) Lineage-specific loss and divergence of functionally linked genes in eukaryotes. Proc. Natl Acad. Sci. USA, 97, 11319–11324.
8. Overbeek,R., Fonstein,M., D'Souza,M., Pusch,G.D. and Maltsev,N. (1999) The use of gene clusters to infer functional coupling. Proc. Natl Acad. Sci. USA, 96, 2896–2901.
9. Dandekar,T., Snel,B., Huynen,M. and Bork,P. (1998) Conservation of gene order: a fingerprint of proteins that physically interact. Trends Biochem. Sci., 23, 324–328.
10. Lathe,W.C.,III, Snel,B. and Bork,P. (2000) Gene context conservation of a higher order than operons. Trends Biochem. Sci., 25, 474–479.
11. Enright,A.J., Iliopoulos,I., Kyrpides,N.C. and Ouzounis,C.A. (1999) Protein interaction maps for complete genomes based on gene fusion events. Nature, 402, 86–90.
12. Marcotte,E.M., Pellegrini,M., Ng,H.L., Rice,D.W., Yeates,T.O. and Eisenberg,D. (1999) Detecting protein function and protein-protein interactions from genome sequences. Science, 285, 751–753.
13. Nitschke,P., Guerdoux-Jamet,P., Chiapello,H., Faroux,G., Henaut,C., Henaut,A. and Danchin,A. (1998) Indigo: a World-Wide-Web review of genomes and gene functions. FEMS Microbiol. Rev., 22, 207–227.
14. Snel,B., Lehmann,G., Bork,P. and Huynen,M.A. (2000) STRING: a web-server to retrieve and display the repeatedly occurring neighbourhood of a gene. Nucleic Acids Res., 28, 3442–3444.
15. Tatusov,R.L., Natale,D.A., Garkavtsev,I.V., Tatusova,T.A., Shankavaram,U.T., Rao,B.S., Kiryutin,B., Galperin,M.Y., Fedorova,N.D. and Koonin,E.V. (2001) The COG database: new developments in phylogenetic classification of proteins from complete genomes. Nucleic Acids Res., 29, 22–28.
16. Mellor,J.C., Yanai,I., Clodfelter,K.H., Mintseris,J. and DeLisi,C. (2002) Predictome: a database of putative functional links between proteins. Nucleic Acids Res., 30, 306–309.
17. Kolesov,G., Mewes,H.W. and Frishman,D. (2002) SNAPper: gene order predicts gene function. Bioinformatics, 18, 1017–1019.
18. Yanai,I., Derti,A. and DeLisi,C. (2001) Genes linked by fusion events are generally of the same functional category: a systematic analysis of 30 microbial genomes. Proc. Natl Acad. Sci. USA, 98, 7940–7945.
19. Snel,B., Bork,P. and Huynen,M. (2000) Genome evolution. Gene fusion versus gene fission. Trends Genet., 16, 9–11.
20. Kullback,S. (1959) Information Theory and Statistics. John Wiley & Sons Inc., New York, NY, pp. 1–11.
21. Huynen,M., Snel,B., Lathe,W.,III and Bork,P. (2000) Predicting protein function by genomic context: quantitative evaluation and qualitative inferences. Genome Res., 10, 1204–1210.
22. Kanehisa,M. and Goto,S. (2000) KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res., 28, 27–30.
23. Bairoch,A. and Apweiler,R. (2000) The SWISS-PROT protein sequence database and its supplement TrEMBL in 2000. Nucleic Acids Res., 28, 45–48.
24. Torriani,A. (1990) From cell membrane to nucleotides: the phosphate regulon in Escherichia coli. Bioessays, 12, 371–376.