Nucleic Acids Research 2004 32(Web Server Issue):W73-W75; doi:10.1093/nar/gkh437
© 2004, the authors
Nucleic Acids Research, Vol. 32, Web Server issue © Oxford University Press 2004; all rights reserved
InterWeaver: interaction reports for discovering potential protein interaction partners with online evidence
Zhuo Zhang* and See-Kiong Ng
Knowledge Discovery Department, Institute for Infocomm Research, 21 Heng Mui Keng Terrace, Singapore 119613
* To whom correspondence should be addressed. Tel: +65 68748809; Fax: +65 67748056; Email: zzhang{at}i2r.a-star.edu.sg
Received February 6, 2004; Revised March 18, 2004; Accepted April 15, 2004
ABSTRACT
InterWeaver is a web server for discovering potential protein
interactions with online evidence automatically extracted from
protein interaction databases, literature abstracts, domain
fusion events and domain interactions. Given a new protein sequence,
the server identifies potential interaction partners using two
approaches. In the homology-based approach, the system performs
sequence homology searches to find similar proteins in other
species, and then searches the protein interaction databases
and the biomedical literature for interaction partners. In the
domain-based approach, the system detects the domains in the
input protein sequence and searches databases of domain fusion
events and putative domain interactions to suggest potential
interacting partners. The results are compiled into a personalized
and downloadable interaction report to aid biologists in their
discovery of protein interactions. InterWeaver is freely available
for academic users at
http://interweaver.i2r.a-star.edu.sg/.
INTRODUCTION
A rapidly increasing number of uncharacterized proteins are
being generated by large-scale proteomic studies. Understanding
the biological roles of these proteins requires knowledge of
their interactions with other proteins. Identification of protein
interactions is therefore the subject of many post-genome projects,
and much of the resulting interaction data is available online. Interactions
that were experimentally determined en masse using high-throughput
methods such as two-hybrid screening have been curated and deposited
in online interaction databases (1,2). A large number of interactions
reported in journals and conference papers can also be extracted
from online biomedical literature databases (3,4). At the same
time, computational methods have also been developed to predict
protein interactions. For example, computationally detected
domain fusion events (5) as well as computationally derived
domain–domain interactions (6) can be used to infer protein
interactions.
METHOD
Given a new uncharacterized protein sequence, a biologist can
mine the rich online resources of protein interactions to discover
its potential interaction partners. We have created InterWeaver,
a server providing interaction reports, to help biologists discover
potential protein interaction partners using online
evidence.
Figure 1 depicts the system framework of InterWeaver
for generating customized protein interaction reports.
The InterWeaver server currently employs two different approaches
to identify potential interaction partners:
- Homology-based approach. Proteins that are known to interact with the source protein's homologs in the selected species are mined from two kinds of data source: online protein interaction databases and the biomedical literature. InterWeaver first performs sequence homology searches using BLAST (7) to find proteins similar to the source protein in the other species. InterWeaver then searches the online protein interaction databases DIP (2) and BIND (1), as well as the Protein Data Bank (PDB) (8), a database containing data on protein complexes, for experimentally derived protein interactions and complexes to suggest potential protein interaction partners for the source protein. The system also scans the abstracts in the PubMed database for interactions reported in the biomedical literature using text-mining techniques (4).
- Domain-based approach. Here, proteins with domains that putatively interact with a domain in the source protein are listed as potential interaction partners. InterWeaver uses computationally derived domain fusion events (5) as well as domain–domain interactions (6) for inference. The detection of domains in the source protein is done using RPS-BLAST (9).
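The two approaches can be illustrated with a minimal sketch. It is not the InterWeaver implementation: the protein and domain identifiers, the toy data, the function names and the default E-value cutoff of 1e-5 are all assumptions made for illustration. The homology searches, database lookups and domain detection are replaced by small in-memory tables.

```python
from collections import defaultdict

# All identifiers and data below are hypothetical placeholders,
# not InterWeaver data or output.

# BLAST-style hits for the source protein: (homolog, species, E-value).
HOMOLOG_HITS = [
    ("YEAST_A", "yeast", 1e-40),
    ("FLY_A", "fly", 1e-12),
    ("WORM_A", "worm", 0.5),  # weak hit, removed by the cutoff
]

# Interaction partners recorded for each homolog in an interaction database.
KNOWN_INTERACTIONS = {
    "YEAST_A": ["YEAST_B", "YEAST_C"],
    "FLY_A": ["FLY_D"],
    "WORM_A": ["WORM_E"],
}

# Domains detected in the source protein, putative interacting domain
# pairs, and the domain content of candidate partner proteins.
SOURCE_DOMAINS = ["DOM1"]
DOMAIN_PAIRS = {("DOM1", "DOM2"), ("DOM3", "DOM1")}
PROTEIN_DOMAINS = {"P1": ["DOM2"], "P2": ["DOM4"], "P3": ["DOM3", "DOM5"]}


def homology_based_partners(hits, interactions, max_e_value=1e-5):
    """Map each candidate partner to the homologs that support it."""
    partners = defaultdict(list)
    for homolog, _species, e_value in hits:
        if e_value > max_e_value:  # assumed cutoff; user-selectable in practice
            continue
        for partner in interactions.get(homolog, []):
            partners[partner].append(homolog)
    return dict(partners)


def domain_based_partners(source_domains, domain_pairs, protein_domains):
    """Map each candidate partner to the (source, partner) domain pairs found."""
    partners = {}
    for protein, domains in protein_domains.items():
        matched = [
            (s, d)
            for s in source_domains
            for d in domains
            # Domain pairs are treated as unordered.
            if (s, d) in domain_pairs or (d, s) in domain_pairs
        ]
        if matched:
            partners[protein] = matched
    return partners
```

In this sketch each suggested partner carries the homolog or domain pair that supports it, so the evidence can be shown alongside the prediction.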
To help biologists in their research, the online evidence for the various potential protein interaction partners is compiled together with cross-reference links to the original databases.
USAGE
InterWeaver provides both online query and offline (batch) query
facilities. Offline queries yield interaction reports for novel
proteins. Users submit their protein sequences in FASTA format.
They may personalize their reports by specifying the species
and E-values for BLAST, and by selecting the types of online
evidence for inclusion in their reports, namely, protein interaction
databases, biomedical literature, domain fusion events and/or
domain interactions. When results are available, users receive
an email with a password and a link to the compiled InterWeaver
report. Users can then browse their reports on the InterWeaver
site (each report will be kept on the site for two weeks) or
download their InterWeaver reports in zipped folders for offline
browsing. Users can also perform online queries for prompt results.
Figure 2 shows the result pages after searching by interaction
databases (Figure 2A) and searching by domain fusion events
(Figure 2B), respectively.
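The inputs described above (FASTA sequences, species, a BLAST E-value and a choice of evidence types) can be sketched as a small request object. This is an illustrative assumption only: the names `parse_fasta`, `ReportRequest` and `EVIDENCE_TYPES`, the default cutoff and the validation rules are hypothetical and do not describe the server's actual submission interface.

```python
from dataclasses import dataclass, field

# The four evidence types named in the text; the string labels are hypothetical.
EVIDENCE_TYPES = {
    "interaction_databases",
    "literature",
    "domain_fusion",
    "domain_interactions",
}


def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:].strip(), []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


@dataclass
class ReportRequest:
    """Hypothetical container for one report request."""
    fasta: str
    species: list
    e_value: float = 1e-5  # assumed default, not taken from the article
    evidence: set = field(default_factory=lambda: set(EVIDENCE_TYPES))

    def validate(self):
        if not parse_fasta(self.fasta):
            raise ValueError("no FASTA records found")
        if self.e_value <= 0:
            raise ValueError("E-value cutoff must be positive")
        unknown = self.evidence - EVIDENCE_TYPES
        if unknown:
            raise ValueError(f"unknown evidence types: {sorted(unknown)}")
        return True
```

A request such as `ReportRequest(fasta=">q1\nMKT", species=["yeast"], evidence={"literature"})` would pass `validate()`, whereas an unrecognized evidence label would raise a `ValueError`.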


Figure 2. Result pages of the InterWeaver server. (A) Searching for interaction partners by homolog, showing evidence from the PP interaction database. (B) Searching for interaction partners by domain fusion events.
DISCUSSION
Online databases such as Predictome (10) and STRING (11) contain
putative protein–protein interactions pre-computed using
various computational methods. Our InterWeaver system is designed
as a web server to predict potential protein interaction partners
for novel sequences using both homology- and domain-based predictive
approaches. The server uses a variety of online resources ranging
from experimentally derived protein interaction databases to
computationally derived domain interaction databases, and mines
data sources from structured databases to unstructured text
databases. In fact, InterWeaver is designed to be easily extensible
to include other online protein interaction resources and different
computational approaches. To help biologists manage the wealth
of information at their own pace, InterWeaver generates comprehensive
downloadable web reports for offline analysis.
The variety of evidence compiled about the potential interaction partners can help biologists validate and annotate the experimental results for their proteins. However, as with other predictive tools, it is important to bear in mind that the potential interaction partners may be predicted by the system based on assumptions that are yet to be conclusively validated. For example, the accuracy of homology-based inference of protein interactions has not yet been proved with conclusive evidence based on significant datasets. Nevertheless, with suitable prudence and by combining evidence from different approaches and data sources, InterWeaver can serve as a useful hypothesis engine for dissecting the vast interactomes.
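One simple way a reader might combine evidence from different approaches is to count how many independent sources support each candidate partner. The sketch below is an assumption made for illustration, not a scoring scheme described for InterWeaver; the function name, source labels and candidate names are hypothetical.

```python
def rank_by_evidence(evidence_by_source):
    """Rank candidates by the number of distinct sources supporting them.

    evidence_by_source maps a source label to the candidates it suggests.
    Returns (candidate, sorted_sources) tuples, best supported first;
    ties are broken alphabetically by candidate name.
    """
    support = {}
    for source, candidates in evidence_by_source.items():
        for candidate in set(candidates):
            support.setdefault(candidate, set()).add(source)
    ranked = [(candidate, sorted(sources)) for candidate, sources in support.items()]
    return sorted(ranked, key=lambda item: (-len(item[1]), item[0]))


# Hypothetical candidates suggested by three evidence types.
example = {
    "interaction_databases": ["B", "C"],
    "literature": ["B"],
    "domain_interactions": ["B", "D"],
}
ranking = rank_by_evidence(example)
```

A count of this kind only orders hypotheses for follow-up; it does not measure how reliable any individual source is.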
Notes
The online version of this article has been published under
an open access model. Users are entitled to use, reproduce,
disseminate, or display the open access version of this article
provided that: the original authorship is properly and fully
attributed; the Journal and Oxford University Press are attributed
as the original place of publication with the correct citation
details given; if an article is subsequently reproduced or disseminated
not in its entirety but only in part or as a derivative work
this must be clearly indicated.
REFERENCES
1. Bader,G.D., Betel,D. and Hogue,C.W. (2003) BIND: the Biomolecular Interaction Network Database. Nucleic Acids Res., 31, 248–250.
2. Xenarios,I., Salwinski,L., Duan,X.J., Higney,P., Kim,S.M. and Eisenberg,D. (2002) DIP, the Database of Interacting Proteins: a research tool for studying cellular networks of protein interactions. Nucleic Acids Res., 30, 303–305.
3. Mack,R. and Hehenberger,M. (2002) Text-based knowledge discovery: search and mining of life-sciences documents. Drug Discov. Today, 7, S89–S98.
4. Ng,S.K. and Wong,M. (1999) Toward routine automatic pathway discovery from on-line scientific text abstracts. Genome Inform. Ser. Workshop Genome Inform., 10, 104–112.
5. Marcotte,E.M., Pellegrini,M., Ng,H.L., Rice,D.W., Yeates,T.O. and Eisenberg,D. (1999) Detecting protein function and protein–protein interactions from genome sequences. Science, 285, 751–753.
6. Ng,S.K., Zhang,Z., Tan,S.H. and Lin,K. (2003) InterDom: a database of putative interacting protein domains for validating predicted protein interactions and complexes. Nucleic Acids Res., 31, 251–254.
7. Altschul,S.F., Gish,W., Miller,W., Myers,E.W. and Lipman,D.J. (1990) Basic local alignment search tool. J. Mol. Biol., 215, 403–410.
8. Berman,H.M., Westbrook,J., Feng,Z., Gilliland,G., Bhat,T.N., Weissig,H., Shindyalov,I.N. and Bourne,P.E. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235–242.
9. Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.
10. Mellor,J.C., Yanai,I., Clodfelter,K.H., Mintseris,J. and DeLisi,C. (2002) Predictome: a database of putative functional links between proteins. Nucleic Acids Res., 30, 306–309.
11. von Mering,C., Huynen,M., Jaeggi,D., Schmidt,S., Bork,P. and Snel,B. (2003) STRING: a database of predicted functional associations between proteins. Nucleic Acids Res., 31, 258–261.
