Nucleic Acids Research Advance Access originally published online on May 16, 2008
Nucleic Acids Research 2008 36(Web Server issue):W252-W254; doi:10.1093/nar/gkn270
Nucleic Acids Research, 2008, Vol. 36, No. suppl_2 W252-W254
© 2008 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
soaPDB: a web application for searching the Protein Data Bank, organizing results, and receiving automatic email alerts
Department of Drug Design, Schering-Plough Research Institute, 2015 Galloping Hill Road, Kenilworth, NJ 07033, USA
*To whom correspondence should be addressed. Tel: +1 908 740 3515; Fax: +1 908 740 7664; Email: charles.lesburg{at}spcorp.com
Received January 2, 2008. Revised March 26, 2008. Accepted April 21, 2008.
| ABSTRACT |
|---|
soaPDB is a web application that allows generation and organization of saved PDB searches, and offers automatic email alerts. This tool is used from a web interface to store PDB searches and results in a backend relational database. Written using the Ruby on Rails open-source web framework, soaPDB is easy to deploy, maintain and customize. soaPDB is freely available upon request for local installation and is also available at http://soapdb.dyndns.org:3000.
| INTRODUCTION |
|---|
The goal of this effort is to provide a tool that facilitates the creation, organization and reproducibility of RCSB Protein Data Bank (1–3) searches. The PDB is typically accessed via its website (http://www.pdb.org), which offers a rich suite of searching and reporting tools for the vast database of macromolecular structures, including a flexible and powerful XML-based query syntax. Currently, advanced search queries are saved only for the duration of the browser session, often no longer than a few hours. To automate routine and reproducible PDB searches, whether simple or complex compound queries, we have created a web-based tool that accesses the PDB's Web Services API (http://www.pdb.org/robohelp/#webservices/summary.htm). Compared with existing web services, which provide email alerts based on sequence comparisons or other keywords (4,5), the tool presented herein allows construction of arbitrarily detailed PDB queries. It can also be run as a private soaPDB server within one's own institution, enabling collaboration among local users without public exposure.
| DESCRIPTION AND IMPLEMENTATION |
|---|
We present soaPDB, a cross-platform web-based application that allows authenticated users to save their PDB searches and organize their results in a relational database. A public server (http://soapdb.dyndns.org:3000) is available for testing, and users are encouraged to install soaPDB locally or departmentally to take advantage of local collaboration. Email alerts are sent automatically when a user's saved search yields new results. Interaction with the PDB is performed using the Simple Object Access Protocol (SOAP, http://www.w3.org/TR/soap). Since the interface is a web browser, a number of which were tested, no installation is required for the end user, thus minimizing setup and maintenance costs. The server side of this application was written using the Ruby on Rails (6) open-source web abstraction framework. This high-level language extension allows rapid and efficient web application prototyping and development, with built-in support for many computer operating systems and relational database types. The soaPDB application server was developed using Microsoft Windows XP and Linux. MySQL was the database server tested; Rails supports many others. The Rails architecture assures portability to the user's preferred web server, database server and operating system, including various Linux distributions, Mac OS X and Windows XP.
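To illustrate what a SOAP interaction involves, the following is a minimal sketch, in Python rather than the Ruby used by soaPDB, of wrapping one remote call in a SOAP 1.1 envelope. The text does not list the PDB operations that soaPDB invokes, so the operation name `exampleSearchOperation`, the parameter name `queryXml` and the placeholder query are assumptions made purely for illustration.

```python
import xml.etree.ElementTree as ET
from typing import Mapping

# Standard SOAP 1.1 envelope namespace.
SOAP_ENV_NS = "http://schemas.xmlsoap.org/soap/envelope/"


def build_soap_envelope(operation: str, params: Mapping[str, str]) -> bytes:
    """Wrap one remote call in a SOAP 1.1 envelope and return UTF-8 XML."""
    ET.register_namespace("soapenv", SOAP_ENV_NS)
    envelope = ET.Element(f"{{{SOAP_ENV_NS}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV_NS}}}Body")
    call = ET.SubElement(body, operation)
    for name, value in params.items():
        # ElementTree escapes "<" and ">" in text, so an XML query can be
        # passed as an ordinary string argument.
        ET.SubElement(call, name).text = value
    return ET.tostring(envelope, encoding="utf-8", xml_declaration=True)


# Hypothetical operation and parameter names; not taken from the PDB API.
envelope = build_soap_envelope(
    "exampleSearchOperation",
    {"queryXml": "<placeholderQuery>kinase</placeholderQuery>"},
)
```

The resulting bytes would be sent to the service endpoint in an HTTP POST; the response is another envelope whose body carries the result, which for a search would be the list of matching PDB ID codes.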
We created this tool based on the following principles:
- Persistence, reproducibility and automation. By adding the ability to save and organize preformed searches, we aim to enhance the rich functionality available at the PDB, not to supplant it. Therefore, all result reporting is expected to be performed using the tools available from the PDB. A supporting database facilitates search persistence and reproducibility. Notification of new search results is performed using a query runner, executed automatically on a regular basis.
- Simplicity. A simple interface to one's saved searches, with an emphasis on the search results themselves, keeps this tool focused. There are very few necessary configuration options. For instance, email updates occur by default every Wednesday, the day after the weekly PDB update. Because it relies on XML-based queries of any complexity, generated using the advanced search tools available via the PDB website, this simple tool should appeal to the novice and expert alike. While the soaPDB saved searches may be as complex as desired, the default options are tailored to the needs of a typical user. We envision that such a user of this web application is a member of the macromolecular structural biology, structural chemistry or structure-based drug design communities. Moreover, it is anticipated that the server would be installed locally or departmentally in order to share saved searches among coworkers via the cloning and editing functionality of soaPDB. A list of frequently asked questions and an extensive tutorial with worked examples are provided.
- Portability and adaptability. The Rails web application framework was chosen for its cross-platform support. Moreover, since the code is not compiled, it is readily customizable to fit any circumstance. For instance, user authentication may be performed using an internal user database or via secure access to a preferred LDAP server.
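The weekly notification rule described under the principles above can be sketched as follows. This is an illustrative Python sketch, not the soaPDB query runner itself; the function names are hypothetical, and the rule that no email is sent when a run finds nothing new is inferred from the statement that alerts are sent when a saved search yields new results.

```python
from datetime import date, timedelta

WEDNESDAY = 2  # date.weekday() counts from Monday = 0


def is_alert_day(today: date, alert_weekday: int = WEDNESDAY) -> bool:
    """True when `today` falls on the configured alert weekday."""
    return today.weekday() == alert_weekday


def next_alert_day(today: date, alert_weekday: int = WEDNESDAY) -> date:
    """First date on or after `today` that falls on the alert weekday."""
    days_ahead = (alert_weekday - today.weekday()) % 7
    return today + timedelta(days=days_ahead)


def should_send_alert(today: date, new_hit_count: int) -> bool:
    """Send mail only on the alert day, and only if the run found new hits."""
    return is_alert_day(today) and new_hit_count > 0
```

For example, `next_alert_day(date(2008, 5, 16))` starts from a Friday and returns `date(2008, 5, 21)`, the following Wednesday. In a deployment the runner would typically be triggered by an operating-system scheduler rather than by a long-running process.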
| MODUS OPERANDI |
|---|
The soaPDB usage workflow typically requires the following steps, shown schematically in Figure 1.
Login
Upon authentication, the user is presented with a main page where his/her own and others' saved searches are listed along with the number of search results (Figure 1b). The user may run, edit or delete current queries, examine others' queries or generate a new query, either by cloning an existing query or by creating a new one.
New query generation
To create a new query (Figure 1a), the user may use one of three choices:
- The Sequence tab, where an amino acid sequence is entered along with an E-value cutoff and, optionally, a requirement for the presence or absence of a ligand. The search is then performed using BLAST (7), as implemented by the PDB.
- The Keyword tab, where any text can be entered, optionally in combination with a requirement for the presence or absence of a ligand.
- The Advanced tab, which accepts any XML-based query from the PDB. There is a direct link to the PDB Advanced Search web page to ease XML creation.
Alternatively, queries may be cloned from the list of saved searches, including those from other soaPDB users. Saved searches may be edited at any time to adjust the name, the email notification settings or the XML query itself.
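The cloning and editing behavior can be pictured with a small data model. The sketch below is written in Python for illustration only; the `SavedSearch` fields and the `clone_search` and `edit_search` helpers are assumptions based on the editable attributes named above (name, email notification settings, XML query) and are not the actual soaPDB schema.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SavedSearch:
    """Hypothetical record for one saved search."""
    name: str
    owner: str
    query_xml: str
    email_alerts: bool = True


def clone_search(original: SavedSearch, new_owner: str) -> SavedSearch:
    """Copy a search for another user; the original record is unchanged."""
    return replace(original, owner=new_owner, name=f"Copy of {original.name}")


def edit_search(search: SavedSearch, **changes) -> SavedSearch:
    """Return an edited copy, e.g. with a new name or alert setting."""
    return replace(search, **changes)


# The XML string is a placeholder, not a valid PDB query.
shared = SavedSearch("Kinase structures", "alice", "<placeholderQuery/>")
mine = edit_search(clone_search(shared, "bob"), email_alerts=False)
```

Treating records as immutable keeps a colleague's original search intact, so a clone can be modified freely without affecting the search it was copied from.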
Query execution and results
Upon running a query, the number of new hits is presented to the user along with the total number of hits (Figure 1b). Navigating to the results page shows a list of PDB ID codes and their release dates (Figure 1c). Results may be explored either individually or collectively by redirecting one or more resulting entries to the PDB. At that point, all of the tools at the PDB are available for use. It is noteworthy that all searching is performed by the PDB via a SOAP request, and results are stored locally by the soaPDB server.
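One way to derive the new and total hit counts from locally stored results is sketched below, using Python's built-in `sqlite3` module in place of the MySQL database used by soaPDB. The table layout, the function names and the PDB ID codes are placeholders, and the sketch assumes that the total is the number of IDs stored for a search; the actual soaPDB schema is not described in the text.

```python
import sqlite3
from typing import Iterable, List, Tuple


def open_store() -> sqlite3.Connection:
    """Create an in-memory database with one row per (search, PDB ID) pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE results ("
        " search_id INTEGER NOT NULL,"
        " pdb_id TEXT NOT NULL,"
        " PRIMARY KEY (search_id, pdb_id))"
    )
    return conn


def record_run(
    conn: sqlite3.Connection, search_id: int, hit_ids: Iterable[str]
) -> Tuple[List[str], int]:
    """Store the hits of one run; return (new IDs, total IDs stored)."""
    rows = conn.execute(
        "SELECT pdb_id FROM results WHERE search_id = ?", (search_id,)
    )
    known = {row[0] for row in rows}
    # IDs not seen in any earlier run of this search count as new hits.
    new_ids = sorted(set(hit_ids) - known)
    conn.executemany(
        "INSERT INTO results (search_id, pdb_id) VALUES (?, ?)",
        [(search_id, pdb_id) for pdb_id in new_ids],
    )
    conn.commit()
    return new_ids, len(known) + len(new_ids)
```

Running the same search a second time with one additional ID returns only that ID as new, which is the information an email alert would need to report.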
| CONCLUSIONS |
|---|
soaPDB is a tool that enhances the search capability of the Protein Data Bank by storing searches and allowing for their routine and reproducible execution. As a web application, it is straightforward to use and platform-independent. Since this application was created using the Ruby on Rails web abstraction framework and supports a variety of database servers, the server may be easily deployed and readily customized to fit many situations. The soaPDB code is freely available upon request from the authors.
| ACKNOWLEDGEMENTS |
|---|
Funding to pay the Open Access publication charges for this article was provided by SPRI.
Conflict of interest statement. None declared.
| Footnotes |
|---|
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.
| REFERENCES |
|---|
- Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The Protein Data Bank. Nucleic Acids Res. (2000) 28:235–242.
- Berman HM, Henrick K, Nakamura H. Announcing the worldwide Protein Data Bank. Nat. Struct. Biol. (2003) 10:980.
- Deshpande N, Addess KJ, Bluhm WF, Merino-Ott JC, Townsend-Merino W, Zhang Q, Knezevich C, Xie L, Chen L, Feng Z, et al. The RCSB Protein Data Bank: a redesigned query system and relational database based on the mmCIF schema. Nucleic Acids Res. (2005) 33:D233–D237.
- Prilusky J. SeqAlert web site. (2003) http://bip.weizmann.ac.il/salertb/main (5 May 2008, date last accessed).
- National Center for Biotechnology Information web site. http://www.ncbi.nlm.nih.gov (5 May 2008, date last accessed).
- Thomas D, Hansson DH. Agile Web Development with Rails (2006) Lewisville, TX: Pragmatic Bookshelf.
- Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J. Mol. Biol. (1990) 215:403–410.
