

Nucleic Acids Research Advance Access originally published online on May 8, 2007
Nucleic Acids Research 2007 35(Web Server issue):W522-W525; doi:10.1093/nar/gkm276

Nucleic Acids Research, 2007, Vol. 35, No. suppl_2 W522-W525
© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.



PDB2PQR: expanding and upgrading automated preparation of biomolecular structures for molecular simulations

Todd J. Dolinsky¹, Paul Czodrowski², Hui Li³, Jens E. Nielsen⁴, Jan H. Jensen⁵, Gerhard Klebe² and Nathan A. Baker¹,*

¹Department of Biochemistry and Molecular Biophysics, Center for Computational Biology, Washington University in St. Louis, 700 S. Euclid Ave., Campus Box 8036, St. Louis, MO 63110, USA
²Department of Pharmaceutical Chemistry, Philipps-University Marburg, Marburg, Germany
³Department of Chemistry, University of Nebraska-Lincoln, Lincoln, NE, USA
⁴School of Biomolecular and Biomedical Science, UCD Conway Institute, University College Dublin, Dublin, Ireland
⁵Department of Chemistry, University of Copenhagen, Copenhagen, Denmark

*To whom correspondence should be addressed. Tel: +1-314-362-2040; Fax: +1-314-362-0234; Email: baker@ccb.wustl.edu

Received January 31, 2007. Revised April 7, 2007. Accepted April 11, 2007.


    ABSTRACT
 
Real-world observable physical and chemical characteristics are increasingly being calculated from the 3D structures of biomolecules. Methods for calculating pKa values, ligand binding constants and changes in protein stability are readily available, but the limiting step in computational biology is often the conversion of PDB structures into formats ready for use with biomolecular simulation software. The continued sophistication and integration of biomolecular simulation methods for systems- and genome-wide studies require a fast, robust, physically realistic and standardized protocol for preparing macromolecular structures for biophysical algorithms. As described previously, the PDB2PQR web server addresses this need for electrostatic field calculations (Dolinsky et al., Nucleic Acids Research, 32, W665–W667, 2004). Here we report a significantly expanded PDB2PQR that includes the following features: robust standalone command-line support, improved pKa estimation via the PROPKA framework, ligand parameterization via the PEOE_PB charge methodology, an expanded set of force fields with easily incorporated user-defined parameters via XML input files, and improved atom addition and optimization code. These features are available through a new web interface (http://pdb2pqr.sourceforge.net/), which offers users a wide range of options for PDB file conversion, modification and parameterization.


    INTRODUCTION
 
Due to the importance of electrostatic interactions in biomolecular systems, a variety of computational methods have been developed for evaluating electrostatic forces and energies [see (1–6) and references therein]. Typical computational electrostatics methods for biomolecular systems can be loosely grouped into two categories: ‘explicit solvent’ methods, which treat solvent molecules in full molecular detail, and ‘implicit solvent’ methods, which include solvent–solute interactions in an averaged or continuum fashion. Implicit solvent methods are, by definition, limited in detail and therefore lack the atomic-scale accuracy of their explicit solvent counterparts. However, implicit solvent methods have gained increasing popularity, in part because they eliminate the extensive sampling of solvent configurations required with explicit models (1,3–7).

The basic ingredients of an implicit solvent electrostatics calculation are environmental parameters such as temperature, solvent dielectric and ionic strength; biomolecular atomic coordinates; and parameters for atomic charges and radii. While the environmental parameters are relatively straightforward to specify, the remaining two ingredients are often difficult to supply. In particular, most biomolecular structures in the Protein Data Bank (PDB) (8) do not contain hydrogen atoms, and many are also missing a fraction of the heavy atom coordinates. The addition of hydrogens and the reconstruction of these missing coordinates are not trivial; electrostatic properties obtained from the ‘repaired’ structures can be very sensitive to the manner in which missing atoms are added and protonation states are assigned (9,10). Furthermore, inconsistent atomic nomenclature and other force field idiosyncrasies often make the assignment of atomic charges and radii a cumbersome task. An additional obstacle to the use of PDB structures in electrostatics calculations and other biomolecular computational tasks is the accurate assignment of parameters to ‘non-standard’ residues and ligands.

Previously (9), we introduced the freely available PDB2PQR service (http://pdb2pqr.sf.net/), which was designed to facilitate the setup and execution of continuum electrostatics calculations from PDB data, particularly by non-experts. The original PDB2PQR server automated many of the common tasks of preparing structures for continuum electrostatics calculations: adding a limited number of missing heavy atoms to biomolecular structures; estimating titration states and protonating biomolecules in a manner consistent with favorable hydrogen bonding; assigning charge and radius parameters from a variety of force fields; and generating ‘PQR’ output, a PDB-like format in which the occupancy and temperature factor columns are replaced with charge ‘Q’ and radius ‘R’, respectively. This output is compatible with several popular computational biology packages for electrostatics [APBS (10) and MEAD (11)], docking [AutoDock (12)], simulation [AMBER (13)] and visualization [VMD (14), PyMOL (15) and PMV (16)]. Since the server's inception, we have continued to expand its capabilities to address the challenges associated with ligand parameterization in PDB files and to include several new features.
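As a rough illustration of the PQR format described above, the following Python sketch reads one atom record in which the last two columns hold charge and radius. It assumes a whitespace-delimited record without a chain identifier; the `PQRAtom` type, the `parse_pqr_line` helper and the example record are hypothetical and are not taken from the PDB2PQR source code.

```python
from typing import NamedTuple


class PQRAtom(NamedTuple):
    """One atom record from a PQR file (hypothetical container type)."""
    serial: int
    name: str
    res_name: str
    res_seq: int
    x: float
    y: float
    z: float
    charge: float  # the 'Q' column, replacing PDB occupancy
    radius: float  # the 'R' column, replacing the PDB temperature factor


def parse_pqr_line(line: str) -> PQRAtom:
    """Parse a whitespace-delimited ATOM/HETATM record with no chain ID."""
    fields = line.split()
    if not fields or fields[0] not in ("ATOM", "HETATM"):
        raise ValueError("not an atom record")
    if len(fields) != 10:
        raise ValueError("expected 10 whitespace-delimited fields")
    _, serial, name, res_name, res_seq, x, y, z, charge, radius = fields
    return PQRAtom(int(serial), name, res_name, int(res_seq),
                   float(x), float(y), float(z),
                   float(charge), float(radius))


# Illustrative record; the coordinates and parameters are made up for the example.
example = "ATOM      1  N   ALA     1      46.457  12.189  21.556  0.1414 1.8240"
atom = parse_pqr_line(example)
print(atom.name, atom.charge, atom.radius)
```

Records that include a chain identifier or that use fixed-width columns without separating spaces would need a different parser.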


    METHODS
 
The PDB2PQR web service is driven by a modular, Python-based collection of routines, which provides considerable flexibility to the software and permits non-interactive, high-throughput usage. The service is available via a number of web mirrors listed at http://pdb2pqr.sf.net/. The source code is also available for download from the same site, and due to the portability of Python, PDB2PQR can be executed on a wide range of platforms.

Figure 1 outlines the typical workflow of a PDB2PQR job and summarizes the features described in more detail below. The procedures for reconstruction of missing atoms, hydrogen optimization and APBS input generation were described previously (9), and the underlying approach is essentially unchanged in the current version of the software. Their implementation, however, has been greatly improved since the initial development through a number of bug fixes and code optimizations, robust support for separate biomolecular chains, and improved chain termini optimization. The following sections describe modified and new elements of the PDB2PQR pipeline.


Figure 1. Flowchart demonstrating the sequence of operations performed by the pipeline. The process begins with an input PDB file and ends with a parameterized PQR file and, optionally, an APBS input file.

 
Titration state assignment by PROPKA
Protonation states for titratable protein groups are assigned by PROPKA 1.0 (http://propka.ki.ku.dk) (17). PROPKA utilizes a very fast empirical method to predict pKa values and is successful at predicting unusual pKa values. A recent comparative study of several protein pKa prediction methods showed that PROPKA was the most accurate method overall (18). PROPKA uses a heuristic method to compute the pKa perturbations due to desolvation, hydrogen bonding and charge–charge interactions. The current version of PROPKA does not include contributions to the pKa values from nucleic acids or from heteroatoms such as bound ions or ligands. Note that, during the course of titration state assignment, PROPKA generates statistics on residue hydrogen bonding, location, solvent accessibility and Coulombic interactions. This information is available to users as a downloadable text file provided at the end of the PDB2PQR/PROPKA calculation.
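The step from a predicted pKa to a protonation state can be sketched as follows. This is not the PROPKA algorithm itself; it only illustrates how a pKa value, once predicted, might be compared against a user-chosen pH, assuming the simple rule that a group is treated as protonated when the pH is below its pKa. The function names and the pKa values are hypothetical.

```python
from typing import Dict


def protonated_fraction(pka: float, ph: float) -> float:
    """Henderson-Hasselbalch fraction of a group that is protonated at this pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


def assign_states(pkas: Dict[str, float], ph: float) -> Dict[str, str]:
    """Assign a single discrete state per group: protonated when pH < pKa."""
    return {
        group: ("protonated" if ph < pka else "deprotonated")
        for group, pka in pkas.items()
    }


# Hypothetical predicted pKa values, for illustration only.
predicted = {"ASP 25": 3.9, "HIS 57": 6.8, "LYS 10": 10.5}
states = assign_states(predicted, ph=7.0)
for group, state in states.items():
    print(group, state, round(protonated_fraction(predicted[group], 7.0), 3))
```

Groups whose pKa lies close to the chosen pH, such as the histidine in this example, are partially populated in both states, so a single discrete assignment is an approximation.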

Standard residue parameter assignment
PDB2PQR currently allows users to assign protein and (where available) nucleic acid parameters based on explicit solvent AMBER99 (19) and CHARMM27 (20) force fields, the PARSE continuum electrostatics force field (21), a Poisson–Boltzmann-optimized force field by Tan et al. (22), or user-defined force fields. User-defined parameters can be uploaded to the PDB2PQR server in a simple flat-file format described in the PDB2PQR user guide. PDB2PQR output can also be customized to include a variety of atom naming schemes, including AMBER99 (19), CHARMM22 (20), PARSE (21) and an internal naming scheme based on the IUPAC naming recommendations (23). This flexibility in nomenclature was included to facilitate import of PDB2PQR output into other modeling packages. Finally, the web server provides a ‘map’, which is output at the end of every PDB2PQR calculation and presents a table of each atom's name/number, residue name, chain name, AMBER atom type and CHARMM atom type to aid in the interpretation of parameter assignment and the development of user-defined charges and radii.
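Conceptually, parameter assignment amounts to looking up a charge and radius for each (residue name, atom name) pair in the chosen force field. The sketch below illustrates that idea with a small in-memory table; the table layout, the `assign_parameters` helper and the numerical values are illustrative assumptions and do not reproduce the PDB2PQR flat-file format or any distributed parameter set.

```python
from typing import Dict, List, Tuple

# (residue name, atom name) -> (charge in e, radius in Angstrom)
ParamTable = Dict[Tuple[str, str], Tuple[float, float]]

# Illustrative values only; a real force field file would supply these.
EXAMPLE_FF: ParamTable = {
    ("GLY", "N"): (-0.4157, 1.8240),
    ("GLY", "CA"): (-0.0252, 1.9080),
    ("GLY", "C"): (0.5973, 1.9080),
    ("GLY", "O"): (-0.5679, 1.6612),
}


def assign_parameters(
    atoms: List[Tuple[str, str]], table: ParamTable
) -> Tuple[List[Tuple[str, str, float, float]], List[Tuple[str, str]]]:
    """Return (assigned atoms with charge and radius, atoms missing from the table)."""
    assigned = []
    missing = []
    for res_name, atom_name in atoms:
        params = table.get((res_name, atom_name))
        if params is None:
            # Report unknown atoms instead of guessing parameters for them.
            missing.append((res_name, atom_name))
        else:
            assigned.append((res_name, atom_name, params[0], params[1]))
    return assigned, missing


atoms = [("GLY", "N"), ("GLY", "CA"), ("GLY", "C"), ("GLY", "O"), ("LIG", "C1")]
assigned, missing = assign_parameters(atoms, EXAMPLE_FF)
net_charge = sum(entry[2] for entry in assigned)
print(len(assigned), "atoms assigned; missing:", missing)
```

Collecting unmatched atoms separately mirrors the situation described in the text, where non-standard residues and ligands need their own parameter source.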

Ligand parameter assignment
The calculation of ligand charges necessitates detailed information on molecular structure and protonation states due to the large variation in the covalent structures of small-molecule protein ligands. The current version of PDB2PQR therefore requires the ligand structure, protonation state and formal charge to be specified by the user in the popular MOL2 (24) format. Ligand structures in MOL2 format are readily available from popular molecular modeling software and free web services such as PRODRG (25). Future versions of PDB2PQR will include a pdb2mol2 parser and automatic assignment of default ligand protonation states from a small-molecule pKa database.

The calculation of ligand charges in PDB2PQR is based on the partial equalization of orbital electronegativities (PEOE) procedure developed by Gasteiger and Marsili (26). In the PEOE procedure, orbital electronegativities $\chi$ are linked to partial atomic charges $q$ by a polynomial expansion ($\chi = a + b\,q + c\,q^2 + d\,q^3$). The coefficients a, b, c and d were optimized by Gasteiger and Marsili using gas phase data on ionization potentials and electron affinities. We utilize a PEOE algorithm, which has been optimized by Czodrowski et al. to obtain better agreement between theoretical and experimental solvation energies for a set of small molecules including the polar amino acids (27). The resulting PEOE_PB charges have been tested for small-molecule complexes with trypsin, thrombin (28) and HIV protease (29), and have been found to give results that are in agreement with experimental values.
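The iterative character of a PEOE-style calculation can be sketched in a few lines of Python. This is a simplified illustration and not the PEOE_PB implementation: the coefficients are values commonly quoted for the original Gasteiger–Marsili scheme (with d set to zero) rather than the optimized PEOE_PB set, the cation electronegativity is taken directly from the polynomial at q = +1, and the damping factor and iteration count are assumptions.

```python
from typing import Dict, List, Tuple

# (a, b, c, d) per element; illustrative, not the PEOE_PB-optimized coefficients.
COEFFS: Dict[str, Tuple[float, float, float, float]] = {
    "H": (7.17, 6.24, -0.56, 0.0),
    "C": (7.98, 9.18, 1.88, 0.0),
}


def electronegativity(element: str, q: float) -> float:
    """Evaluate chi = a + b*q + c*q^2 + d*q^3 for one atom."""
    a, b, c, d = COEFFS[element]
    return a + b * q + c * q ** 2 + d * q ** 3


def peoe_charges(elements: List[str], bonds: List[Tuple[int, int]],
                 iterations: int = 6, damping: float = 0.5) -> List[float]:
    """Iteratively shift charge along bonds toward the more electronegative atom."""
    q = [0.0] * len(elements)
    for k in range(1, iterations + 1):
        chi = [electronegativity(e, qi) for e, qi in zip(elements, q)]
        dq = [0.0] * len(elements)
        for i, j in bonds:
            # The less electronegative atom donates electron density.
            donor, acceptor = (i, j) if chi[i] < chi[j] else (j, i)
            chi_plus = electronegativity(elements[donor], 1.0)
            transfer = (chi[acceptor] - chi[donor]) / chi_plus * damping ** k
            dq[donor] += transfer      # donor becomes more positive
            dq[acceptor] -= transfer   # acceptor becomes more negative
        q = [qi + d for qi, d in zip(q, dq)]
    return q


# Methane: one carbon bonded to four hydrogens.
elements = ["C", "H", "H", "H", "H"]
bonds = [(0, 1), (0, 2), (0, 3), (0, 4)]
charges = peoe_charges(elements, bonds)
print([round(c, 4) for c in charges])
```

Because each bond moves equal and opposite amounts of charge, the total molecular charge is conserved, and the damping factor makes the per-iteration transfers shrink so that the charges settle after a few cycles.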

Post-processing
The current version of PDB2PQR supports an ‘extension’ directory for user-defined processing of PDB2PQR output. Such extensions might include alternative naming schemes, identification and parameterization of other molecule types, or additional hydrogen bond processing. The web servers listed at http://pdb2pqr.sf.net/ provide only the default PDB2PQR functionality. However, it is straightforward for users to download the PDB2PQR software and set up their own web servers with additional functionality based on custom extensions.


    CONCLUSIONS
 
We have described a number of new features for the free PDB2PQR web server, a service which helps users prepare molecular structures for further computational work by modeling missing atoms, assigning charges and titration states, and providing a mechanism for assignment of ligand parameters. Readers interested in these tasks might also be interested in other servers, which provide complementary services for biomolecular structure processing (30–32). Planned future developments for PDB2PQR include the construction of a pdb2mol2 parser to allow for the automatic parameterization of non-protein atoms, the correct treatment of protein post-translational modification, and the integration of a Poisson–Boltzmann continuum electrostatics-based pKa calculation algorithm into PDB2PQR. We anticipate that the PDB2PQR service will continue to be a helpful addition to the portfolio of tools available to the structural and computational biology communities.


    ACKNOWLEDGEMENTS
 
N.A.B. and T.J.D. were supported by NIH grant GM069702 and the National Biomedical Computation Resource (NIH P41 RR08605); J.H.J. and H.L. were supported by NSF grant MCB 0209941; J.H.J. gratefully acknowledges a Skou Fellowship from the Danish Natural Science Research Council; J.E.N. was supported by a Science Foundation Ireland PIYRA grant (04/YI1/M537); G.K. and P.C. were financially supported by the bilateral CERC3 program of CNRS and DFG (KL 1204/3). The authors would like to thank Andy McCammon for contributions to and support of early versions of the PDB2PQR effort. Funding to pay the Open Access publication charges for this article was provided by NIH grant GM069702.

Conflict of interest statement. None declared.


    REFERENCES
 

  1. Baker NA. Improving implicit solvent simulations: a Poisson-centric view. Curr. Opin. Struct. Biol. (2005) 15:137–143.

  2. Darden TA. Computational Biochemistry and Biophysics. Becker OM, MacKerell AD Jr, Roux B, Watanabe M, eds. (2001) New York: Marcel Dekker Inc. 91–114.

  3. Roux B. Computational Biochemistry and Biophysics. Becker OM, MacKerell AD Jr, Roux B, Watanabe M, eds. (2001) New York: Marcel Dekker Inc. 133–152.

  4. Davis ME, McCammon JA. Electrostatics in biomolecular structure and dynamics. Chem. Rev. (1990) 94:7684–7692.

  5. Honig B, Nicholls A. Classical electrostatics in biology and chemistry. Science (1995) 268:1144–1149.

  6. Warshel A, Sharma PK, Kato M, Parson WW. Modeling electrostatic effects in proteins. Biochim. Biophys. Acta Proteins Proteomics (2006) 1764:1647–1676.

  7. Roux B, Simonson T. Implicit solvent models. Biophys. Chem. (1999) 78:1–20.

  8. Berman HM, Westbrook J, Feng Z, Gilliland G, Bhat TN, Weissig H, Shindyalov IN, Bourne PE. The protein data bank. Nucleic Acids Res. (2000) 28:235–242.

  9. Dolinsky TJ, Nielsen JE, McCammon JA, Baker NA. PDB2PQR: an automated pipeline for the setup, execution, and analysis of Poisson–Boltzmann electrostatics calculations. Nucleic Acids Res. (2004) 32:W665–W667.

  10. Baker NA, Sept D, Joseph S, Holst MJ, McCammon JA. Electrostatics of nanosystems: Application to microtubules and the ribosome. Proc. Natl. Acad. Sci. USA (2001) 98:10037–10041.

  11. Bashford D. Scientific Computing in Object-Oriented Parallel Environments. Ishikawa Y, Oldehoeft RR, Reynders JVW, Tholburn M, eds. (1997) 1343. Berlin: Springer. 233–240.

  12. Morris GM, Goodsell DS, Halliday RS, Huey R, Hart WE, Belew RK, Olson AJ. Automated docking using a Lamarckian genetic algorithm and an empirical binding free energy function. J. Comput. Chem. (1998) 19:1639–1662.

  13. Case DA, Cheatham TE III, Darden T, Gohlke H, Luo R, Merz KM Jr, Onufriev A, Simmerling C, Wang B, et al. The Amber biomolecular simulation programs. J. Comput. Chem. (2005) 26:1668–1688.

  14. Humphrey W, Dalke A, Schulten K. VMD: visual molecular dynamics. J. Mol. Graph. (1996) 14:33–38.

  15. DeLano WL. The PyMOL Molecular Graphics System. (2002) Palo Alto, CA.

  16. Sanner MF. Python: a programming language for software integration and development. J. Mol. Graph. Mod. (1999) 17:57–61.

  17. Li H, Robertson AD, Jensen JH. Very fast empirical prediction and rationalization of protein pKa values. Proteins (2005) 61:704–721.

  18. Davies MN, Toseland CP, Moss DS, Flower DR. Benchmarking pKa prediction. BMC Biochemistry (2006) 7:18.

  19. Wang JM, Cieplak P, Kollman PA. How well does a restrained electrostatic potential (RESP) model perform in calculating conformational energies of organic and biological molecules? J. Comput. Chem. (2000) 21:1049–1074.

  20. MacKerell AD Jr, Bashford D, Bellot M, Dunbrack RL Jr, Evanseck JD, Field MJ, Fischer S, Gao J, Guo H, et al. All-atom empirical potential for molecular modeling and dynamics studies of proteins. J. Phys. Chem. B (1998) 102:3586–3616.

  21. Sitkoff D, Sharp KA, Honig B. Accurate calculation of hydration free energies using macroscopic solvent models. J. Phys. Chem. (1994) 98:1978–1988.

  22. Tan C, Yang L, Luo R. How well does Poisson–Boltzmann implicit solvent agree with explicit solvent? A quantitative analysis. J. Phys. Chem. B (2006) 110:18680–18687.

  23. Markley JL, Bax A, Arata Y, Hilbers CW, Kaptein R, Sykes BD, Wright PE, Wüthrich K. Recommendations for the presentation of NMR structures of proteins and nucleic acids. J. Mol. Biol. (1998) 280:933–952.

  24. SYBYL Molecular Modeling Software. (2006) 7.2 ed. St. Louis, MO: Tripos Inc. (http://www.tripos.com/mol2/mol2_format3.html).

  25. van Aalten DMF, Bywater R, Findlay JBC, Hendlich M, Hooft RWW, Vriend G. PRODRG, a program for generating molecular topologies and unique molecular descriptors from coordinates of small molecules. J. Comput. Aided Mol. Des. (1996) 10:255–262.

  26. Gasteiger J, Marsili M. Iterative partial equalization of orbital electronegativity: a rapid access to atomic charges. Tetrahedron (1980) 36:3219–3228.

  27. Czodrowski P, Dramburg I, Sotriffer CA, Klebe G. Development, validation, and application of adapted PEOE charges to estimate pKa values of functional groups in protein-ligand complexes. Proteins (2006) 65:424–437.

  28. Czodrowski P, Sotriffer CA, Klebe G. Protonation changes upon ligand binding to trypsin and thrombin: structural interpretation based on pKa calculations and ITC experiments. J. Mol. Biol. (2007) 367:1347–1356.

  29. Czodrowski P, Sotriffer CA, Klebe G. Atypical protonation states in the active site of HIV-1 protease: A computational study. J. Chem. Inform. Model. (in press).

  30. Gordon JC, Myers JB, Folta T, Shoja V, Heath LS, Onufriev A. H++: a server for estimating pKas and adding missing hydrogens to macromolecules. Nucleic Acids Res. (2005) 33:W368–W371.

  31. Li X, Jacobson MP, Zhu K, Zhao S, Friesner RA. Assignment of polar states for protein amino acid residues using an interaction cluster decomposition algorithm and its application to high resolution protein structure modeling. Proteins (2007) 66:824–837.

  32. Vriend G. WHAT IF: a molecular modeling and drug design program. J. Mol. Graph. (1990) 8:52–56, 29.




