Nucleic Acids Research, 2005, Vol. 33, Database issue D266-D268
© 2005, the authors
Nucleic Acids Research, Vol. 33, Database issue © Oxford University Press 2005; all rights reserved
PDBsum more: new summaries and analyses of the known 3D structures of proteins and nucleic acids
European Bioinformatics Institute, Wellcome Trust Genome Campus, Hinxton, Cambridge CB10 1SD, UK
* To whom correspondence should be addressed. Tel: +44 1223 492 542; Fax: +44 1223 494 468; Email: roman{at}ebi.ac.uk
Received August 26, 2004; Accepted August 29, 2004
| ABSTRACT |
|---|
PDBsum is a database of mainly pictorial summaries of the 3D structures of proteins and nucleic acids in the Protein Data Bank. Its pages aim to provide an at-a-glance view of the contents of every 3D structure, plus detailed structural analyses of each protein chain, DNA/RNA chain and any bound ligands and metals. In the past year, the database has been significantly improved, in terms of both appearance and new content. Moreover, it has moved to its new address at http://www.ebi.ac.uk/thornton-srv/databases/pdbsum.
| INTRODUCTION |
|---|
The PDBsum database was created at University College London in 1995 (1,2). Its aim was to provide an illustrated and informative summary for each of the 3D structures released by the Protein Data Bank (PDB) (3).
As of July 1, 2004, the database has been transferred to the European Bioinformatics Institute, having been given a complete facelift and many new analyses and links. Its new address is http://www.ebi.ac.uk/thornton-srv/databases/pdbsum. In this paper, we describe some of the improvements that have been made and the new features that have been added.
| NEW LAYOUT |
|---|
The most obvious change is to the appearance of the web pages. These have been modernized, simplified and structured in a more logical manner, and are now generated dynamically. Each structure's home page now provides one or more thumbnail images of the structure plus, below them, an index listing the molecules it contains, in terms of protein chains, DNA/RNA chains, small-molecule ligands, metal ions and the number of water molecules. Clicking on the items in the index takes you to the analyses provided for that molecule type (secondary structure diagrams for protein chains, protein–ligand interactions for the ligands, and so on). The index thus provides an at-a-glance summary of the molecules contained in the PDB entry.
Much redundant information has been removed. Thus, for example, where a structure contains multiple copies of the same protein chain, only a representative chain is described in detail; previously, every copy was described, rather unnecessarily. This is reflected in the index, which groups together or separates the protein chains accordingly. So you can immediately see that, say, the structure consists of four chains (A, B, C and D) which are all equivalent or, conversely, that the structure consists of two dissimilar protein chains, A and B, etc. Similarly, for ligands, multiple copies of the same ligand, making identical interactions with equivalent protein chains, are now shown only once.
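The grouping of equivalent chains can be illustrated with a minimal sketch. It assumes that "equivalent" means an identical amino acid sequence, which is a simplification made here for illustration; the criterion PDBsum actually uses is not stated above. The function name and the example sequences are hypothetical.

```python
from collections import defaultdict


def group_equivalent_chains(chains):
    """Group chain IDs that share an identical sequence.

    `chains` maps chain ID -> one-letter sequence. Returns a list of
    groups (lists of chain IDs) in first-seen order; the first ID in
    each group can serve as the representative chain.
    Assumption: identical sequence is used as the test for equivalence.
    """
    groups = defaultdict(list)
    for chain_id, sequence in chains.items():
        groups[sequence].append(chain_id)
    return list(groups.values())


# Made-up sequences, for illustration only.
tetramer = {"A": "MKTAYIAK", "B": "MKTAYIAK", "C": "MKTAYIAK", "D": "MKTAYIAK"}
heterodimer = {"A": "MKTAYIAK", "B": "GSHMLEDP"}

print(group_equivalent_chains(tetramer))     # [['A', 'B', 'C', 'D']]
print(group_equivalent_chains(heterodimer))  # [['A'], ['B']]
```

With such a grouping, an index needs to describe only the first chain of each group in detail and can list the remaining members as equivalent copies.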
In addition to the thumbnail image and index of contents, the home page of each entry also provides the usual descriptive information (such as title, authors, date of deposition), links to other sequence and structure databases, summary PROCHECK (4) analyses and a button for viewing the structure in RasMol (5).
A novel feature is a link to a server that allows you to automatically generate your own image of the structure via MolScript (6) and Raster3D (7). Another new feature, for most enzyme structures, is a diagram of the reaction catalysed by the enzyme. The diagram shows chemical drawings of the reactants, products and, where relevant, cofactors. The drawings are generated from mol2 files that were downloaded from the KEGG (8) ftp site. Of particular interest are structures where the bound ligand corresponds to, or is similar to, one of the molecules involved in the reaction. These are identified on the diagram with their percentage similarity to the molecule in question. Similarities are calculated by using a simple graph-match between the atom types and connectivities of the structure's ligands and the reaction molecules. Figure 1 shows an example.
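To give a feel for how a percentage similarity can be derived from atom types and connectivities, here is a rough sketch. It is not the graph-match used by PDBsum, whose details are not given above; instead it swaps in a much cruder measure, a multiset Tanimoto coefficient over atom types and bonded atom-type pairs. The molecule representation (a list of element symbols plus a list of bonded index pairs) and all function names are assumptions made for illustration.

```python
from collections import Counter


def features(atoms, bonds):
    """Count atom types and bonded atom-type pairs for one molecule.

    `atoms` is a list of element symbols; `bonds` is a list of
    (i, j) index pairs into `atoms`. Bond orders are ignored.
    """
    counts = Counter(("atom", element) for element in atoms)
    for i, j in bonds:
        pair = tuple(sorted((atoms[i], atoms[j])))
        counts[("bond",) + pair] += 1
    return counts


def percent_similarity(mol_a, mol_b):
    """Multiset Tanimoto similarity (0-100) between two molecules."""
    fa, fb = features(*mol_a), features(*mol_b)
    shared = sum((fa & fb).values())  # multiset intersection
    total = sum((fa | fb).values())   # multiset union
    return 100.0 * shared / total if total else 0.0


# Heavy atoms only; hydrogens omitted for brevity.
ethanol = (["C", "C", "O"], [(0, 1), (1, 2)])
glycol = (["O", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)])

print(percent_similarity(ethanol, ethanol))           # 100.0
print(round(percent_similarity(ethanol, glycol), 1))  # 71.4
```

Because this measure only counts features, it cannot distinguish molecules that have the same atoms and bond types arranged differently; a true graph-match, such as a maximum common subgraph search, would be needed for that.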
| PROTEIN PAGES |
|---|
Each representative protein chain in a given structure has its own page holding a wiring diagram of its secondary structure, plus domain organization as given in the CATH fold classification database (9). As before, a detailed analysis of the secondary structure motifs is provided, via PROMOTIF (10), and any valid PROSITE patterns (11) contained within the sequence are mapped to the 3D structure (12) via RasMol.
There are two new features on these pages. The first is the thumbnail image, which shows the chain in question in solid representation, any identical chains as semi-transparent and all other molecules in the structure as transparent. Clicking on the image brings up a picture of the chain itself. For large and complex structures, this can help locate the chain in the structure as a whole.
The second novel feature is the inclusion of residue conservation data, where available. It is well known that highly conserved residues are usually crucial to the function of the protein, and their location on the surface of the protein can pinpoint the functionally active region. The conservation of each residue is computed by the ConSurf (13) program, which uses multiple sequence alignments of the protein chain against homologues in the sequence databases. The residues are coloured according to their conservation score on the wiring diagram and a RasMol view of the protein's surface shows the most and least highly conserved regions on the surface (see Figure 2). An alternative view of the 3D structure, again using RasMol, is provided by using the ConSurf colouring scheme.
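Colouring residues by conservation amounts to mapping a score onto a colour scale. The sketch below assumes that each residue has already been assigned an integer conservation grade from 1 to 9, with higher grades meaning more conserved, and interpolates linearly between two end colours. The grade range, the end colours and the function name are illustrative assumptions and are not taken from ConSurf or PDBsum.

```python
def grade_to_rgb(grade, low=(0, 180, 180), high=(160, 0, 80), n_grades=9):
    """Return an (R, G, B) colour for a conservation grade in 1..n_grades.

    Grade 1 maps to `low`, grade n_grades maps to `high`, and the
    grades in between are linearly interpolated. The default colours
    are arbitrary placeholders, not an official palette.
    """
    if not 1 <= grade <= n_grades:
        raise ValueError(f"grade must be between 1 and {n_grades}")
    t = (grade - 1) / (n_grades - 1)
    return tuple(round(lo + t * (hi - lo)) for lo, hi in zip(low, high))


# Hypothetical per-residue grades along a short stretch of sequence.
grades = [1, 5, 9]
print([grade_to_rgb(g) for g in grades])
# [(0, 180, 180), (80, 90, 130), (160, 0, 80)]
```

The resulting colours could then be applied to both a 2D diagram and a 3D surface view, so that the two depictions stay consistent.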
| LIGAND AND METAL ION PAGES |
|---|
The ligand pages show the various ligand molecules and metal ions bound to the protein or DNA molecules in the structure. Where there are many instances of the same ligand or metal, only a representative example is given; identical molecules making identical interactions with equivalent protein chains are merely listed. Such rationalization is necessary as some structures these days have staggeringly large numbers of bound ligands; see, for example, PDB code 1qzv, which has no fewer than 334 alpha-chlorophyll A molecules, plus others, bound to a large complex of 32 protein chains corresponding to plant photosystem I.
| THE ENZYME STRUCTURES DATABASE (EC → PDB) |
|---|
The Enzyme Structures Database, http://www.ebi.ac.uk/thornton-srv/databases/enzymes, is a subset of PDBsum that provides a separate grouping of all the enzyme structures in the PDB, classified by their enzyme classification (EC) numbers (14). The database preserves the hierarchy of the EC numbering scheme, showing the number of PDB structures belonging to the class at each level. At the lowest level, the listed PDB codes link directly to their PDBsum pages. Where any of the listed structures contain ligands that resemble, or correspond to, any of the reaction molecules, this resemblance is given as a percentage similarity. This helps identify the structures, within a specific enzyme class, that may be the most informative in terms of where and how the cognate ligand(s) bind.
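Counting structures at each level of the EC hierarchy can be sketched as follows. The example assumes a mapping from full four-part EC numbers to lists of PDB codes is already available; the function name is hypothetical and the codes are made-up placeholders rather than real entries.

```python
def count_by_ec_level(ec_to_pdb):
    """Count distinct PDB codes under every level of the EC hierarchy.

    `ec_to_pdb` maps an EC number such as "1.1.1.1" to an iterable of
    PDB codes. Each code is counted once per prefix ("1", "1.1",
    "1.1.1", "1.1.1.1"), even if it is listed under several EC numbers
    that share that prefix.
    """
    members = {}
    for ec_number, pdb_codes in ec_to_pdb.items():
        parts = ec_number.split(".")
        for depth in range(1, len(parts) + 1):
            prefix = ".".join(parts[:depth])
            members.setdefault(prefix, set()).update(pdb_codes)
    return {prefix: len(codes) for prefix, codes in members.items()}


# Placeholder data, for illustration only.
mapping = {
    "1.1.1.1": ["ent1", "ent2"],
    "1.1.1.2": ["ent3"],
    "3.4.21.4": ["ent1"],
}
counts = count_by_ec_level(mapping)
print(counts["1.1.1"])    # 3
print(counts["1.1.1.1"])  # 2
print(counts["3"])        # 1
```

Using sets rather than running totals avoids double counting a structure that carries more than one EC number within the same class.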
The EC hierarchy, descriptions, reactions and reaction molecules are obtained from the ENZYME database (15). The molecule definitions are downloaded as mol2 files from the KEGG ftp site, as mentioned above.
| PDBsum HIGHLIGHTS |
|---|
A new feature, accessed from the PDBsum home page, is the Highlights page. This tabulates the most extreme entries in the database in terms of various attributes: oldest depositions, youngest, largest, smallest, longest chain, most ligands, highest resolution, lowest, and so on (see Figure 3). This helps locate some of the more unusual structures that have been solved to date! More highlights are planned as the PDB is full of the weird and the wonderful.
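A table of extremes of this kind can be produced by ranking the entries on one attribute at a time. The sketch below assumes each entry is a record with a `code` field and some numeric attributes; the field names, the function name and the data are all hypothetical.

```python
def extremes(entries, attribute, n=1):
    """Return the codes of the n lowest and n highest entries.

    `entries` is a list of dicts, each with a "code" key and a numeric
    value under `attribute`. The result is a (lowest, highest) pair of
    lists of codes, each ordered from most to least extreme.
    """
    ranked = sorted(entries, key=lambda entry: entry[attribute])
    lowest = [entry["code"] for entry in ranked[:n]]
    highest = [entry["code"] for entry in ranked[::-1][:n]]
    return lowest, highest


# Placeholder records, for illustration only.
entries = [
    {"code": "ent1", "resolution": 2.5, "n_ligands": 0},
    {"code": "ent2", "resolution": 0.9, "n_ligands": 12},
    {"code": "ent3", "resolution": 3.8, "n_ligands": 4},
]

# The smallest resolution value in angstroms is the highest resolution.
print(extremes(entries, "resolution"))      # (['ent2'], ['ent3'])
print(extremes(entries, "n_ligands", n=2))  # (['ent1', 'ent3'], ['ent2', 'ent3'])
```

Running the same function over each attribute of interest gives one row of the table per attribute.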
| ACKNOWLEDGEMENTS |
|---|
We would like to thank the MSD group for making their database available for the extraction of PDB to SWISS-PROT and EC number mappings. We also thank Dr Nir Ben-Tal for making the ConSurf residue conservation data available and Dr Fabian Glaser and Yossi Rosenberg for setting up an ftp mirror for regular retrieval of these data. The development of the new version of PDBsum has been funded by the Wellcome Trust.
| Notes |
|---|
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use permissions, please contact journals.permissions{at}oupjournals.org.
| REFERENCES |
|---|
1. Laskowski,R.A., Hutchinson,E.G., Michie,A.D., Wallace,A.C., Jones,M.L. and Thornton,J.M. (1997) PDBsum: a web-based database of summaries and analyses of all PDB structures. Trends Biochem. Sci., 22, 488–490.
2. Laskowski,R.A. (2001) PDBsum: summaries and analyses of PDB structures. Nucleic Acids Res., 29, 221–222.
3. Berman,H.M., Westbrook,J., Feng,Z., Gilliland,G., Bhat,T.N., Weissig,H., Shindyalov,I.N. and Bourne,P.E. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235–242.
4. Laskowski,R.A., MacArthur,M.W., Moss,D.S. and Thornton,J.M. (1993) PROCHECK: a program to check the stereochemical quality of protein structures. J. Appl. Crystallogr., 26, 283–291.
5. Sayle,R.A. and Milner-White,E.J. (1995) RASMOL: biomolecular graphics for all. Trends Biochem. Sci., 20, 374–376.
6. Kraulis,P.J. (1991) MOLSCRIPT: a program to produce both detailed and schematic plots of protein structures. J. Appl. Crystallogr., 24, 946–950.
7. Merritt,E.A. and Bacon,D.J. (1997) Raster3D: photorealistic molecular graphics. Methods Enzymol., 277, 505–524.
8. Kanehisa,M. and Goto,S. (2000) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res., 28, 27–30.
9. Pearl,F.M.G., Lee,D., Bray,J.E., Sillitoe,I., Todd,A.E., Harrison,A.P., Thornton,J.M. and Orengo,C.A. (2000) Assigning genomic sequences to CATH. Nucleic Acids Res., 28, 277–282.
10. Hutchinson,E.G. and Thornton,J.M. (1996) PROMOTIF: a program to identify and analyze structural motifs in proteins. Prot. Sci., 5, 212–220.
11. Hulo,N., Sigrist,C.J.A., Le Saux,V., Langendijk-Genevaux,P.S., Bordoli,L., Gattiker,A., De Castro,E., Bucher,P. and Bairoch,A. (2004) Recent improvements to the PROSITE database. Nucleic Acids Res., 32, D134–D137.
12. Kasuya,A. and Thornton,J.M. (1999) Three-dimensional structure analysis of PROSITE patterns. J. Mol. Biol., 286, 1673–1691.
13. Glaser,F., Pupko,T., Paz,I., Bell,R.E., Bechor-Shental,D., Martz,E. and Ben-Tal,N. (2003) ConSurf: identification of functional regions in proteins by surface-mapping of phylogenetic information. Bioinformatics, 19, 163–164.
14. Bielka,H., Dixon,H.B.F., Karlson,P., Liebecq,C., Sharon,N., van Lenten,E.J., Velick,S.F., Vliegenthart,J.F.G. and Webb,E.C. (1992) E.C. enzyme nomenclature 1992: recommendations of the Nomenclature Committee of the International Union of Biochemistry and Molecular Biology on the nomenclature and classification of enzymes. Nomenclature Committee of the International Union of Biochemistry. Academic Press, Inc., Ltd, London.
15. Bairoch,A. (2000) The ENZYME database in 2000. Nucleic Acids Res., 28, 304–305.