CASTp: computed atlas of surface topography of proteins with structural and topographical mapping of functionally annotated residues
Program in Bioinformatics, Department of Bioengineering, University of Illinois at Chicago Chicago, IL 60612, USA
*To whom correspondence should be addressed. Tel: +1 312 355 1789; Fax: +1 312 413 2 18; Email: jliang{at}uic.edu
Received February 9, 2006. Revised March 4, 2006. Accepted April 5, 2006.
| ABSTRACT |
|---|
Cavities on a protein's surface, as well as the specific positioning of amino acids within it, create the physicochemical properties needed for a protein to perform its function. CASTp (http://cast.engr.uic.edu) is an online tool that locates and measures pockets and voids on 3D protein structures. This new version of CASTp includes annotated functional information for specific residues on the protein structure. The annotations are derived from the Protein Data Bank (PDB), Swiss-Prot and Online Mendelian Inheritance in Man (OMIM); the latter contains information on the variant single nucleotide polymorphisms (SNPs) that are known to cause disease. These annotated residues are mapped to surface pockets, interior voids or other regions of the PDB structures. We use a semi-global pair-wise sequence alignment method to obtain the sequence mapping between entries in Swiss-Prot, OMIM and entries in PDB. The updated CASTp web server can be used to study surface features, functional regions and specific roles of key residues of proteins.
| INTRODUCTION |
|---|
Characterizing protein function is an increasingly important and challenging problem that has been approached at both the sequence and structure levels. The fact that only 4922 of the 35 000 Protein Data Bank (PDB) (1) structures contain any type of functional annotation illustrates the widening gap between our ability to resolve protein structures and our ability to locate functionally important residues and to obtain a comprehensive understanding of the structural basis of protein function. The 3D structure of a protein and its surface topography can provide important information for understanding protein function, provided that a broad knowledge base of the functionally important residues and their locations on the protein structures is available. This update of the CASTp web server incorporates functional information about a large set of annotated residues on PDB structures obtained from annotations in PDB, Swiss-Prot and Online Mendelian Inheritance in Man (OMIM).
This paper is organized as follows. We will first discuss our method for mapping annotated residues from Swiss-Prot and OMIM onto the PDB structure. We will then describe updates to the CASTp (2,3) web server for visualization of the annotated functional residues, with emphasis on mapping to surface pockets and interior voids. We will conclude with a description of additional updates to the CASTp web server.
| MATERIALS AND METHODS |
|---|
Swiss-Prot mapping method
The numbered positions of annotated residues in the Swiss-Prot sequence often do not correspond to the same numbered positions in the sequence of the PDB structure. Therefore, a mapping of positions between the Swiss-Prot sequence and the PDB sequence must be obtained. We use a variation of the Needleman and Wunsch algorithm to determine whether the sequence of a PDB structure matches the Swiss-Prot sequence containing the annotated residues.
Specifically, every Swiss-Prot sequence containing one or more annotated residues and a link to a PDB structure was aligned to the corresponding sequence of the PDB structure. The standard Swiss-Prot annotations used include post-translational modifications (MOD_RES), covalent binding of a lipid moiety (LIPID), glycosylation sites (CARBOHYD), post-translationally formed amino acid bonds (CROSSLNK), metal binding sites (METAL), chemical group binding sites (BINDING), calcium binding regions (CA_BIND), DNA binding regions (DNA_BIND), nucleotide phosphate binding regions (NP_BIND), zinc finger regions (ZN_FING), enzyme activity amino acids (ACT_SITE) and any interesting single amino acid site (SITE). To ensure that the mapping is accurate, only alignments of two sequences with a sequence identity greater than 95% were used. The annotated positions from Swiss-Prot are then transferred onto the PDB sequence, as long as the position is not aligned to a gap.
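The mapping step described above can be illustrated with a minimal Python sketch. It aligns a Swiss-Prot sequence to a PDB sequence with a semi-global variant of Needleman and Wunsch (end gaps are not penalized), applies the 95% identity cutoff, and transfers only those annotated positions that are aligned to a residue rather than a gap. The scoring values, the function names and the choice to compute identity over gap-free aligned columns are assumptions made for illustration; they are not taken from the CASTp implementation.

```python
def semiglobal_align(a, b, match=1, mismatch=-1, gap=-2):
    """Semi-global alignment of a and b (end gaps are free).

    Returns a list of (i, j) pairs of 0-based indices; None marks a gap.
    Scoring parameters are illustrative assumptions.
    """
    n, m = len(a), len(b)
    # First row and column stay at 0, so leading overhangs cost nothing.
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trailing overhangs are free: start from the best cell in the
    # last row or last column.
    best, bi, bj = score[n][m], n, m
    for i in range(n + 1):
        if score[i][m] > best:
            best, bi, bj = score[i][m], i, m
    for j in range(m + 1):
        if score[n][j] > best:
            best, bi, bj = score[n][j], n, j

    # Trace back until one sequence is exhausted.
    pairs = []
    i, j = bi, bj
    while i > 0 and j > 0:
        diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
        if score[i][j] == diag:
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif score[i][j] == score[i - 1][j] + gap:
            pairs.append((i - 1, None))
            i -= 1
        else:
            pairs.append((None, j - 1))
            j -= 1
    pairs.reverse()
    return pairs


def transfer_annotations(swiss_seq, pdb_seq, annotated, min_identity=0.95):
    """Map 1-based annotated Swiss-Prot positions to 1-based PDB sequence positions.

    Positions aligned to a gap (or lying in an overhang) are dropped.
    An empty dict is returned if identity is not above the cutoff.
    """
    pairs = semiglobal_align(swiss_seq, pdb_seq)
    aligned = [(i, j) for i, j in pairs if i is not None and j is not None]
    if not aligned:
        return {}
    identity = sum(swiss_seq[i] == pdb_seq[j] for i, j in aligned) / len(aligned)
    if identity <= min_identity:
        return {}
    lookup = {i + 1: j + 1 for i, j in aligned}
    return {pos: lookup[pos] for pos in annotated if pos in lookup}


# Toy sequences (hypothetical): the PDB chain covers residues 4-18 of the
# Swiss-Prot entry, so annotated position 20 cannot be transferred.
mapping = transfer_annotations("MKTAYIAKQRQISFVKSHFSRQ", "AYIAKQRQISFVKSH", [5, 10, 20])
print(mapping)
```

Note that the PDB positions returned here are indices into the PDB sequence, not the author-assigned residue numbers of the structure file; converting between the two would require an additional lookup.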
OMIM mapping method
Variant alleles that are SNPs and are known to be disease causing were selected from OMIM (4). OMIM entries that contain links to the Swiss-Prot database were mapped onto the Swiss-Prot (5) sequence by measuring the relative distances in residue position between the OMIM alleles and then identifying the corresponding pairs of SNPs in the Swiss-Prot entry. If the Swiss-Prot entry identified a corresponding PDB entry, the sequence was extracted and aligned to the PDB structure using a semi-global pair-wise sequence alignment method. We follow Stitziel et al. (6,7) for the mapping between OMIM and PDB entries.
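One plausible reading of the relative-distance idea is sketched below in Python; the actual procedure of Stitziel et al. (6,7) may differ. The sketch assumes that OMIM residue numbering can be shifted by a constant relative to the Swiss-Prot sequence, and it searches for every shift at which all wild-type residues of the OMIM variants agree with the sequence, which preserves the distances between variant positions. The function name `find_offsets` and the input format are hypothetical.

```python
def find_offsets(sequence, variants):
    """Find numbering offsets that place all variants on matching residues.

    sequence: Swiss-Prot sequence (one-letter codes).
    variants: list of (wild_type_residue, omim_position), 1-based positions.
    Returns every offset such that sequence position (omim_position + offset)
    holds the wild-type residue for all variants. Illustrative sketch only.
    """
    if not variants:
        return []
    first_residue, first_pos = variants[0]
    offsets = []
    for idx, residue in enumerate(sequence, start=1):
        if residue != first_residue:
            continue
        offset = idx - first_pos
        # Keeping one offset for all variants preserves their relative distances.
        consistent = all(
            1 <= pos + offset <= len(sequence)
            and sequence[pos + offset - 1] == wild_type
            for wild_type, pos in variants
        )
        if consistent:
            offsets.append(offset)
    return offsets


# Toy example (hypothetical data): variants numbered without the first
# two residues of the sequence.
offsets = find_offsets("MKTAYIAKQRQISFVKSHFSRQ", [("A", 2), ("R", 8)])
print(offsets)
```

If more than one offset is returned, the mapping is ambiguous and would need additional evidence before any variant is transferred.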
| RESULTS |
|---|
Mapping results
There are 113 928 annotated residues in 4922 structures labeled in PDB records. The transfer of 241 913 Swiss-Prot annotations added 226 177 unique annotations to 15 913 PDB structures. Of those structures, 13 094 did not previously have any annotation contained in the PDB records. Table 1 lists the types of Swiss-Prot annotations, the number of PDB structures in which each annotation is found, and the total number of annotated residues. Of the 15 661 BINDING residues, we were able to map 11 407 (81%) of them to a pocket or a void on the protein structure. We were also able to map 14 829 (74%) of the ACT_SITE sites of enzymes to an existing protein pocket. Additional computation can further raise these percentages (data not shown).
From the original set of 5467 nsSNPs in 1061 alleles, the mapping of OMIM disease mutations added 2128 annotated residues on 310 PDB structures. Of those 2128 variants, only 254 are mapped onto an annotation from either PDB or Swiss-Prot. This is reasonable, as it is possible that these mutations in some cases cause disease by disrupting the protein's structural stability rather than interrupting its functional interactions with other molecules. The database of all annotated residues from PDB, Swiss-Prot and OMIM can be downloaded from the CASTp web server.
Visualizing annotated residues in CASTp
In addition to file downloads, CASTp allows for interactive visualization of biologically important annotated residues by querying the CASTp server using a four-letter PDB protein name, or a Swiss-Prot or GenBank identifier. A new database of CASTp calculations of single chains of a multiple chain complex can also be queried by adding the chain identifier to the PDB protein name. Figure 1 shows the atoms of the charge relay system that resides in a functional pocket of serine protease/inhibitor (PDB 1a2c). The atoms of annotated residues that lie in the pocket are highlighted in red, in contrast to the green pocket atoms. A table of all the annotated residues is also displayed on the right-hand side of the browser window. This table reports the following information: the database from which the annotation was derived, the annotation key word from the database, the position of the annotation on the sequence of the PDB structure, the three-letter amino acid code of the annotated residue, the identifiers of the pocket or pockets in which the annotated residue is located and a brief description of the annotation. If the user chooses to have the results emailed, a text file will be sent that contains all the information listed in the above table.
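The fields of the annotation table listed above can be pictured as a simple record. The following Python sketch is only an illustration of that layout: the class name, field names, tab-separated output and the example values are assumptions, and the actual format of the CASTp text file is not specified here.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AnnotatedResidue:
    """One row of the annotation table (hypothetical representation)."""
    source_db: str                 # database the annotation was derived from
    keyword: str                   # annotation key word, e.g. ACT_SITE
    position: int                  # position on the sequence of the PDB structure
    residue: str                   # three-letter amino acid code
    pocket_ids: List[int] = field(default_factory=list)  # pockets containing the residue
    description: str = ""          # brief description of the annotation

    def to_row(self) -> str:
        """Format the record as one tab-separated line; '-' means no pocket."""
        pockets = ",".join(str(p) for p in self.pocket_ids) or "-"
        return "\t".join(
            [self.source_db, self.keyword, str(self.position),
             self.residue, pockets, self.description]
        )


# Example values are invented for illustration, not taken from CASTp output.
row = AnnotatedResidue("Swiss-Prot", "ACT_SITE", 57, "HIS", [12],
                       "Charge relay system").to_row()
print(row)
```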
Calculation requests
In addition to querying a database of single chain calculations, the Calculation Request page allows the user to run a calculation on any combination of chains from a multiple chain complex. If the protein contains HET groups, the user is also given the option to include any combination of the HET groups in the calculation.
Improved visualization
For visualizing annotated residues, the Jmol plug-in (http://www.jmol.org) has been added as a visualization option. Jmol runs on Windows, Mac OS X and Linux and requires only a Java-enabled browser. The result is added functionality and a friendlier user interface.
The user is now also presented with a corresponding sequence map, in which residues of a highlighted pocket are shown in the same color as in the structural visualization. In addition, the user has finer control over the display: the pocket colorings can be changed, and the PDB structure can be displayed as wireframe, cartoon, strands or ribbons. The user can also send customized RasMol scripts to the Chime visualization.
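As an illustration of what a customized script might look like, the Python sketch below assembles a short script from standard RasMol commands (`select`, `color`, and a display style such as `spacefill`) that colors the whole structure and then highlights a set of residues. The helper name, its parameters and the residue numbers in the example are hypothetical, and how the script is submitted to the CASTp page is not shown.

```python
def rasmol_highlight_script(residue_numbers, base_color="green",
                            highlight_color="red", style="spacefill"):
    """Build a RasMol-style script that highlights the given residues.

    Illustrative helper only; it is not part of the CASTp server.
    """
    allowed_styles = {"wireframe", "spacefill", "cartoon", "ribbons", "strands"}
    if style not in allowed_styles:
        raise ValueError(f"unsupported display style: {style}")
    if not residue_numbers:
        raise ValueError("at least one residue number is required")
    # RasMol accepts a comma-separated list of residue numbers in 'select'.
    selection = ",".join(str(n) for n in sorted(set(residue_numbers)))
    commands = [
        "select all",
        f"color {base_color}",
        f"select {selection}",
        f"color {highlight_color}",
        style,
    ]
    return "\n".join(commands)


# Residue numbers below are placeholders chosen for the example.
script = rasmol_highlight_script([195, 57, 102])
print(script)
```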
| DISCUSSION |
|---|
This paper describes major updates to the CASTp web server. Biologically important functional residues annotated from three sources are now mapped to PDB structures, and visualization is provided. We believe these updates significantly increase the information content of CASTp and enhance the knowledge base needed for studying the structural basis of protein function.
| AVAILABILITY |
|---|
The CASTp web server and the associated mapping database can be freely accessed on the World Wide Web at http://cast.engr.uic.edu.
| ACKNOWLEDGEMENTS |
|---|
Funding to pay the Open Access publication charges for this article was provided by grants from the National Science Foundation (CAREER DBI0133856), the National Institutes of Health (GM68958) and the Office of Naval Research (N00014-06-1-0100).
Conflict of interest statement. None declared.
| Footnotes |
|---|
Present addresses: Andrew Binkowski, Argonne National Laboratories, Argonne, IL 60439, USA
Yaron Turpaz, Affymetrix, Inc., Santa Clara, CA 95051, USA
| REFERENCES |
|---|
- Berman, H.M., Westbrook, J., Feng, Z., Gilliland, G., Bhat, T.N., Weissig, H., Shindyalov, I.N., Bourne, P.E. (2000) The Protein Data Bank. Nucleic Acids Res., 28, 235–242.
- Binkowski, T.A., Naghibzadeh, S., Liang, J. (2003) CASTp: computed atlas of surface topography of proteins. Nucleic Acids Res., 31, 3352–3355.
- Liang, J., Edelsbrunner, H., Woodward, C. (1998) Anatomy of protein pockets and cavities: measurement of binding site geometry and implications for ligand design. Protein Sci., 7, 1884–1897.
- McKusick, V.A. (1998) Mendelian Inheritance in Man. A Catalog of Human Genes and Genetic Disorders, 12th edn. Johns Hopkins University Press, Baltimore.
- Gasteiger, E., Gattiker, A., Hoogland, C., Ivanyi, I., Appel, R.D., Bairoch, A. (2003) ExPASy: the proteomics server for in-depth protein knowledge and analysis. Nucleic Acids Res., 31, 3784–3788.
- Stitziel, N., Tseng, Y.Y., Pervouchine, D., Goddeau, D., Kasif, S., Liang, J. (2003) Structural location of disease-associated single-nucleotide polymorphisms. J. Mol. Biol., 327, 1021–1030.
- Stitziel, N., Binkowski, T.A., Tseng, Y.Y., Kasif, S., Liang, J. (2004) topoSNP: a topographic database of non-synonymous single nucleotide polymorphisms with and without known disease association. Nucleic Acids Res., 32, D520–D522.