Nucleic Acids Research, 2003, Vol. 31, No. 1 448-451
© 2003 Oxford University Press

CADB: Conformation Angles DataBase of proteins

S. S. Sheik, P. Ananthalakshmi, G. Ramya Bhargavi and K. Sekar*

Bioinformatics Centre, Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore 560 012, India

*To whom correspondence should be addressed. Tel: +91 803601409; Fax: +91 803600683; Email: sekar{at}physics.iisc.ernet.in

Received July 25, 2002; Revised August 27, 2002; Accepted September 9, 2002

ABSTRACT

Conformation Angles DataBase (CADB) provides an online resource to access data on conformation angles (both main-chain and side-chain) of protein structures in two data sets corresponding to 25% and 90% sequence identity between any two proteins, available in the Protein Data Bank. In addition, the database contains the necessary crystallographic parameters. The package has several flexible options and display facilities to visualize the main-chain and side-chain conformation angles for a particular amino acid residue. The package can also be used to study the interrelationship between the main-chain and side-chain conformation angles. A web-based Java graphics interface displays the information requested by the user on the client machine. The database is updated at regular intervals and can be accessed over the World Wide Web at the following URL: http://144.16.71.148/cadb/.

INTRODUCTION

Knowledge of conformation angles is important and is a prerequisite for molecular modelling. The structural and conformational properties of the polypeptide chain containing the 20 naturally occurring amino acids are of considerable interest in view of the occurrence of the residues at various regions of the protein molecule. Over the last few decades, in particular, major advances have been made in the study of both main-chain and side-chain conformation angles. The Ramachandran plot is the display of the (φ, ψ) angle pairs of a polypeptide chain in a given protein structure (1). Structural crystallographers and protein modellers use the Ramachandran plot at every stage of model building to check the stereochemical feasibility of the main-chain torsion angles (φ, ψ).

The concept of the Ramachandran plot has been used to unravel the role of individual amino acid residues in protein folding. Using the principles of the Ramachandran plot, considerable attention has been paid to elucidating the influence of the neighbouring amino acid residues (n-1) and (n+1) on the conformation of the middle residue n (2–4). Side-chain conformation angles adopted by various amino acid residues play a crucial role in determining the overall tertiary structure of the protein molecule. Owing to their importance, several authors (for example, 5) have derived the most probable side-chain torsion angles. These values are useful in modelling the corresponding atoms in the protein structure. Due to recent advances in data collection techniques and computing methods, the number of protein structures in the Protein Data Bank (6) is increasing rapidly. The Protein Data Bank is the single archive for the protein structures solved to date, and nearly 18 500 protein and nucleic acid structures are currently available in it. To the best of the knowledge of the authors, there is no web-based database available to study the conformation angles with user-settable parameters using a search engine. However, there are rotamer libraries available on the web to study the conformation angles (7–11). The present paper addresses these issues in detail with several useful options. We have therefore computed all possible conformation angles for the protein chains present in two different data sets, corresponding to less than 25% and less than 90% sequence identity, derived by Hobohm and Sander, EMBL, Heidelberg, Germany (12). Also, a web-based Graphical User Interface (GUI) has been implemented to search the conformation angles database (CADB) more efficiently and to trawl through the values provided in the database to get the required information in the form of convenient plots. In addition, the downstream options provided with the plot show the analysis in a more comprehensive and easily understood format.

MATERIALS AND METHODS

The current version of CADB contains 6146 protein chains from 5402 protein structures for the 90% non-redundant data set. In the case of the 25% non-homologous data set, there are 1739 protein chains from 1654 protein structures. All the relevant information pertaining to the protein chains and residues (for example, PDB-id code, resolution, R-factor and isotropic temperature factor) is also provided in the database.

Initially, we planned to house the entire data in a single ASCII (readable text) file. However, due to the continued growth in the number of protein structures in the Protein Data Bank (6), we decided to store the data in MySQL, a relational database management system (RDBMS). This allows more complex queries and makes maintenance more efficient. The database CADB contains protein structures solved using X-ray crystallography and NMR spectroscopy. The package is written in Perl, and Java applets are used for display. The front-end data input part of the package is written in HTML and JavaScript, which allows user-friendly web forms. The database is very easy to use and can be accessed from Windows 95/98/2000, Windows NT server, Linux and Silicon Graphics (SGI) platforms with the Netscape browser. The software and the database are hosted on our Bioinformatics Linux cluster server (1.7 GHz Pentium IV processor, 512 MB of Random Access Memory, Red Hat Linux 7.2).
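
As an illustration of the kind of query a relational store makes possible, the following minimal sketch selects (φ, ψ) pairs for one residue type subject to quality cut-offs. The paper does not describe the CADB schema, so the table names, column names and the function select_phi_psi are hypothetical; Python's built-in sqlite3 module stands in for MySQL, and the rows are invented values used only to exercise the query.

```python
import sqlite3

# Hypothetical two-table layout (NOT the actual CADB schema):
# one row per protein chain, one row per residue.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE chain (
    chain_id    INTEGER PRIMARY KEY,
    pdb_id      TEXT,
    chain_label TEXT,
    method      TEXT,     -- 'X-ray' or 'NMR'
    resolution  REAL,     -- in Angstrom
    r_factor    REAL,     -- in percent
    dataset     INTEGER   -- 25 or 90 (sequence identity set)
);
CREATE TABLE residue (
    chain_id INTEGER REFERENCES chain(chain_id),
    res_name TEXT,
    res_num  INTEGER,
    phi      REAL,
    psi      REAL,
    b_factor REAL          -- isotropic temperature factor
);
""")

# Invented demonstration rows, not real PDB entries.
conn.executemany("INSERT INTO chain VALUES (?, ?, ?, ?, ?, ?, ?)", [
    (1, "demo1", "A", "X-ray", 1.4, 18.0, 25),
    (2, "demo2", "B", "X-ray", 2.5, 24.0, 25),
])
conn.executemany("INSERT INTO residue VALUES (?, ?, ?, ?, ?, ?)", [
    (1, "ASP", 10, -65.0, -40.0, 12.0),
    (1, "ASP", 42, -120.0, 130.0, 30.0),
    (2, "ASP", 7, -70.0, 150.0, 10.0),
])


def select_phi_psi(conn, res_name, dataset, max_resolution,
                   max_r_factor, max_b_factor):
    """Return (pdb_id, chain, res_num, phi, psi) rows passing all cut-offs."""
    sql = """
        SELECT c.pdb_id, c.chain_label, r.res_num, r.phi, r.psi
        FROM residue AS r
        JOIN chain AS c ON c.chain_id = r.chain_id
        WHERE r.res_name = ?
          AND c.dataset = ?
          AND c.method = 'X-ray'
          AND c.resolution <= ?
          AND c.r_factor <= ?
          AND r.b_factor <= ?
        ORDER BY c.pdb_id, r.res_num
    """
    params = (res_name, dataset, max_resolution, max_r_factor, max_b_factor)
    return conn.execute(sql, params).fetchall()


rows = select_phi_psi(conn, "ASP", 25, 2.0, 20.0, 15.0)
print(rows)
```

Keeping the chain-level quality indicators (resolution, R-factor) separate from the residue-level values (angles, temperature factor) lets a single join apply every cut-off at once, which would be awkward with a flat text file.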

ORGANIZATION AND UTILITIES OF CADB

The following three options are available in the CADB.

(i) Main-chain conformation angles.
(ii) Side-chain conformation angles.
(iii) Interrelationship angles plot (any angle versus any angle).

The users need to select the experiment type (X-ray diffraction or NMR) for all of the above options. In addition, the users need to select the data set (25% or 90% non-redundant protein chains). For each option, the users have full control over several quality-check criteria (resolution cut-off, R-factor cut-off and temperature factor cut-off) to study the behaviour of the residues under various conditions. In the case of structures solved using NMR, the users can base the calculation on the first model or on the average structure from the ensemble of models. As implemented in the program PROCHECK (13), the boundaries of the fully, additionally and generously allowed regions are also marked in the plot. Once the plot is available on the client machine, the users can get the Ramachandran angles of a particular amino acid residue by clicking a residue in the list box provided at the right side of the plot; the corresponding point is then highlighted. Similarly, the users can get the complete information about an amino acid residue by clicking a point marked in the plot (Fig. 1). The values plotted in the Ramachandran plot are also available by clicking the ‘Report’ button. The analysis button provides a detailed output such as the number of residues present in the various secondary structural regions marked in the plot. These options are very useful in finding the occurrence of individual residues at various regions of the plot. The definition of the torsion angles conforms to the IUPAC-IUB convention (14) and the angles are calculated in the range -180° to +180°. The print option allows the users to print the plot. The Newman projection display technique is used to show the distribution of side-chain conformation angles (Fig. 2). In trial runs, the plot appeared in about 30–40 s, depending upon the user's input criteria and the network traffic.
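
The paper does not show how CADB computes its angles, so the following is only a minimal sketch of the standard signed torsion-angle calculation from four atomic positions, giving values in the range -180° to +180° with the sign convention described above. The function name dihedral and its helpers are illustrative, not part of CADB.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p1, p2, p3, p4):
    """Signed torsion angle (degrees) about the p2-p3 bond.

    0 is the eclipsed (cis) arrangement, +/-180 is trans; the angle is
    positive when, looking from p2 towards p3, the front bond p2-p1 must
    be rotated clockwise to eclipse the rear bond p3-p4.
    """
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    b3 = _sub(p4, p3)
    n1 = _cross(b1, b2)   # normal to the plane p1, p2, p3
    n2 = _cross(b2, b3)   # normal to the plane p2, p3, p4
    x = _dot(n1, n2)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, x))


# Main-chain angles for residue i use these atom quadruples:
#   phi: C(i-1), N(i), CA(i), C(i)
#   psi: N(i), CA(i), C(i), N(i+1)
# The coordinates below are a synthetic test geometry, not protein data.
angle = dihedral((1.0, 0.0, 0.0), (0.0, 0.0, 0.0),
                 (0.0, 0.0, 1.0), (0.0, 1.0, 1.0))
print(round(angle, 1))
```

Using atan2 on the two projected components, rather than acos of a single dot product, recovers the sign of the angle directly and stays numerically stable near 0° and ±180°.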
CADB is an evolving resource and can be used as a research tool for the analysis of conformation angles. It is a freely accessible, information-rich resource that is updated at regular intervals. New features will be added based on requests from the scientific community. Users of CADB are asked to cite this article in their publications.

Figure 1. A sample output showing the (φ, ψ) plot for the residue aspartate in all the 25% non-redundant protein chains. The highlighted (encircled) point in the plot corresponds to the residue Asp A 76 in the PDB-id code 1k0M. The cut-off criteria used to generate the Ramachandran plot are also shown in the plot.

Figure 2. The output frame depicts all possible side-chain conformation angles for the residue arginine in the 90% non-redundant protein chains. The cut-off criteria used here are as follows: resolution cut-off, 1.5 Å; R-factor cut-off, 20.0%; temperature factor cut-off, 15 Å².

FUTURE PERSPECTIVES

The database CADB is scheduled to be updated whenever new 25% and 90% non-homologous protein structure lists become available from the Hobohm and Sander FTP site, EMBL, Heidelberg, Germany. We also plan to provide additional options and make a number of improvements to the database (for example, to distinguish cysteines based on their oxidation state). In addition, we will upgrade our existing hardware to improve speed.

ACKNOWLEDGEMENTS

CADB is developed and maintained at the Indian Institute of Science, Bangalore 560 012, India, with support from the Department of Biotechnology, Government of India. The authors gratefully acknowledge the use of the Bioinformatics Centre (DIC), the Interactive Graphics Based Molecular Modelling (IGBMM) facility and the Supercomputer Education and Research Centre (SERC). The DIC and IGBMM facilities are supported by the Department of Biotechnology (DBT), Government of India. We are grateful for individual project support from the DBT (K.S.). One of the authors (K.S.) thanks Dr N. Srinivasan for his critical comments on the manuscript.

REFERENCES

  1. Ramachandran,G.N., Ramakrishnan,C. and Sasisekharan,V. (1963) Conformation of polypeptides and proteins. J. Mol. Biol., 7, 95–99.

  2. Kabat,E.A. and Wu,T.T. (1972) Construction of a three-dimensional model of the polypeptide backbone of the variable region of kappa immunoglobulin light chain. Proc. Natl Acad. Sci. USA, 69, 960–964.

  3. Wu,T.T. and Kabat,E.A. (1971) An attempt to locate the non-helical and permissively helical sequences of proteins: application to the variable regions of immunoglobulin light and heavy chains. Proc. Natl Acad. Sci. USA, 68, 1501–1506.

  4. Wu,T.T. and Kabat,E.A. (1973) The influence of nearest neighboring amino acid residues on aspects of secondary structure of proteins. Attempts to locate alpha-helices and beta-sheets. J. Mol. Biol., 75, 13–31.

  5. Chakrabarti,P. and Pal,D. (2001) The interrelationships of side-chain and main-chain conformations in proteins. Prog. Biophy. Mol. Biol., 76, 1–102.

  6. Westbrook,J., Feng,Z., Jain,S., Bhat,T.N., Thanki,N., Gilliland,G., Ravichandran,V., Bluhm,W.F., Weissig,H., Greer,S., Bourne,P.E. and Berman,H.M. (2002) The Protein Data Bank: unifying the archive. Nucleic Acids Res., 28, 235–242.

  7. Lovell,S.C., Word,J.M., Richardson,J.S. and Richardson,D.C. (2000) The Penultimate Rotamer Library. Proteins Struct. Func. Gen., 40, 389–408.

  8. Tuffery,P., Etchebest,C. and Hazout,S. (1997) Prediction of protein side-chain conformations: a study on the influence of backbone accuracy on conformation stability in the rotamer space. Protein Eng., 10, 361–372.

  9. Bower,M.J., Cohen,F.E. and Dunbrack,R.L., Jr (1997) Prediction of protein side-chain conformations from a backbone-dependent rotamer library: a new homology modelling tool for proteins: application to side-chain prediction. J. Mol. Biol., 267, 1268–1282.

  10. Dunbrack,R.L., Jr and Cohen,F.E. (1997) Bayesian statistical analysis of protein side-chain rotamer preferences. Protein Sci., 6, 1661–1681.

  11. Ponder,J.W. and Richards,F.M. (1987) Tertiary templates for proteins. Use of packing criteria in the enumeration of allowed sequences for different structural classes. J. Mol. Biol., 193, 775–791.

  12. Hobohm,U. and Sander,C. (1994) Enlarged representative set of protein structures. Protein Sci., 3, 522–524.

  13. Laskowski,R.A., MacArthur,M.W. and Thornton,J.M. (2001) PROCHECK: validation of protein structure coordinates. In Rossmann,M.G. and Arnold,E. (eds), International Tables of Crystallography, Vol. F. Crystallography of Biological Macromolecules. Kluwer Academic Publishers, The Netherlands, pp. 722–725.

  14. IUPAC-IUB Commission on Biochemical Nomenclature (1970) Abbreviations and symbols for the description of the conformation of polypeptide chains. J. Mol. Biol., 52, 1–17.

