Nucleic Acids Research, 2003, Vol. 31, No. 13 3404-3405
© 2003 Oxford University Press
SSEP: secondary structural elements of proteins
Bioinformatics Centre, Supercomputer Education and Research Centre, Indian Institute of Science, Bangalore 560 012, India
*To whom correspondence should be addressed. Tel: +91 803601409; Fax: +91 803600683; Email: sekar{at}physics.iisc.ernet.in
Received January 10, 2003; Revised and Accepted March 4, 2003
| ABSTRACT |
|---|
SSEP is a comprehensive resource for accessing information on the secondary structural elements present in the 25 and 90% non-redundant protein chains. The database contains 1771 protein chains from 1670 protein structures in the 25% non-redundant set and 6182 protein chains from 5425 protein structures in the 90% non-redundant set. The current version provides information about α-helical segments and β-strand fragments of varying lengths. It also contains information about 3₁₀-helices, β- and γ-turns, and hairpin loops. The free graphics program RasMol has been interfaced with the search engine to visualize the three-dimensional structure of the secondary structural fragment queried by the user. The database is updated regularly and is available through the Bioinformatics web server at http://cluster.physics.iisc.ernet.in/ssep/ or http://144.16.71.148/ssep/.
| INTRODUCTION |
|---|
The structural and conformational properties of secondary structural elements (SSEs) are of considerable interest because these fragments occur in various regions of the protein molecule. Analysis of sequence motifs is greatly enhanced by determining the sequence relationships between individual proteins. Searching for identical or similar sequences is the most commonly used method for assigning a putative function to a newly solved protein structure or gene. At present, the public domain Protein Data Bank (PDB) (1) holds nearly 20 200 protein and nucleic acid structures. Some structural databases, for example BIPED (2) and DSMP (3), contain information on some protein structural motifs. To the best of the authors' knowledge, no search engine is available to visualize the SSEs of interest to the user together with their neighboring environment, or to query the occurrence of similar motifs in the gene sequences available in the genome database. SSEP has been developed to address these issues.
| MATERIALS AND METHODS |
|---|
This resource integrates and reports information about the secondary structures present in the 25 and 90% non-identical protein chains (4). We built the database in two steps. First, we identified all the SSEs using the standard program PROMOTIF (5). Second, we used PERL scripts to collate and import the required information from the PROMOTIF output. To minimize redundant work for the research community, an efficient search engine has been developed to fetch the fragments of interest using flexible options. The present SSEP interface is built using CGI/PERL scripts. SSEP is easy to use and runs on Windows 95/98/2000, Windows NT, Linux and Silicon Graphics (SGI) platforms through the popular web browsers Netscape (version 4.7) and Internet Explorer. The user needs to interface the graphics program RasMol (6) with the web browser once, before first use (see http://144.16.71.148/ssep/downloadrasmol.htm for instructions).
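The collation step can be pictured with a minimal Python sketch. It is an illustration only: the original work used PERL scripts, and the `SSERecord` layout, the `build_index` helper and the example records are all hypothetical. They do not reproduce the PROMOTIF output format or the actual SSEP schema. The sketch also assumes contiguous residue numbering within each element.

```python
from collections import defaultdict
from typing import NamedTuple


class SSERecord(NamedTuple):
    """One secondary structural element (hypothetical layout, not PROMOTIF's)."""
    pdb_id: str    # PDB identifier
    chain: str     # chain identifier
    sse_type: str  # e.g. "alpha-helix", "beta-strand"
    start: int     # first residue number
    end: int       # last residue number
    sequence: str  # one-letter amino acid sequence of the element


def build_index(records):
    """Collate records into {sse_type: [records sorted by PDB id, chain, start]}."""
    index = defaultdict(list)
    for rec in records:
        # Consistency check; assumes contiguous residue numbering.
        if len(rec.sequence) != rec.end - rec.start + 1:
            raise ValueError(f"length mismatch in {rec}")
        index[rec.sse_type].append(rec)
    for recs in index.values():
        recs.sort(key=lambda r: (r.pdb_id, r.chain, r.start))
    return dict(index)


# Made-up example records; identifiers and sequences are illustrative only.
records = [
    SSERecord("0xx2", "A", "alpha-helix", 30, 36, "LKELAEK"),
    SSERecord("0xx1", "A", "alpha-helix", 5, 12, "AEELLKKA"),
    SSERecord("0xx1", "A", "beta-strand", 20, 24, "VTVKV"),
]
index = build_index(records)
for sse_type, recs in sorted(index.items()):
    print(sse_type, [(r.pdb_id, r.chain, r.start, r.end) for r in recs])
```

Grouping by element type mirrors the per-element search options the interface exposes; a real import would also carry the experimental metadata needed for filtering.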
Currently SSEP contains the SSEs from 1771 and 6182 protein chains for the 25 and 90% non-identical protein chains, respectively. The current version of the database contains 8275 α-helical segments and 11406 β-strand fragments in the 25% non-identical protein chains, and 33142 α-helical segments and 49275 β-strand fragments in the 90% non-identical protein chains. For structures solved using NMR spectroscopy, only the first model was used to identify the SSEs, as implemented in PROMOTIF (5).
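As a quick check on the scale of these figures, the short Python sketch below derives the average number of helical segments and strand fragments per chain from the counts quoted above. The averages are simple arithmetic on the reported totals, not values stated by the authors, and the dictionary layout is an assumption made for illustration.

```python
# Counts as reported in the text for the two non-identical chain sets.
counts = {
    "25%": {"chains": 1771, "helices": 8275, "strands": 11406},
    "90%": {"chains": 6182, "helices": 33142, "strands": 49275},
}

# Average number of elements per chain, derived from the totals above.
averages = {
    label: {
        "helices_per_chain": c["helices"] / c["chains"],
        "strands_per_chain": c["strands"] / c["chains"],
    }
    for label, c in counts.items()
}

for label, avg in averages.items():
    print(f"{label} set: {avg['helices_per_chain']:.2f} helices and "
          f"{avg['strands_per_chain']:.2f} strands per chain")
```

In both sets the database holds, on average, roughly five helical segments and six to eight strand fragments per chain.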
| DATA PRESENTATION AND AVAILABILITY |
|---|
SSEP offers the following options: (a) search for α-helix; (b) search for 3₁₀-helix; (c) search for β-strand; (d) search for β-turns; (e) search for γ-turns; (f) search for hairpin loops; (g) sequence pattern matching; and (h) advanced search facility.
Users can search using a variety of parameters, including the type of experiment and the resolution. Furthermore, the search engine allows users to search for occurrences of the SSE of interest in all the gene sequences available in the genome database. The user can visualize the three-dimensional structure of the motif using the graphics package RasMol (6) to better understand the location and neighboring environment of the SSE of interest. In addition, users can save the atomic coordinates of the queried fragments on their local machine (the hard disk of the client machine). Figure 1 shows sample output from a typical search for the pentapeptide motif QFNGM in all the secondary structural elements available in the 25% non-identical protein structures. This simple search identified the sequence motif within an α-helix (Fig. 1). The RasMol graphics panel (Fig. 1) shows the location of the searched motif (ribbons) in the entire protein molecule (backbone trace).
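A sequence pattern search of this kind, combined with a resolution filter, might look like the following Python sketch. It is not the SSEP implementation (which is written in CGI/PERL): the `Fragment` record, the `find_motif` function, its parameters and the example data are all hypothetical, and the sketch assumes exact substring matching and contiguous residue numbering.

```python
from typing import NamedTuple, Optional


class Fragment(NamedTuple):
    """A stored SSE fragment (hypothetical layout for illustration)."""
    pdb_id: str
    chain: str
    sse_type: str                # e.g. "alpha-helix"
    start: int                   # residue number of the first residue
    sequence: str                # one-letter sequence of the fragment
    method: str                  # "X-ray" or "NMR"
    resolution: Optional[float]  # in angstroms; None when not applicable


def find_motif(fragments, motif, sse_type=None, max_resolution=None):
    """Return (pdb_id, chain, sse_type, residue_number) for each exact match."""
    hits = []
    for frag in fragments:
        if sse_type is not None and frag.sse_type != sse_type:
            continue
        if max_resolution is not None:
            # Entries without a resolution are skipped when a cutoff is set.
            if frag.resolution is None or frag.resolution > max_resolution:
                continue
        pos = frag.sequence.find(motif)
        while pos != -1:
            hits.append((frag.pdb_id, frag.chain, frag.sse_type, frag.start + pos))
            pos = frag.sequence.find(motif, pos + 1)
    return hits


# Made-up fragments; identifiers and sequences are illustrative only.
fragments = [
    Fragment("0xx1", "A", "alpha-helix", 10, "AAQFNGMLK", "X-ray", 1.8),
    Fragment("0xx2", "B", "beta-strand", 40, "VQFNGMV", "NMR", None),
    Fragment("0xx3", "A", "alpha-helix", 3, "LLKKAEEL", "X-ray", 2.5),
]

all_hits = find_motif(fragments, "QFNGM")
helix_hits = find_motif(fragments, "QFNGM", sse_type="alpha-helix")
sharp_hits = find_motif(fragments, "QFNGM", max_resolution=2.0)
print(all_hits)
```

Returning the residue number of each match, rather than only the entry, is what would let a viewer such as RasMol highlight the motif within the whole chain.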
| ACKNOWLEDGEMENTS |
|---|
The database and the search engine have been developed and are maintained at the Bioinformatics Centre, Indian Institute of Science, Bangalore 560 012, India. The authors gratefully acknowledge the use of the Interactive Graphics Based Molecular Modelling (IGBMM) and the Supercomputer Education and Research Centre (SERC). We are grateful for individual project support from the Department of Biotechnology (DBT), Government of India. Part of this work is supported by the Institute-wide Computational Genomics Program.
| REFERENCES |
|---|
1. Bernstein,F.C., Koetzle,T.F., Williams,G.J.B., Meyer,E.F.,Jr, Brice,M.D., Rogers,J.R., Kennard,O., Shimanouchi,T. and Tasumi,M.J. (1977) The Protein Data Bank: a computer based archival file for macromolecular structures. J. Mol. Biol., 112, 535-542.
2. Islam,S.A. and Sternberg,M.J.E. (1989) A relational database of protein structures designed for flexible enquiries about conformation. Protein Eng., 2, 431-442.
3. Guruprasad,K., Prasad,M.S. and Kumar,G.R. (2000) Database of structural motifs in proteins. Bioinformatics, 16, 372-375.
4. Hobohm,U. and Sander,C. (1994) Enlarged representative set of protein structures. Protein Sci., 3, 522-524.
5. Hutchinson,E.G. and Thornton,J.M. (1996) PROMOTIF - a program to identify and analyze secondary structural motifs in proteins. Protein Sci., 5, 212-220.
6. Sayle,R.A. and Milner-White,E.J. (1995) RASMOL: biomolecular graphics for all. Trends Biochem. Sci., 20, 374-382.