Nucleic Acids Research, 2003, Vol. 31, No. 13 3833-3835
© 2003 Oxford University Press

NORSp: predictions of long regions without regular secondary structure

Jinfeng Liu1,3,4 and Burkhard Rost*,1,2,3

1 CUBIC, Department of Biochemistry and Molecular Biophysics, Columbia University, 650 West 168th Street BB217, New York, NY 10032, USA 2 Columbia University Center for Computational Biology and Bioinformatics (C2B2), Russ Berrie Pavilion, 1150 St Nicholas Avenue, New York, NY 10032, USA 3 North East Structural Genomics Consortium (NESG), Department of Biochemistry and Molecular Biophysics, Columbia University, 650 West 168th Street BB217, New York, NY 10032, USA 4 Department of Pharmacology, Columbia University, 630 West 168th Street, New York, NY 10032, USA

*To whom correspondence should be addressed. Tel: +1 2123054018; Fax: +1 2123057932; Email: rost{at}columbia.edu

Received February 14, 2003; Revised and Accepted March 17, 2003


    ABSTRACT
 
Many structurally flexible regions play important roles in biological processes. It has been shown that extended loopy regions are very abundant in the protein universe and that they have been conserved through evolution. Here, we present NORSp, a publicly available predictor for disordered regions in proteins. Specifically, NORSp predicts long regions with NO Regular Secondary structure. When a user submits a protein sequence, NORSp analyses it for secondary structure and for the presence of transmembrane helices and coiled-coil regions. It then emails the user the presence and position of disordered regions. NORSp can be accessed from http://cubic.bioc.columbia.edu/services/NORSp/.


    INTRODUCTION
 
Irregular structures mediate function
The three-dimensional (3D) structure of a protein is assumed to largely determine its biological function. The first decades of rapid progress in the experimental determination of 3D structures by X-ray crystallography (1) focused on determining ‘rigid’ structures at high resolution. Recently, a new type of structure has emerged with very long regions that appear to adopt regular structure only upon binding to substrates or other proteins (2); they are referred to as floppy, natively disordered, natively unfolded or loopy (3–7). It seems that these irregular regions are important for function.

Predicting irregular structures
Structural irregularity can be studied from several aspects: one class of ‘natively disordered’ regions was defined as the regions invisible in electron density maps of X-ray diffraction, presumably since the flexibility keeps them from crystallising into well-ordered structures. These regions sometimes are associated with regions with ‘compositional bias’ or ‘low sequence complexity’ (8–10). Another class is characterised by proteins that appear unfolded by CD measurements (5). Previously, we investigated the problem of disordered proteins from a structure-oriented perspective and studied extended regions of very low regular secondary structure (helix or strand) content (NORS) (3). We showed that NORS regions are particularly abundant in eukaryotic proteomes, conserved during evolution, over-represented in regulatory function category and important in protein–protein interactions. These results were in agreement with studies that predicted ‘natively disordered regions’ through neural networks (11).

Here, we introduce a web-based interface that makes our method of predicting NORS regions publicly accessible. The method can be useful for biologists in several ways. For example, crystallographers can check whether their proteins contain NORS regions before deciding whether to proceed with experiments, since NORS proteins may be difficult to crystallise, as demonstrated by their low occurrence in the PDB (3). Biologists interested in protein structure–function relationships may also want to verify whether protein–protein interaction sites coincide with NORS regions.


    DESIGN AND IMPLEMENTATION
 
Definition of NORS
We defined NORS regions as segments of >70 consecutive residues with <12% of the residues in helix, strand or coiled-coil regions and with at least one segment of 10 adjacent residues exposed to solvent. We identify such NORS regions by merging predictions of secondary structure, transmembrane helices and coiled-coil regions. We pre-calculate this information, as well as NORS regions, for each protein in >60 completely sequenced genomes (Fig. 1), and have included the results in our PEP database (12) through a searchable SRS (13) interface (http://cubic.bioc.columbia.edu/db/PEP/). NORS information has also been used in our target selection process for the North East Structural Genomics Consortium (14) to exclude proteins likely to pose problems for crystallisation.
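
As an illustration only, the definition above can be written as a predicate over per-residue predictions. The sketch below is not the NORSp implementation: the function names, the boolean per-residue encoding and the choice to count transmembrane helices as structured residues are assumptions made for the example.

```python
def longest_true_run(flags):
    """Return the length of the longest run of consecutive True values."""
    best = current = 0
    for flag in flags:
        current = current + 1 if flag else 0
        best = max(best, current)
    return best


def is_nors_segment(structured, exposed, length_cutoff=70,
                    max_content=0.12, min_exposed_run=10):
    """Check one candidate segment against the stated NORS definition.

    structured[i] is True if residue i is predicted to be in a helix,
    strand, coiled-coil or (assumption) transmembrane helix.
    exposed[i] is True if residue i is predicted to be solvent-exposed.
    """
    if len(structured) != len(exposed):
        raise ValueError("structured and exposed must have the same length")
    # The definition asks for segments of more than 70 residues.
    if len(structured) <= length_cutoff:
        return False
    # Fraction of residues in regular structure must be below 12%.
    content = sum(1 for flag in structured if flag) / len(structured)
    if content >= max_content:
        return False
    # At least one stretch of 10 adjacent exposed residues is required.
    return longest_true_run(exposed) >= min_exposed_run
```

For example, an 80-residue segment with 8 structured residues (10%) and one stretch of 10 exposed residues satisfies the predicate, whereas the same segment with 10 structured residues (12.5%) does not.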



 
Figure 1. NORS proteins are much more abundant in eukaryotes than in prokaryotes and archaebacteria. The graph shows the average percentages of NORS proteins in the three kingdoms. Error bars indicate the maximum and minimum values.

 
Prediction by NORSp
Protein sequences submitted to our web site are subjected to the following steps. (a) A sequence profile is built through a database search with an automated, iterated PSI-BLAST (15). (b) Secondary structure and solvent accessibility are predicted by PROFphd (16), and membrane helices are predicted by PHDhtm (17) using the PSI-BLAST profiles. (c) Coiled-coil regions are predicted by COILS (18). (d) The secondary structure, membrane helix and coiled-coil information is then combined to calculate the structural content for each sequence window of a certain length, and NORS regions are identified when the structural content is below the given threshold; overlapping NORS regions are joined. Technically, to obtain most of these intermediate results, NORSp utilises the same engine that is behind the PredictProtein server (19) (http://cubic.bioc.columbia.edu/predictprotein/).
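
The combination step (d) can be sketched as a sliding-window scan over merged per-residue predictions. This is a minimal Python illustration under stated assumptions, not the NORSp code: the function name, the half-open (start, end) coordinates and the boolean inputs are hypothetical, and the exposure criterion is applied here to each joined region, which the text does not specify.

```python
def find_nors_regions(structured, exposed, window=70,
                      max_content=0.12, min_exposed_run=10):
    """Return joined low-structure regions as half-open (start, end) pairs.

    structured[i]: True if residue i is predicted helix, strand,
    membrane helix or coiled-coil (merged beforehand).
    exposed[i]: True if residue i is predicted solvent-exposed.
    """
    if len(structured) != len(exposed):
        raise ValueError("structured and exposed must have the same length")
    n = len(structured)

    # Prefix sums give the structured-residue count of any window in O(1).
    prefix = [0]
    for flag in structured:
        prefix.append(prefix[-1] + (1 if flag else 0))

    regions = []
    for start in range(0, n - window + 1):
        end = start + window
        content = (prefix[end] - prefix[start]) / window
        if content >= max_content:
            continue
        if regions and start < regions[-1][1]:
            # This window overlaps the previous region: join them.
            regions[-1][1] = end
        else:
            regions.append([start, end])

    def longest_exposed_run(begin, stop):
        best = current = 0
        for flag in exposed[begin:stop]:
            current = current + 1 if flag else 0
            best = max(best, current)
        return best

    # Assumption: require the exposed stretch once per joined region.
    return [(s, e) for s, e in regions
            if longest_exposed_run(s, e) >= min_exposed_run]
```

The three keyword arguments mirror the window size, maximum structural content and minimum exposed stretch that the definition fixes at 70, 12% and 10. Relaxing them in this sketch would report more candidate regions.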


    INPUT, OUTPUT AND ADVANCED OPTIONS
 
Input
The input to NORSp is a protein sequence; proteins shorter than 70 residues are returned unprocessed. Currently, valid input is a sequence in one-letter residue code or in FASTA format. The sequence can be entered into the sequence text box or uploaded from the user's local disk.
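
Users preparing sequences for submission may find it convenient to check them locally first. The following Python sketch accepts either a bare one-letter sequence or a single FASTA record and applies the 70-residue minimum stated above. It is not part of NORSp: the function names and the set of accepted residue letters (the 20 standard amino acids plus B, Z, X and U) are assumptions made for illustration.

```python
# Assumption: standard one-letter codes plus the ambiguity codes B, Z, X and U.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWYBZXU")


def read_sequence(text):
    """Parse a bare sequence or one FASTA record into (header, sequence).

    header is None when the input has no '>' description line.
    """
    lines = [line.strip() for line in text.strip().splitlines() if line.strip()]
    header = None
    if lines and lines[0].startswith(">"):
        header = lines[0][1:].strip()
        lines = lines[1:]
    # Drop internal whitespace and normalise to upper case.
    sequence = "".join("".join(line.split()) for line in lines).upper()
    invalid = sorted(set(sequence) - VALID_RESIDUES)
    if invalid:
        raise ValueError("unexpected characters in sequence: " + "".join(invalid))
    return header, sequence


def is_long_enough(sequence, min_length=70):
    """Sequences shorter than 70 residues are returned unprocessed."""
    return len(sequence) >= min_length
```

A caller would typically run `read_sequence` on the pasted or uploaded text and submit the sequence only if `is_long_enough` returns True.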

Output
Users have the option of receiving ‘succinct’ output, which only shows the position of the NORS region in the context of the submitted sequence, or ‘verbose’ output, which includes the intermediate data used by NORSp: secondary structure, solvent accessibility, transmembrane helices and coiled-coil prediction. By default, the results are in plain text (ASCII) format. However, HTML-formatted results, which can be displayed in any web browser, can also be requested. Because of concerns about file size and user mailbox overflow, the results are normally made available for download from our website, and only URLs are sent to users by email unless they request that the full results be sent directly.

Recommendation and advanced options
We chose the threshold used to define NORS regions so as to minimise the false positive rate, as determined by manually inspecting PDB proteins (3). This conservative choice implies that the vast majority of NORS regions that we detect are likely to constitute structurally irregular, floppy, loopy or natively disordered regions. However, we presumably miss many such regions in our predictions. Users who are aware of this may want to change the thresholds to see which regions are good candidates for irregular regions although not detected by our defaults. We provide three options for advanced users: the size of the sequence window for calculating secondary structure content (default=70), the maximum secondary structure content (default=12%) and the minimum length of consecutive exposed residues (default=10).


    ACKNOWLEDGEMENTS
 
We are grateful to Hepan Tan (Columbia) for his help in developing the tool. This work was supported by grants 1-P50-GM62413-01 and RO1-GM63029-01 from the National Institutes of Health (NIH). Last, but not least, thanks go to all those who deposit their experimental data in public databases and to those who maintain these databases.


    REFERENCES
 

  1. Hendrickson,W.A. (1991) Determination of macromolecular structures from anomalous diffraction of synchrotron radiation. Science, 254, 51–58.

  2. Wright,P.E. and Dyson,H.J. (1999) Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm. J. Mol. Biol., 293, 321–331.

  3. Liu,J., Tan,H. and Rost,B. (2002) Loopy proteins appear conserved in evolution. J. Mol. Biol., 322, 53–64.

  4. Dunker,A.K. and Obradovic,Z. (2001) The protein trinity-linking function and disorder. Nat. Biotechnol., 19, 805–806.

  5. Uversky,V.N., Gillespie,J.R. and Fink,A.L. (2000) Why are ‘natively unfolded’ proteins unstructured under physiologic conditions? Proteins, 41, 415–427.

  6. Zetina,C.R. (2001) A conserved helix-unfolding motif in the naturally unfolded proteins. Proteins, 44, 479–483.

  7. Namba,K. (2001) Roles of partly unfolded conformations in macromolecular self-assembly. Genes Cells, 6, 1–12.

  8. Dunker,A.K., Garner,E., Guilliot,S., Romero,P., Albrecht,K., Hart,J., Obradovic,Z., Kissinger,C. and Villafranca,J.E. (1998) Protein disorder and the evolution of molecular recognition: theory, predictions and observations. Pac. Symp. Biocomput., 473–484.

  9. Wootton,J.C. and Federhen,S. (1996) Analysis of compositionally biased regions in sequence databases. Methods Enzymol., 266, 554–571.

  10. Dunker,A.K., Lawson,J.D., Brown,C.J., Williams,R.M., Romero,P., Oh,J.S., Oldfield,C.J., Campen,A.M., Ratliff,C.M., Hipps,K.W. et al. (2001) Intrinsically disordered protein. J. Mol. Graph. Model., 19, 26–59.

  11. Romero,P., Obradovic,Z., Li,X., Garner,E.C., Brown,C.J. and Dunker,A.K. (2001) Sequence complexity of disordered protein. Proteins, 42, 38–48.

  12. Carter,P., Liu,J. and Rost,B. (2003) PEP: predictions for entire proteomes. Nucleic Acids Res., 31, 410–413.

  13. Etzold,T. and Argos,P. (1993) SRS–an indexing and retrieval tool for flat file data libraries. Comput. Appl. Biosci., 9, 49–57.

  14. Liu,J. and Rost,B. (2002) Target space for structural genomics revisited. Bioinformatics, 18, 922–933.

  15. Altschul,S.F., Madden,T.L., Schaffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.

  16. Rost,B. (2001) Review: protein secondary structure prediction continues to rise. J. Struct. Biol., 134, 204–218.

  17. Rost,B., Casadio,R. and Fariselli,P. (1996) Topology prediction for helical transmembrane proteins at 86% accuracy. Protein Sci., 5, 1704–1718.

  18. Lupas,A. (1996) Prediction and analysis of coiled-coil structures. Methods Enzymol., 266, 513–525.

  19. Rost,B. and Liu,J. (2003) The PredictProtein server. Nucleic Acids Res., 31, 3300–3304.




