Nucleic Acids Research, 2003, Vol. 31, No. 13 3701-3708
© 2003 Oxford University Press
GlobPlot: exploring protein sequences for globularity and disorder
European Molecular Biology Laboratory, Biocomputing Unit, D-69117 Heidelberg, Germany
*To whom correspondence should be addressed. Tel: +49 6221387451; Fax: +49 6221387517; Email: linding{at}embl.de
Received February 14, 2003; Revised and Accepted March 20, 2003
| ABSTRACT |
|---|
A major challenge in the proteomics and structural genomics era is to predict protein structure and function, including identification of those proteins that are partially or wholly unstructured. Non-globular sequence segments often contain short linear peptide motifs (e.g. SH3-binding sites) which are important for protein function. We present here a new tool for discovery of such unstructured, or disordered, regions within proteins. GlobPlot (http://globplot.embl.de) is a web service that allows the user to plot the tendency within the query protein for order/globularity and disorder. We show examples with known proteins where it successfully identifies inter-domain segments containing linear motifs, and also apparently ordered regions that do not contain any recognised domain. GlobPlot may be useful in domain hunting efforts. The plots indicate that instances of known domains may often contain additional N- or C-terminal segments that appear ordered. Thus GlobPlot may be of use in the design of constructs corresponding to globular proteins, as needed for many biochemical studies, particularly structural biology. GlobPlot has a pipeline interface, GlobPipe, for the advanced user to do whole proteome analysis. GlobPlot can also be used as a generic infrastructure package for graphical display of any possible propensity.
| INTRODUCTION |
|---|
In the post-genomic era, discovery of novel domains and functional sites in proteins is of growing importance. A key part of initiatives like structural genomics is to optimise target selection by identifying domains and thereby increase spanning of fold and structure space (1). In addition, it has recently been recognised that many functionally important protein segments lie outside of globular domains in regions that are intrinsically disordered (2). Computational tools to help discern domains from inter-domain regions are key to such efforts. We describe here a graphical tool, GlobPlot, and a pipeline companion, GlobPipe, that do just this: they measure and display the propensity of protein sequences to be ordered or disordered.
There are many methods, such as SMART (Simple Modular Architecture Research Tool) (3), PRODOM (4), Pfam (5,6), PROSITE (7) and ELM (Eukaryotic Linear Motif, http://elm.eu.org) (8), available for finding globular domains (e.g. SH3, TyrKc, active sites) and linear motifs (e.g. SH3 ligands, LXXLL nuclear receptor ligands, tyrosine phosphorylation sites, post-translational modification sites) within a protein sequence. These methods typically rely on sequence similarity models, looking for recurrence of known domains or motifs by such means as HMMs (Hidden Markov Models) (9), pattern discovery (http://www.cs.ucr.edu/~stelo/pattern.html) or Smith-Waterman (SW) profiles (10). Although these methods are of great value in annotating protein sequences, they are limited in their ability to uncover features that have not yet been described.
A complementary approach to domain or feature discovery is to predict protein structure, though such methods are computationally intensive, error prone and are usually designed to predict structure only within globular regions.
In order to predict possible targets for further structural analysis, we present here a method complementary to structure prediction. We describe a simple, easy to use, propensity/scale based tool for exploring both potential globular and disordered/flexible regions in proteins based on their sequence.
Protein disorder can be described as the lack of regular secondary structure and a high degree of flexibility in the polypeptide chain (2). Ordered regions are often termed globular and typically contain regular secondary structures packed into a compact globule. However no general definition of disorder exists.
Disordered regions can contain functional sites, predicted as linear motifs by ELM and they are of growing interest owing to the increasing number of reports of intrinsically unstructured/disordered proteins (IUPs). IUPs contain regions that are partially or completely unfolded/unstructured in the native in vivo state of the protein. More than 100 IUPs are known (2,11), including Tau (12), Prions (13), Bcl-2 (Fig. 1) and partially p53 (14). Although little is understood about the cellular and structural meaning of this state, it is thought that it may exist as a molten globule and become ordered only when bound to another molecule (15,16). It is clear, however, that IUPs play a central role in biology and in diseases mediated by protein misfolding and aggregation (17,18).
Prediction of disorder can currently be performed using SEG (19), which searches for regions of low sequence complexity. However, low complexity of the sequence does not imply disorder in all cases. It is also possible to use methods such as hydrophobicity plots, though this approach is better suited to identification of segments, such as transmembrane helices, rather than finding long segments of disorder.
PONDR (http://www.pondr.com) (20,21) is a neural network based tool for disorder prediction, but it is not freely accessible.
Prot-Scale (http://us.expasy.org/cgi-bin/protscale.pl) is a general resource for showing amino acid propensity scales, using a sliding window algorithm. Prot-Scale does not offer any dedicated disorder predictor.
We discuss here GlobPlot, a tool to identify regions of globularity and disorder within protein sequences. It is a simple approach based on a running sum of the propensity for amino acids to be in an ordered or disordered state. We show that, despite its simplicity, this method is able to identify such regions when compared to domain databases and sets of disordered proteins.
| INSIDE GlobPlot |
|---|
Propensity sets
At the heart of GlobPlot are propensities, P, for all amino acids to be in globular or non-globular states. The GlobPlot package currently contains seven different propensity sets, though others could easily be added. There is no standard definition of disorder and no large set of universally agreed disordered proteins. Moreover, different parts of proteins are probably ordered under different conditions. We have thus developed a tool that allows parameters from different definitions of disorder to be applied.
We designed parameters based on the hypothesis that the tendency for disorder can be expressed as P = RC − SS, where RC and SS are the propensities for a given amino acid to be in random coil and regular secondary structure, respectively. The starting point for the propensity scales was the set of parameters for secondary structure and random coil described by Chou and Fasman (22-24) and later introduced as propensities by Deleage and Roux (25). Initially, we defined a set based solely on these parameters (shown as Deleage/Roux in Fig. 2). However, we found that this scale performed poorly in finding disordered segments. Since the structure database is now much larger, we decided to recalculate propensities for amino acids to be either in regular secondary structures (α-helices or β-strands) defined by DSSP (26) or outside of them (random coil, loops, turns etc.). We defined a non-redundant set of proteins by taking one representative from each superfamily in the SCOP database [version 1.59; http://scop.mrc-lmb.cam.ac.uk/scop/ (27,28)]. The frequencies RC and SS for each amino acid were calculated from this dataset. The resulting propensities, named Russell/Linding, are given in Figure 2. Combining random coil and secondary structure in the Russell/Linding set enhanced the discrimination of the graphs and is the key factor in the success of this scale in detecting both disorder and globular packing. GlobPlot is not intended as a competitor for secondary structure prediction. It cannot give the same level of detail as one can obtain from a secondary structure prediction based on a multiple alignment.
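As a rough illustration of the P = RC − SS idea, the following Python sketch derives a per-residue score from DSSP-style state strings. The function name `disorder_propensities`, the toy chains and the choice to express RC and SS as simple per-residue fractions are assumptions made for illustration. The text does not spell out the normalisation behind the Russell/Linding set, so this code should not be expected to reproduce the published values.

```python
from collections import Counter

# DSSP codes counted as regular secondary structure (helix H, strand E).
# Every other code is treated as random coil; this grouping is an assumption.
REGULAR = {"H", "E"}

def disorder_propensities(chains):
    """chains: iterable of (sequence, dssp_states) string pairs of equal length."""
    total, coil = Counter(), Counter()
    for sequence, states in chains:
        for residue, state in zip(sequence, states):
            total[residue] += 1
            if state not in REGULAR:
                coil[residue] += 1
    propensities = {}
    for residue, count in total.items():
        rc = coil[residue] / count      # fraction of this residue type seen in coil
        ss = 1.0 - rc                   # fraction seen in helix or strand
        propensities[residue] = rc - ss # P = RC - SS
    return propensities

# Toy input, not real structural data.
toy_chains = [("GPAL", "--HH"), ("PGLA", "-EHH")]
toy_props = disorder_propensities(toy_chains)
print(toy_props)
```

Positive scores mark residue types seen mostly outside helices and strands in the input, negative scores those seen mostly inside them.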
We also calculated propensities based on coordinates described as missing from the Protein Data Bank (http://www.pdb.org) (29). We considered the same set of representatives, but here we ignored domain definitions (i.e. we took the whole chain or protein). We then looked for the REMARK 465 records in the associated PDB files (restricting these to the appropriate chain when required) for residues not seen either in the electron density or the NMR structure. We presumed this set to be disordered, and all residues with Cα entries to be ordered. The resulting propensity set, named REMARK465, is online but still under development. The performance of this scale in predicting disorder will be evaluated in future work. We provide a variety of different scales or propensities for the user to explore; the numerical values can be obtained from the link given in Table 1. In addition to the scales mentioned for finding disorder, we also have some classical hydropathy scales online.
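The REMARK 465 lookup described above can be sketched in a few lines of Python. This is only an illustration, not the code used for the REMARK465 set: the function name `missing_residues` is hypothetical, and the slicing assumes the usual fixed-column layout of these records (residue name in columns 16-18, chain identifier in column 20, residue number in columns 22-26).

```python
def missing_residues(pdb_lines, chain=None):
    """Collect (chain, residue_number, residue_name) from REMARK 465 records."""
    missing = []
    for line in pdb_lines:
        if not line.startswith("REMARK 465"):
            continue
        residue_name = line[15:18].strip()
        chain_id = line[19:20]
        try:
            residue_number = int(line[21:26])
        except ValueError:
            continue  # free-text header lines of the remark block
        if len(residue_name) != 3:
            continue  # keep three-letter amino acid names only
        if chain is not None and chain_id != chain:
            continue
        missing.append((chain_id, residue_number, residue_name))
    return missing

# Hand-written example records, not taken from a real PDB entry.
example = [
    "REMARK 465 MISSING RESIDUES",
    "REMARK 465   M RES C SSSEQI",
    "REMARK 465     MET A     1",
    "REMARK 465     GLY A    12",
    "REMARK 465     SER B   105",
]
chain_a_missing = missing_residues(example, chain="A")
print(chain_a_missing)
```

Residues returned this way would form the presumed disordered set, while residues with Cα coordinates in the same chain would form the ordered set.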
The algorithm
The basic algorithm behind GlobPlot is simple and very fast. For each amino acid $a_i$ in a protein sequence of length $L$, we have defined a propensity $P(a_i)$. The running sum function $\Omega$ is defined as follows:

$$\Omega(a_i) = \sum_{j=1}^{i} \ln(j+1)\,P(a_j), \qquad i = 1, \dots, L$$

We run a digital low-pass filter based on Savitzky-Golay (30) over $\Omega$ in order to smooth the curve and get the numerical estimation of the first order derivative. The filtering is performed by an external open source C module (sav_gol) from the TISEAN 2.1 (31) Nonlinear Time Series Analysis package (http://www.mpipks-dresden.mpg.de/~tisean/). The resulting smoothed function $\Omega_S$ is plotted using the DISLIN 8.0 package. DISLIN is distributed as platform specific binaries from http://www.linmpi.mpg.de/dislin/. The $\ln(j+1)$ weight was introduced in order to balance the plot more evenly between the N- and C-termini, by increasing the weight of the terms as a function of residue number. Putative globular and disordered segments are selected using a simple peak finder algorithm (referred to as PeakFinder). A peak is chosen when the first derivative shows positive (disorder) or negative (globular) values over a continuous stretch of at least the minimum length given by the user as the PeakFinder window length.
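The steps in this section can be sketched in plain Python, assuming the running sum is a cumulative sum of propensities weighted by ln(i+1). This is not the GlobPlot source: the function names and the toy scale are hypothetical, and the external sav_gol module is replaced by a least-squares slope over a symmetric window, which is the Savitzky-Golay first-derivative estimate for a low-order (linear or quadratic) polynomial.

```python
import math

def running_sum(sequence, propensity):
    """Cumulative sum of propensities, each weighted by ln(i + 1), i counted from 1."""
    total, curve = 0.0, []
    for i, residue in enumerate(sequence, start=1):
        total += math.log(i + 1) * propensity.get(residue, 0.0)
        curve.append(total)
    return curve

def first_derivative(curve, half_window=2):
    """Least-squares slope over a symmetric window of 2 * half_window + 1 points."""
    offsets = range(-half_window, half_window + 1)
    norm = sum(k * k for k in offsets)
    slopes = []
    for i in range(len(curve)):
        if i < half_window or i + half_window >= len(curve):
            slopes.append(0.0)  # window does not fit: terminal residues stay unassigned
        else:
            slopes.append(sum(k * curve[i + k] for k in offsets) / norm)
    return slopes

def find_segments(slopes, min_length, rising=True):
    """1-based inclusive ranges where the slope keeps one sign for >= min_length residues."""
    segments, start = [], None
    for i, slope in enumerate(slopes + [0.0]):  # trailing 0.0 closes an open run
        inside = slope > 0 if rising else slope < 0
        if inside and start is None:
            start = i
        elif not inside and start is not None:
            if i - start >= min_length:
                segments.append((start + 1, i))
            start = None
    return segments

toy_scale = {"P": 1.0, "L": -1.0}  # hypothetical values, not a published scale
slopes = first_derivative(running_sum("LLLLLLPPPPPP", toy_scale))
disordered = find_segments(slopes, min_length=3, rising=True)
globular = find_segments(slopes, min_length=3, rising=False)
print(disordered, globular)
```

Rising stretches of the curve are reported as putative disorder and falling stretches as putative globular segments, which mirrors the sign convention used by PeakFinder.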
We opted to use a running sum function for three reasons. Firstly, it results in plots that are easy to interpret, whether by human or algorithm. Secondly, it is a simple approach to a very complex problem. Thirdly, there is no dependency on frame length as is the case for sliding window methods such as SEG or Prot-Scale.
We expect that one could construct algorithms that avoid the unbalanced weighting of the residue numbers and work more directly on the propensities; we plan to incorporate this in later versions.
| TESTING GlobPlot |
|---|
Benchmarking of methods to predict disorder is hampered by both the lack of a standard definition of disorder and the lack of a quality dataset. Performance of a particular set of parameters will clearly depend on the dataset from which they are derived. The benchmarking of GlobPlot was done using the GlobPipe script collection and SQL (structured query language) based data mining (we use PostgreSQL as the relational database). We found that SQL data mining is a very efficient, fast and flexible way of performing data analysis. We benchmarked GlobPlot by a variety of approaches:
- test of disorder prediction of IUPs;
- comparison to PONDR disorder prediction;
- structural (SMART) context analysis of predicted disordered segments;
- benchmarking prediction of globular segments (GlobDoms) versus SMART;
- benchmarking prediction of disorder using structural B-factors.
GlobPlot on IUPs
The operational definition of IUPs is based on X-ray, NMR, CD (circular dichroism) and a variety of hydrodynamic volume measurements. Several IUPs are only unstructured under certain equilibrium conditions where the unfolded state is favored over the folded/structured state (32). Formation of coiled-coil dimers provides a well understood example of such an equilibrium. Induced folding upon binding to a target protein is also observed, as in the case of the proteins CREB and CBP (33,34). This indicates that IUP-assigned protein sets are error-prone and should be carefully considered on a case by case basis. We selected proteins from a recent review by Tompa [(11), Table 1]. We applied GlobPlot to the 20 proteins listed and predicted non-globular segments in all of these proteins. The plots of these proteins can be viewed in the online GlobPlot gallery (http://globplot.embl.de/gallery/). We found that the CBP protein, rather than the CREB protein, shows significantly disordered regions (Fig. 3). In the case of FlgM, which has recently been shown to be partially folded in vivo (32), the N-terminal part is correctly found by GlobPlot to be disordered. We suggest that Stathmin seems to be disordered because the interaction partner Tsg-101 is partially disordered. Finally, the GlobPlot of the bovine prion protein clearly shows the N-terminal flexible tail found in the NMR study of this protein (Fig. 4).
GlobPlot versus PONDR
A comparison of GlobPlot and the neural network predictor PONDR (http://www.pondr.com) (20,21) was severely hampered by the fact that the PONDR server only allows 30 predictions. Therefore we could only test qualitatively. In general, GlobPlot and PONDR give similar predictions for the disordered proteins that we tested (Table 2).
GlobDoms versus SMART
To determine to what extent GlobPipe can be used for isolation of putative globular domains (GlobDoms) we benchmarked against the SMART server (http://smart.embl.de). A data set of 10 497 human protein sequences was created. The following criteria were imposed on the candidate sequences:
- subset of SWISS-PROT human proteome that contains SMART hit(s) (Ivica Letunic, personal communication);
- key word filtered for fragments, putative, hypothetical and similar to;
- non-redundant data (based on EMBL's nrsp95).
The results of the structural context analysis can be seen in Figure 5. In the 10 497 proteins SMART predicted 47 340 domains, of which 25 672 were longer than 30 residues. Finally 47 989 putative globular domains (GlobDoms) were found by GlobPipe using a PeakFinder search window of length 30. GlobPlot predicts a substantial fraction of putative domains that are not known to/found by SMART/Pfam.
The PeakFinder module was modified to find downhill areas in the GlobPlot that were 30+ amino acids. Thirty is about the minimum size that SMART annotation will consider a globular domain (except for disulfide linked mini-domains).
Disordered segments in SMART context
We also made an attempt to look for the structural context of the disordered segments GlobPipe predicted in the above dataset. GlobPipe predicted 75 152 disordered segments using an eight residue long frame for the smoothing routine. We compared all of these segments with the domain architecture of the host sequence as predicted by SMART. We found that most flexible regions fall outside globular domains (Fig. 6). For short peptides (7-14 residues), 50% are nested within SMART domains; for longer segments far fewer are nested and more are overlapping. We interpret these data as an indication that GlobPlot does seem to discriminate structural features and context of the disordered segments. Another observation is that most segments are within this short range. From a functional point of view this makes sense, since most known functional flexible/disordered sites are of length 5-10. This gives a basis for deploying GlobPlot as a discovery/overview tool for the annotation of functional sites.
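The nested/overlapping/outside distinction used in this analysis can be expressed as a small interval test. The sketch below is an illustration under assumed definitions (a segment is "nested" if it lies entirely within one domain, "overlapping" if it shares at least one residue with a domain without being nested, and "outside" otherwise); the function name and coordinates are hypothetical, and the exact criteria used with GlobPipe are not stated in the text.

```python
def classify_segment(segment, domains):
    """Classify a (start, end) segment against a list of (start, end) domains.

    All coordinates are inclusive residue numbers.
    """
    seg_start, seg_end = segment
    overlapping = False
    for dom_start, dom_end in domains:
        if dom_start <= seg_start and seg_end <= dom_end:
            return "nested"
        if seg_start <= dom_end and dom_start <= seg_end:
            overlapping = True
    return "overlapping" if overlapping else "outside"

# Made-up domain architecture for one protein.
domains = [(10, 60), (100, 150)]
labels = [classify_segment(s, domains) for s in [(20, 30), (55, 70), (70, 90)]]
print(labels)
```

Counting these labels per segment length class would give the kind of breakdown summarised in Fig. 6.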
Benchmarking using B-factors
Structural B-factors (isotropic temperature factors) were chosen because they are unrelated to the data we used in creating the propensity sets and because they, to a certain degree, reflect disorder and flexibility in the polypeptide chain (35-37). However, B-factors vary greatly between structures and are often influenced by crystal packing and other structural artefacts. In an attempt to avoid these issues, only the B-factor for the Cα atom was considered. We defined a non-redundant set of proteins by taking one representative from each family in the SCOP database. The set was reduced to contain only X-ray structures of a resolution higher than 2.2 Å. The B-factor values were extracted from the PDB entries, and the average B-factor and standard deviation were calculated for each chain. In order to have a stringent set, only residues that had a B-factor 3.5 standard deviations above the average were marked as disordered. The resulting data were then compared to the predictions of GlobPlot using the Russell/Linding propensity set. The seven most N- and C-terminal residues were ignored due to the lower sensitivity introduced by the Savitzky-Golay filtering.
At a specificity of 88% we obtained a sensitivity of 28%. The low sensitivity is largely explained by the fact that 74% of the false negatives were due to high B-factor helix bundles and domains observed in enzymes (especially ligases, hydrolases and oxidoreductases) (38). We expect a putative sensitivity of 59% in this dataset.
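The B-factor benchmark can be illustrated with a short Python sketch. The function names and the toy numbers are hypothetical, the use of the population standard deviation is an assumption (the text does not say which estimator was used), and sensitivity and specificity follow their standard definitions, TP/(TP+FN) and TN/(TN+FP).

```python
from statistics import mean, pstdev

def flag_high_bfactor(bfactors, n_sd=3.5):
    """Mark residues whose B-factor exceeds the chain average by n_sd standard deviations."""
    average, spread = mean(bfactors), pstdev(bfactors)
    return [b > average + n_sd * spread for b in bfactors]

def sensitivity_specificity(predicted, observed):
    """Both inputs are per-residue booleans; each class must occur at least once in observed."""
    pairs = list(zip(predicted, observed))
    tp = sum(p and o for p, o in pairs)
    tn = sum(not p and not o for p, o in pairs)
    fp = sum(p and not o for p, o in pairs)
    fn = sum(not p and o for p, o in pairs)
    return tp / (tp + fn), tn / (tn + fp)

# Toy chain of 30 residues with one very mobile C-terminal residue.
bfactors = [10.0] * 29 + [100.0]
observed = flag_high_bfactor(bfactors)
predicted = [False] * 28 + [True, True]  # made-up prediction for the same chain
sensitivity, specificity = sensitivity_specificity(predicted, observed)
print(sensitivity, specificity)
```

With a threshold as strict as 3.5 standard deviations, very few residues per chain are marked as disordered, so the sensitivity figure rests on a small number of positives.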
A fundamental problem with benchmarking a disorder predictor is that currently no general definition of disorder is agreed on. Furthermore, we here describe disorder as two states (disorder/order), whereas one should expect it to be multistate. These issues make it very difficult to select unbiased datasets for benchmarking.
We will continue the hunt for better datasets and further benchmarking to be used in disorder studies in future work. We reiterate that GlobPlot successfully identifies the disorder in well characterised proteins like TAU, CBP, prions and PRP1_HUMAN.
| USING GlobPlot |
|---|
The GlobPlot software package consists of two parts, both implemented in the language Python 2.2 (http://www.python.org): GlobPlot and GlobPipe.
An Internet plotting server: GlobPlot
GlobPlot is a CGI (common gateway interface) based server accessible at http://globplot.embl.de, for exploring disorder and globular segments (GlobDoms).
The web interface is fairly straightforward to use: the user can paste a sequence or enter the SWISS-PROT/SWALL accession (e.g. P08630) or entry code (e.g. BTLK_DROME). The GlobPlot server fetches the sequence and description of the polypeptide from an ExPASy server using Biopython.org software.
By default the server will send the sequence to the public SMART queue (http://smart.embl.de, which by default also predicts Pfam domains) and display any obtained domain predictions as colored boxes layered on the graph. The SMART/Pfam prediction substantially increases the plotting time, but is switched on by default because it is a very informative feature. Showing the boundaries of known SMART domains in the sequence is of great value for navigating as well as analysing the GlobPlot graph. The SMART predictions are used solely for graphical viewing; they are not used in the GlobPlot routine itself.
In order to present a graph that is smoothed for digital noise, we use a digital low-pass filter based on Savitzky-Golay (least squares fitting). The user can obtain the non-smoothed curve, as well as change the window length used by the Savitzky-Golay algorithm; however, the default settings for the smoothing are normally optimal. Further information on the available user options is given in Table 3. In order to give the user the possibility of further data analysis, the numerical data for the plot can be downloaded in tabulated format from the result page and used in other plotting software such as Grace, OpenOffice.org or Excel. Because GlobPlot is scale stable, the user can paste in a specific sub-sequence and obtain a zoomed plot. The output file format for the plot is PNG (portable network graphics), but publication quality plots can be created using the postscript option. Residue ranges for the disordered segments and globular regions (GlobDoms) found are shown at the bottom of the output page.
A pipeline interface: GlobPipe
GlobPipe is a pipeline that can be used for proteome scale analysis. The pipeline software is not a complete package but rather a set of routines that perform tasks relevant for SQL driven data mining of large amounts of data in a relational database (PostgreSQL, http://www.postgresql.org). GlobPipe is still under development, but it should be possible for any user with some programming skills to set up their own pipeline analysis using these routines. We expect to set up a database of disordered protein sequences based on GlobPipe predictions.
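To give a flavour of SQL driven mining of predicted segments, here is a minimal sketch that uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL. The table name, column names and rows are all hypothetical; they do not describe the GlobPipe schema, which is not given in the text.

```python
import sqlite3

# In-memory database used only for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE segment (protein TEXT, kind TEXT, start INTEGER, stop INTEGER)"
)
rows = [
    ("P1", "disorder", 5, 14),
    ("P1", "globdom", 20, 80),
    ("P2", "disorder", 1, 40),
]
conn.executemany("INSERT INTO segment VALUES (?, ?, ?, ?)", rows)

# Number of segments and mean segment length for each kind of prediction.
query = """
    SELECT kind, COUNT(*), AVG(stop - start + 1)
    FROM segment
    GROUP BY kind
    ORDER BY kind
"""
summary = conn.execute(query).fetchall()
print(summary)
conn.close()
```

Once predictions are stored one segment per row, questions such as segment length distributions or counts per protein reduce to short aggregate queries of this kind.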
The GlobPlot/GlobPipe package
The full GlobPlot/GlobPipe package (excluding the DISLIN and TISEAN modules that both have to be obtained from their respective websites) can be downloaded as a tarball at http://globplot.embl.de/download.html. The software is released under the Academic Free License Version 1.2 and is thereby OSI Certified Open Source Software (http://www.opensource.org). The software has broad platform coverage and is currently served on a FreeBSD box.
| GlobPlot CAN BE USED FOR ... |
|---|
- Finding regions that create trouble in your crystallisation setups.
- Searching for putative new globular domains and functional sites.
- Searching for IUPs and intrinsic disorder.
- Graphical visualisation of any propensity set that can be constructed for a polymeric sequence.
- Building a database of putative IUPs and domains using GlobPipe.
| ACKNOWLEDGEMENTS |
|---|
We thank Will Stanley, Kresten Lindorff-Larsen, Jesper Borg and David Martin for cool fruitful discussions and suggestions and Christine Gemuend and Ivica Letunic for SMART interface code/data. We are grateful to Francesca Diella, Chenna Ramu, Sophie Chabanis and Sara Quirk for reading this manuscript. This work was partly supported by EU grant QLRI-CT-2000-00127.
Finally we are deeply grateful to FreeBSD.org, (bio)Python.org, PostgreSQL.org, Debian.org and Apache.org for fantastic open free software.
| REFERENCES |
|---|
1. Brenner,S. (2000) Target selection for structural genomics. Nat. Struct. Biol., 7 (suppl), 967-969.
2. Wright,P. and Dyson,H. (1999) Intrinsically unstructured proteins: re-assessing the protein structure-function paradigm. J. Mol. Biol., 293, 321-331.
3. Letunic,I., Goodstadt,L., Dickens,N., Doerks,T., Schultz,J., Mott,R., Ciccarelli,F., Copley,R., Ponting,C. and Bork,P. (2002) Recent improvements to the SMART domain-based sequence annotation resource. Nucleic Acids Res., 30, 242-244.
4. Servant,F., Bru,C., Carrere,S., Courcelle,E., Gouzy,J., Peyruc,D. and Kahn,D. (2002) ProDom: automated clustering of homologous domains. Brief Bioinform., 3, 246-251.
5. Mulder,N., Apweiler,R., Attwood,T., Bairoch,A., Barrell,D., Bateman,A., Binns,D., Biswas,M., Bradley,P., Bork,P. et al. (2003) The InterPro Database, 2003 brings increased coverage and new features. Nucleic Acids Res., 31, 315-318.
6. Bateman,A., Birney,E., Cerruti,L., Durbin,R., Etwiller,L., Eddy,S., Griffiths-Jones,S., Howe,K., Marshall,M. and Sonnhammer,E. (2002) The Pfam protein families database. Nucleic Acids Res., 30, 276-280.
7. Sigrist,C., Cerutti,L., Hulo,N., Gattiker,A., Falquet,L., Pagni,M., Bairoch,A. and Bucher,P. (2002) PROSITE: a documented database using patterns and profiles as motif descriptors. Brief Bioinform., 3, 265-274.
8. Puntervoll,P., Linding,R., Gemünd,C., Chabanis-Davidson,S., Mattingsdal,M., Cameron,S., Martin,D.M.A., Ausiello,G., Brannetti,B., Costantini,A. et al. (2003) ELM server: a new resource for revealing short functional sites in modular eukaryotic proteins. Nucleic Acids Res., 31, 3625-3630.
9. Eddy,S. (1998) Profile hidden Markov models. Bioinformatics, 14, 755-763.
10. Smith,T. and Waterman,M. (1981) Identification of common molecular subsequences. J. Mol. Biol., 147, 195-197.
11. Tompa,P. (2002) Intrinsically unstructured proteins. Trends Biochem. Sci., 27, 527-533.
12. Schweers,O., Schonbrunn-Hanebeck,E., Marx,A. and Mandelkow,E. (1994) Structural studies of tau protein and Alzheimer paired helical filaments show no evidence for beta-structure. J. Biol. Chem., 269, 24290-24297.
13. Lopez Garcia,F., Zahn,R., Riek,R. and Wuthrich,K. (2000) NMR structure of the bovine prion protein. Proc. Natl Acad. Sci. USA, 97, 8334-8339.
14. Kussie,P., Gorina,S., Marechal,V., Elenbaas,B., Moreau,J., Levine,A. and Pavletich,N. (1996) Structure of the MDM2 oncoprotein bound to the p53 tumor suppressor transactivation domain. Science, 274, 948-953.
15. Uversky,V. (2002) Natively unfolded proteins: a point where biology waits for physics. Protein Sci., 11, 739-756.
16. Dunker,A., Lawson,J., Brown,C., Williams,R., Romero,P., Oh,J., Oldfield,C., Campen,A., Ratliff,C., Hipps,K. et al. (2001) Intrinsically disordered protein. J. Mol. Graph. Model., 19, 26-59.
17. Dunker,A., Brown,C., Lawson,J., Iakoucheva,L. and Obradovic,Z. (2002) Intrinsic disorder and protein function. Biochemistry, 41, 6573-6582.
18. Dunker,A., Garner,E., Guilliot,S., Romero,P., Albrecht,K., Hart,J., Obradovic,Z., Kissinger,C. and Villafranca,J. (1998) Protein disorder and the evolution of molecular recognition: theory, predictions and observations. Pac. Symp. Biocomput., 473-484.
19. Wootton,J. (1994) Non-globular domains in protein sequences: automated segmentation using complexity measures. Comput. Chem., 18, 269-285.
20. Garner,E., Cannon,P., Romero,P., Obradovic,Z. and Dunker,A. (1998) Predicting disordered regions from amino acid sequence: common themes despite differing structural characterization. Genome Inform Ser Workshop Genome Inform, 9, 201-213.
21. Garner,E., Romero,P., Dunker,A., Brown,C. and Obradovic,Z. (1999) Predicting binding regions within disordered proteins. Genome Inform Ser Workshop Genome Inform, 10, 41-50.
22. Chou,P. and Fasman,G. (1974) Conformational parameters for amino acids in helical, beta-sheet, and random coil regions calculated from proteins. Biochemistry, 13, 211-222.
23. Chou,P. and Fasman,G. (1978) Prediction of the secondary structure of proteins from their amino acid sequence. Adv. Enzymol. Relat. Areas Mol. Biol., 47, 45-148.
24. Chou,P. and Fasman,G. (1978) Empirical predictions of protein conformation. Annu. Rev. Biochem., 47, 251-276.
25. Deleage,G. and Roux,B. (1987) An algorithm for protein secondary structure prediction based on class prediction. Protein Eng., 1, 289-294.
26. Kabsch,W. and Sander,C. (1983) Dictionary of protein secondary structure: pattern recognition of hydrogen-bonded and geometrical features. Biopolymers, 22, 2577-2637.
27. Lo Conte,L., Brenner,S., Hubbard,T., Chothia,C. and Murzin,A. (2002) SCOP database in 2002: refinements accommodate structural genomics. Nucleic Acids Res., 30, 264-267.
28. Chandonia,J., Walker,N., Lo Conte,L., Koehl,P., Levitt,M. and Brenner,S. (2002) ASTRAL compendium enhancements. Nucleic Acids Res., 30, 260-263.
29. Westbrook,J., Feng,Z., Chen,L., Yang,H. and Berman,H. (2003) The Protein Data Bank and structural genomics. Nucleic Acids Res., 31, 489-491.
30. Press,W.H. and Teukolsky,S.A. (2002) Numerical recipes. In Press, W.H. C++ The Art of Scientific Computing, 2nd Edn. Cambridge University Press.
31. Hegger,R., Kantz,H. and Schreiber,T. (1999) Practical implementation of nonlinear time series methods: the TISEAN package. Chaos, 9, 413.
32. Dedmon,M., Patel,C., Young,G. and Pielak,G. (2002) FlgM gains structure in living cells. Proc. Natl Acad. Sci. USA, 99, 12681-12684.
33. Radhakrishnan,I., Perez-Alvarado,G., Parker,D., Dyson,H., Montminy,M. and Wright,P. (1997) Solution structure of the KIX domain of CBP bound to the transactivation domain of CREB: a model for activator:coactivator interactions. Cell, 91, 741-752.
34. Demarest,S., Martinez-Yamout,M., Chung,J., Chen,H., Xu,W., Dyson,H., Evans,R. and Wright,P. (2002) Mutual synergistic folding in recruitment of CBP/p300 by p160 nuclear receptor coactivators. Nature, 415, 549-553.
35. Parthasarathy,S. and Murthy,M. (2000) Protein thermal stability: insights from atomic displacement parameters (B values). Protein Eng., 13, 9-13.
36. Wampler,J. (1997) Distribution analysis of the variation of B-factors of X-ray crystal structures; temperature and structural variations in lysozyme. J. Chem. Inf. Comput. Sci., 37, 1171-1180.
37. Zoete,V., Michielin,O. and Karplus,M. (2002) Relation between sequence and structure of HIV-1 protease inhibitor complexes: a model system for the analysis of protein flexibility. J. Mol. Biol., 315, 21-52.
38. Rudino-Pinera,E., Morales-Arrieta,S., Rojas-Trejo,S. and Horjales,E. (2002) Structural flexibility, an essential component of the allosteric activation in Escherichia coli glucosamine-6-phosphate deaminase. Acta Crystallogr. D Biol. Crystallogr., 58, 10-20.
||||
![]() |
S. P. John, T. Wang, S. Steffen, S. Longhi, C. S. Schmaljohn, and C. B. Jonsson Ebola Virus VP30 Is an RNA Binding Protein J. Virol., September 1, 2007; 81(17): 8967 - 8976. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Hirose, K. Shimizu, S. Kanai, Y. Kuroda, and T. Noguchi POODLE-L: a two-level SVM prediction system for reliably predicting long disordered regions Bioinformatics, August 15, 2007; 23(16): 2046 - 2053. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Ishijima, N. Nagasaki, M. Maeshima, and M. Miyano RVCaB, a Calcium-binding Protein in Radish Vacuoles, is Predominantly an Unstructured Protein with a Polyproline Type II Helix J. Biochem., August 1, 2007; 142(2): 201 - 211. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Watanabe, M. Shionyu, T. Kimura, K. Kimata, and H. Watanabe Splicing Factor 3b Subunit 4 Binds BMPR-IA and Inhibits Osteochondral Cell Differentiation J. Biol. Chem., July 13, 2007; 282(28): 20728 - 20738. [Abstract] [Full Text] [PDF] |
||||
![]() |
C.-T. Su, C.-Y. Chen, and C.-M. Hsu iPDA: integrated protein disorder analyzer Nucleic Acids Res., July 13, 2007; 35(suppl_2): W465 - W472. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Cheng DOMAC: an accurate, hybrid protein domain prediction server Nucleic Acids Res., July 13, 2007; 35(suppl_2): W354 - W356. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Fuxreiter, P. Tompa, and I. Simon Local structural disorder imparts plasticity on linear motifs Bioinformatics, April 15, 2007; 23(8): 950 - 956. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Thusberg and M. Vihinen The structural basis of hyper IgM deficiency - CD40L mutations Protein Eng. Des. Sel., March 1, 2007; 20(3): 133 - 141. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. Min, K. H. Kirsch, Y. Zhao, S. Jeay, A. H. Palamakumbura, P. C. Trackman, and G. E. Sonenshein The Tumor Suppressor Activity of the Lysyl Oxidase Propeptide Reverses the Invasive Phenotype of Her-2/neu-Driven Breast Cancer Cancer Res., February 1, 2007; 67(3): 1105 - 1112. [Abstract] [Full Text] [PDF] |
||||
![]() |
E. Ragni, A. Coluccio, E. Rolli, J. M. Rodriguez-Pena, G. Colasante, J. Arroyo, A. M. Neiman, and L. Popolo GAS2 and GAS4, a Pair of Developmentally Regulated Genes Required for Spore Wall Assembly in Saccharomyces cerevisiae Eukaryot. Cell, February 1, 2007; 6(2): 302 - 316. [Abstract] [Full Text] [PDF] |
||||
![]() |
V. Neduva and R. B. Russell Proline-Rich Regions in Transcriptional Complexes: Heading in Many Directions Sci. Signal., January 16, 2007; 2007(369): pe1 - pe1. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. Kedzierska, L.-Y. Lian, and F. Hayes Toxin-antitoxin regulation: bimodal interaction of YefM-YoeB with paired DNA palindromes exerts transcriptional autorepression Nucleic Acids Res., January 12, 2007; 35(1): 325 - 339. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. M. Nooh, R. K. Aziz, M. Kotb, A. Eroshkin, W.-J. Chuang, T. Proft, and R. Kansal Streptococcal Mitogenic Exotoxin, SmeZ, Is the Most Susceptible M1T1 Streptococcal Superantigen to Degradation by the Streptococcal Cysteine Protease, SpeB J. Biol. Chem., November 17, 2006; 281(46): 35281 - 35288. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. D. Shaffer, G. Cenci, B. Thompson, G. E. Stephens, E. E. Slawson, K. Adu-Wusu, M. Gatti, and S. C. R. Elgin The Large Isoform of Drosophila melanogaster Heterochromatin Protein 2 Plays a Critical Role in Gene Silencing and Chromosome Structure Genetics, November 1, 2006; 174(3): 1189 - 1204. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Bottcher, B. G. Klupp, H. Granzow, W. Fuchs, K. Michael, and T. C. Mettenleiter Identification of a 709-Amino-Acid Internal Nonessential Region within the Essential Conserved Tegument Protein (p)UL36 of Pseudorabies Virus J. Virol., October 1, 2006; 80(19): 9910 - 9915. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Matsumura, K. Izui, and K. Mizuguchi A novel mechanism of allosteric regulation of archaeal phosphoenolpyruvate carboxylase: a combined approach to structure-based alignment and model assessment Protein Eng. Des. Sel., September 1, 2006; 19(9): 409 - 419. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. Chen, W. Wang, S. Ling, C. Jia, and F. Wang KemaDom: a web server for domain prediction using kernel machine with local context. Nucleic Acids Res., July 1, 2006; 34(Web Server issue): W158 - W163. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Vullo, O. Bortolami, G. Pollastri, and S. C. E. Tosatto Spritz: a server for the prediction of intrinsically disordered regions in protein sequences using kernel machines. Nucleic Acids Res., July 1, 2006; 34(Web Server issue): W164 - W168. [Abstract] [Full Text] [PDF] |
||||
![]() |
V. Neduva and R. B. Russell DILIMOT: discovery of linear motifs in proteins. Nucleic Acids Res., July 1, 2006; 34(Web Server issue): W350 - W355. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. J. Li, M. Kawazaki, K. Ogasahara, and A. Nakagawa The Intracellular Region of ClC-3 Chloride Channel Is in a Partially Folded State and a Monomer. J. Biochem., May 1, 2006; 139(5): 813 - 820. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Shojania and J. D. O'Neil HIV-1 Tat Is a Natively Unfolded Protein: THE SOLUTION CONFORMATION AND DYNAMICS OF REDUCED HIV-1 Tat-(1-72) BY NMR SPECTROSCOPY J. Biol. Chem., March 31, 2006; 281(13): 8347 - 8356. [Abstract] [Full Text] [PDF] |
||||
![]() |
F.-C. Chen, S.-S. Wang, C.-J. Chen, W.-H. Li, and T.-J. Chuang Alternatively and Constitutively Spliced Exons Are Subject to Different Evolutionary Forces Mol. Biol. Evol., March 1, 2006; 23(3): 675 - 682. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. E. Gewehr and R. Zimmer SSEP-Domain: protein domain prediction by alignment of secondary structure elements and profiles Bioinformatics, January 15, 2006; 22(2): 181 - 187. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. T. Llorente, B. Garcia-Barreno, M. Calero, E. Camafeita, J. A. Lopez, S. Longhi, F. Ferron, P. F. Varela, and J. A. Melero Structural analysis of the human respiratory syncytial virus phosphoprotein: characterization of an {alpha}-helical domain involved in oligomerization J. Gen. Virol., January 1, 2006; 87(1): 159 - 169. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. A. Hopitzan, A. J. Baines, and E. Kordeli Molecular Evolution of Ankyrin: Gain of Function in Vertebrates by Acquisition of an Obscurin/Titin-Binding-Related Domain Mol. Biol. Evol., January 1, 2006; 23(1): 46 - 55. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. J. Wright, S. C. Smith, V. Joardar, S. Scherer, J. Jervis, A. Warren, R. F. Helm, and M. Potts UV Irradiation and Desiccation Modulate the Three-dimensional Extracellular Matrix of Nostoc commune (Cyanobacteria) J. Biol. Chem., December 2, 2005; 280(48): 40271 - 40281. [Abstract] [Full Text] [PDF] |
||||
![]() |
X. Bao, W. Zhang, R. Krencik, H. Deng, Y. Wang, J. Girton, J. Johansen, and K. M. Johansen The JIL-1 kinase interacts with lamin Dm0 and regulates nuclear lamina morphology of Drosophila nurse cells J. Cell Sci., November 1, 2005; 118(21): 5079 - 5087. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. N. Khan and P. N. Lewis Unstructured Conformations Are a Substrate Requirement for the Sir2 Family of NAD-dependent Protein Deacetylases J. Biol. Chem., October 28, 2005; 280(43): 36073 - 36078. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. L. Gerloff, A. Creasey, S. Maslau, and R. Carter Structural models for the protein family characterized by gamete surface protein Pfs230 of Plasmodium falciparum PNAS, September 20, 2005; 102(38): 13598 - 13603. [Abstract] [Full Text] [PDF] |
||||
![]() |
Z. R. Yang, R. Thomson, P. McNeil, and R. M. Esnouf RONN: the bio-basis function neural network technique applied to the detection of natively disordered regions in proteins Bioinformatics, August 15, 2005; 21(16): 3369 - 3376. [Abstract] [Full Text] [PDF] |
||||
![]() |
Z. Dosztanyi, V. Csizmok, P. Tompa, and I. Simon IUPred: web server for the prediction of intrinsically unstructured regions of proteins based on estimated energy content Bioinformatics, August 15, 2005; 21(16): 3433 - 3434. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Prilusky, C. E. Felder, T. Zeev-Ben-Mordehai, E. H. Rydberg, O. Man, J. S. Beckmann, I. Silman, and J. L. Sussman FoldIndex(C): a simple tool to predict whether a given protein sequence is intrinsically unfolded Bioinformatics, August 15, 2005; 21(16): 3435 - 3438. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. Jaroszewski, L. Rychlewski, Z. Li, W. Li, and A. Godzik FFAS03: a server for profile-profile sequence alignments Nucleic Acids Res., July 1, 2005; 33(suppl_2): W284 - W288. [Abstract] [Full Text] [PDF] |
||||
![]() |
Z. Li and I. S. Nathke Tumor-Associated NH2-Terminal Fragments Are the Most Stable Part of the Adenomatous Polyposis Coli Protein and Can Be Regulated by Interactions with COOH-Terminal Domains Cancer Res., June 15, 2005; 65(12): 5195 - 5204. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. K. Saini and D. Fischer Meta-DP: domain prediction meta-server Bioinformatics, June 15, 2005; 21(12): 2917 - 2920. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. Coeytaux and A. Poupon Prediction of unfolded segments in a protein sequence based on amino acid composition Bioinformatics, May 1, 2005; 21(9): 1891 - 1900. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. Ginalski, N. V. Grishin, A. Godzik, and L. Rychlewski Practical lessons from protein structure prediction Nucleic Acids Res., April 1, 2005; 33(6): 1874 - 1891. [Abstract] [Full Text] [PDF] |
||||
![]() |
F. Ferron, C. Rancurel, S. Longhi, C. Cambillau, B. Henrissat, and B. Canard VaZyMolO: a tool to define and classify modularity in viral proteins J. Gen. Virol., March 1, 2005; 86(3): 743 - 749. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Vucetic, Z. Obradovic, V. Vacic, P. Radivojac, K. Peng, L. M. Iakoucheva, M. S. Cortese, J. D. Lawson, C. J. Brown, J. G. Sikes, et al. DisProt: a database of protein disorder Bioinformatics, January 1, 2005; 21(1): 137 - 140. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. G. Han, J. R. Smiley, C. Thomas, and L. J. Saif Genetic Recombination between Two Genotypes of Genogroup III Bovine Noroviruses (BoNVs) and Capsid Sequence Diversity among BoNVs and Nebraska-Like Bovine Enteric Caliciviruses J. Clin. Microbiol., November 1, 2004; 42(11): 5214 - 5224. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. L. Middleton, J. L. Parker, D. J. Richard, M. F. White, and C. S. Bond Substrate recognition and catalysis by the Holliday junction resolving enzyme Hje Nucleic Acids Res., October 12, 2004; 32(18): 5442 - 5451. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. G. Diaz, T. Moldoveanu, M. J. Kuiper, R. L. Campbell, and P. L. Davies Insertion Sequence 1 of Muscle-specific Calpain, p94, Acts as an Internal Propeptide J. Biol. Chem., June 25, 2004; 279(26): 27656 - 27666. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Puntervoll, R. Linding, C. Gemund, S. Chabanis-Davidson, M. Mattingsdal, S. Cameron, D. M. A. Martin, G. Ausiello, B. Brannetti, A. Costantini, et al. ELM server: a new resource for investigating short functional sites in modular eukaryotic proteins Nucleic Acids Res., July 1, 2003; 31(13): 3625 - 3630. [Abstract] [Full Text] [PDF] |
||||