
Nucleic Acids Research 2005 33(Web Server Issue):W214-W219; doi:10.1093/nar/gki385

© The Author 2005. Published by Oxford University Press. All rights reserved
The online version of this article has been published under an open access model. Users are entitled to use, reproduce, disseminate, or display the open access version of this article for non-commercial purposes provided that: the original authorship is properly and fully attributed; the Journal and Oxford University Press are attributed as the original place of publication with the correct citation details given; if an article is subsequently reproduced or disseminated not in its entirety but only in part or as a derivative work this must be clearly indicated. For commercial re-use, please contact journals.permissions@oupjournals.org


Article

GlyProt: in silico glycosylation of proteins

Andreas Bohne-Lang* and Claus-Wilhelm von der Lieth

German Cancer Research Center Heidelberg, Central Spectroscopy–Molecular Modeling, Im Neuenheimer Feld 280, D-69120 Heidelberg, Germany

*To whom correspondence should be addressed. Tel: +49 6221 42 4541; Fax: +49 6221 42 3669; Email: a.bohne@dkfz-heidelberg.de

Received February 11, 2005. Revised March 9, 2005. Accepted March 9, 2005.


    ABSTRACT
 
GlyProt (http://www.glycosciences.de/glyprot/) is a web-based tool that enables meaningful N-glycan conformations to be attached to all the spatially accessible potential N-glycosylation sites of a known three-dimensional (3D) protein structure. The probabilities of physicochemical properties such as mass, accessible surface and radius of gyration are calculated. The purpose of this service is to provide rapid access to reliable 3D models of glycoproteins, which can subsequently be refined by using more elaborate simulations and validated by comparing the generated models with experimental data.


    INTRODUCTION
 
The human genome appears to encode no more than 25 000 proteins (1). This relatively small number of genes compared with the genome of other species has been one of the big surprises to come out of the Human Genome Project. A major challenge is to understand how post-translational events affect the activities and functions of these proteins in relation to health and disease. Among these, glycosylation is by far the most frequent; more than half of all the proteins in the human body have glycan molecules attached (2,3). Glycosylated proteins are ubiquitous components of extracellular matrices and cellular surfaces. Their oligosaccharide moieties are implicated in a wide range of cell–cell and cell–matrix recognition events. N-glycans covalently connected to proteins constitute highly flexible molecules. Therefore, only a small number of glycan structures are available for which sufficient electron density for an entire oligosaccharide chain can be detected (4). Unambiguous structure determination based on NMR-derived geometric constraints alone is often not possible (5). Time-consuming computational approaches such as Monte Carlo calculations and molecular dynamics simulations have been widely used to explore the conformational space accessible to complex carbohydrates (6,7).

For reasons that are not well understood, not all Asn-X-Ser/Thr sequons are glycosylated. Unfortunately, the unambiguous determination of occupied N-glycosylation sites is experimentally demanding and can vary between different cellular locations. The aims of GlyProt are (i) to evaluate whether a potential N-glycosylation site is spatially accessible, (ii) to generate reasonable three-dimensional (3D) models of glycoproteins with user-definable glycan moieties and (iii) to provide some evidence on how the physicochemical parameters can change between the varying glycoforms of a protein.


    MATERIALS AND METHODS
 
The 3D structure of a protein in Protein Data Bank (PDB) format is required as input (see dataflow given in Figure 1). The protein structure can be either taken directly from the PDB or uploaded from a local computer. Potential N-glycosylation sites (sequon: Asn-X-Ser/Thr, where X is not Pro) are automatically detected and highlighted using the one-letter amino acid code. In cases where experimental coordinates with already attached glycans are provided, the internal coordinates (distance between the N of the Asn-sidechain and the C1 of the attached β-D-GlcpNAc and the torsion angles determining the orientation of the glycan moiety) are displayed.
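The sequon rule stated above (Asn-X-Ser/Thr, where X is not Pro) can be illustrated with a short Python sketch. This is not GlyProt's own code: the function name `find_sequons` and the example sequence are made up for illustration, and positions are simply counted along the one-letter sequence string rather than taken from PDB residue numbers.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr.
# The lookahead lets overlapping sequons (e.g. "NNTS") all be reported.
SEQUON = re.compile(r"(?=(N[^P][ST]))")


def find_sequons(sequence):
    """Return (1-based position of the Asn, matched tripeptide) pairs."""
    return [(m.start() + 1, m.group(1))
            for m in SEQUON.finditer(sequence.upper())]


# Hypothetical sequence: "NPS" is skipped because X is Pro.
print(find_sequons("MKNGTLLNPSANNTSQ"))
# [(3, 'NGT'), (12, 'NNT'), (13, 'NTS')]
```

A match only marks a potential site; as the text notes, whether the site is spatially accessible still has to be checked on the 3D structure.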



Figure 1 Dataflow of GlyProt.

 
Orientation of the N-glycans
The orientation of the attached N-glycan relative to the glycosylation site is described by the four consecutive torsion angles χ1, χ2, Φ and Ψ (for definition see Table 1). It is well known from the analysis of the experimentally available 3D structures of glycoproteins (8,9) that preferred orientations of the glycan moiety relative to the protein exist (Figure 2). The current version of the PDB contains nearly 3000 N-glycan chains. Conformational maps indicating the populated areas for all four torsion angles can be easily obtained using the GlyTorsion tool (http://www.glycosciences.de/glytorsion/) from the Carbohydrate Structure Suite (10).
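For readers who want to measure such angles in their own coordinate files, a torsion angle can be computed from the positions of four consecutive atoms. The Python sketch below uses the standard atan2 formulation and is only an illustration: the function name `torsion_angle` is an assumption, and which four atoms define each of the four angles has to be taken from Table 1, which is not reproduced here.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def torsion_angle(p1, p2, p3, p4):
    """Torsion angle p1-p2-p3-p4 in degrees (0 = cis, +/-180 = trans)."""
    b1, b2, b3 = _sub(p2, p1), _sub(p3, p2), _sub(p4, p3)
    n1 = _cross(b1, b2)   # normal of the plane through p1, p2, p3
    n2 = _cross(b2, b3)   # normal of the plane through p2, p3, p4
    x = _dot(n1, n2)
    y = _dot(_cross(n1, n2), b2) / math.sqrt(_dot(b2, b2))
    return math.degrees(math.atan2(y, x))


# Toy coordinates (not from any real structure): a +90 degree torsion.
print(torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1)))
```

Using atan2 instead of acos keeps the sign of the angle, which matters when populated regions of a conformational map are compared.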


Table 1 Definition of torsion angles defining the orientation of the glycan moiety relative to the protein and hierarchy of applied torsion angles

 


Figure 2 Statistical analysis of the PDB for torsion angles determining the orientation of the glycan moiety relative to the protein.

 
It is assumed that the Man3 N-glycan core exhibits one dominant, relatively rigid conformation. This assumption is supported by the analysis of experimentally determined torsion angles for the corresponding glycosidic linkages in the PDB (Table 2 and Figure 3). Only the 1–6 linkage exhibits two significantly populated conformations, whereas each of the other three linkages shows only one highly populated conformation.


Table 2 Torsion angles for glycosidic linkages of the N-glycan core region

 


Figure 3 Statistical analysis of the PDB for glycosidic torsion angles determining the conformation of the N-glycan core.

 
To evaluate whether a potential glycosylation site is spatially accessible, a program written in C is used to connect the Man3 N-glycan core to the protein and test all possible angle sets. The frequency of occurrence of the four relevant torsion angles (Table 1) is used to orient the N-glycan core. Next, the program evaluates whether atoms of the attached glycan moiety overlap with the protein. If spatial overlaps are detected, the model is rejected and the next most frequently observed orientation of the glycan moiety is applied. Table 1 lists the values of the four relevant torsion angles and the succession in which they are applied. This procedure is repeated until a structure with no or minor overlap has been found. If all orientations listed in Table 1 have been applied and all resulting glycoprotein structures exhibit overlapping atoms, it is assumed that the glycosylation site is spatially inaccessible and therefore cannot be glycosylated.
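The search described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the C program used by GlyProt: the function names, the `place_glycan` callback and the 2.0 Å distance cutoff are all hypothetical, since the text does not specify how an overlap is defined or how much "minor overlap" is tolerated.

```python
import math


def has_clash(glycan_atoms, protein_atoms, min_dist=2.0):
    """True if any glycan atom is closer than min_dist (assumed cutoff, in
    angstroms) to any protein atom."""
    return any(math.dist(g, p) < min_dist
               for g in glycan_atoms for p in protein_atoms)


def first_accessible_orientation(ranked_orientations, place_glycan,
                                 protein_atoms, min_dist=2.0):
    """Try orientations from most to least frequently observed and return
    the first one without a clash, or None if the site looks inaccessible."""
    for orientation in ranked_orientations:
        glycan_atoms = place_glycan(orientation)  # hypothetical builder
        if not has_clash(glycan_atoms, protein_atoms, min_dist):
            return orientation
    return None


# Toy demonstration: the "glycan" is a single atom on a circle of radius 3
# around the origin, and an orientation is one angle instead of four.
def toy_place(angle_deg):
    a = math.radians(angle_deg)
    return [(3.0 * math.cos(a), 3.0 * math.sin(a), 0.0)]


protein = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)]
print(first_accessible_orientation([0, 90, 180], toy_place, protein))  # 90
```

In a real application each orientation would be a set of the four torsion angles from Table 1, and the all-pairs distance check would normally be replaced by a spatial grid or neighbour list for speed.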

Construction of user-definable glycoproteins
For each spatially accessible potential N-glycosylation site three options are offered for selecting the N-glycan to be connected. The user can

  1. select the type of N-glycan (e.g. oligomannose rich, complex, hybrid, very large); by default a typical structure for each class is taken;
  2. select an N-glycan from a database of >1000 structures (Figure 4) constructed using SWEET-II (11) and optimized using the TINKER MM3 force field (http://dasher.wustl.edu/tinker/); the database is searchable by N-glycan composition;
  3. construct the desired N-glycan using SWEET-II by user input of the desired structure using the extended IUPAC nomenclature.
If the coordinates provided already contain attached N-glycans, the user can either accept this orientation or use the procedure described above to align the glycan moiety.



Figure 4 Input spreadsheet (top) used to query the database, which contains >1000 3D structures of N-glycans (bottom). The user indicates the desired glycoform by checking the corresponding selection box.

 

    RESULTS
 
The atomic coordinates of the desired glycoprotein are given in PDB format, and they are immediately displayed using the Java applet Jmol (http://jmol.sourceforge.net/). The coordinates can be downloaded and used as input for many 3D visualization programs (see Figure 5). In addition, some physicochemical parameters for the non-glycosylated and the glycosylated protein are displayed to provide a general delineation of the changes caused by the selected glycoform (see Table 3). The program Surface Racer (12) is used to calculate the solvent accessible surface of both molecules. The generally observed increase of the polar surface area as a result of glycosylation reflects the well-known experience that glycoproteins exhibit higher solubility.
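Two of the reported parameters, mass and radius of gyration, are simple to recompute from downloaded coordinates. The Python sketch below assumes the usual mass-weighted definition of the radius of gyration; the text does not state which definition GlyProt uses, and the function name and toy input are illustrative only.

```python
import math


def mass_and_radius_of_gyration(atoms):
    """atoms: iterable of (mass, (x, y, z)) pairs.

    Returns (total mass, mass-weighted radius of gyration)."""
    atoms = list(atoms)
    total = sum(m for m, _ in atoms)
    # Centre of mass, one coordinate at a time.
    center = tuple(sum(m * xyz[i] for m, xyz in atoms) / total
                   for i in range(3))
    # Mass-weighted mean squared distance from the centre of mass.
    rg_squared = sum(
        m * sum((xyz[i] - center[i]) ** 2 for i in range(3))
        for m, xyz in atoms
    ) / total
    return total, math.sqrt(rg_squared)


# Toy input: two carbon-like atoms 2 units apart.
print(mass_and_radius_of_gyration([(12.0, (-1.0, 0.0, 0.0)),
                                   (12.0, (1.0, 0.0, 0.0))]))
# (24.0, 1.0)
```

Running the same function on the coordinates of the non-glycosylated and the glycosylated model gives a direct before-and-after comparison of the kind shown in Table 3.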



Figure 5 User interface (top) to select the desired glycoform for each glycosylation site. Visualization (bottom) of the constructed glycoprotein. The protein part is given as a cartoon representation; the glycan part as a spacefill model.

 

Table 3 Comparison of some characteristic physicochemical properties of the pure Influenza A Subtype N9 Neuraminidase (14) and the constructed glycoform

 

    DISCUSSION
 
GlyProt enables rapid Internet-based access to reasonable 3D models of glycoproteins. Although it is estimated that >50% of all proteins are glycosylated (2,3), only ~5% of all PDB entries have attached glycan chains (4). Moreover, only a few entries in the PDB contain X-ray diffraction data with sufficient electron density to detect an entire oligosaccharide chain. The 3D models of glycoproteins constructed with GlyProt can provide some evidence on which areas of a protein are covered by a certain glycoform and whether, for example, a binding site is shielded so that the biological activity of a protein may be influenced.

Simply because of their large size and hydrophilicity, glycans can alter the physicochemical properties of a glycoprotein, making it more soluble, reducing backbone flexibility and thus leading to increased protein stability, protecting it from proteolysis, and so on. The calculation of some characteristic physicochemical parameters will help in the evaluation and explanation of the varying properties of different glycoforms. Of the therapeutic proteins on the market, ~60% are glycoproteins (13). Often, the removal of N-glycans results in a protein with a very short half-life and virtually no activity in vivo (13).

A comprehensive evaluation of the impact of varying glycoforms on protein function is hampered by the high conformational flexibility of glycan structures. Based on the statistical analysis of experimentally known glycan conformations, GlyProt constructs one reasonable conformation out of many possible ones. However, a more realistic analysis would require scanning the complete conformational space that is accessible to a glycan at a given glycosylation site. Therefore, we intend to expand the GlyProt service with an option allowing the exploration of the conformational space accessible to an N-glycan that is covalently bound to a specific glycosylation site. A similar approach has already been successfully applied to rapidly generate a representative ensemble of conformations of single N-glycan molecules (6). This algorithm is based on a comprehensive set of conformations of N-glycan fragments that were derived from molecular dynamics simulations. However, this approach would assume a protein conformation that remains unchanged through the attachment of varying glycans. In order to allow conformational changes of the protein backbone, only force-field-based, time-consuming simulation approaches such as molecular dynamics with inclusion of explicit water molecules would be appropriate.


    ACKNOWLEDGEMENTS
 
The development of GlyProt is funded by a grant from the German Research Council (Deutsche Forschungsgemeinschaft, DFG) within the digital library program. Funding to pay the Open Access publication charges for this article was provided by DFG.

Conflict of interest statement. None declared.


    REFERENCES
 

  1. International Human Genome Sequencing Consortium (2004) Finishing the euchromatic sequence of the human genome. Nature, 431, 931–945.

  2. Apweiler, R., Hermjakob, H., Sharon, N. (1999) On the frequency of protein glycosylation, as deduced from analysis of the SWISS-PROT database. Biochim. Biophys. Acta, 1473, 4–8.

  3. Ben-Dor, S., Esterman, N., Rubin, E. (2004) Biases and complex patterns in the residues flanking protein N-glycosylation sites. Glycobiology, 14, 95–101.

  4. Lütteke, T., Frank, M., von der Lieth, C.W. (2004) Data mining the protein data bank: automatic detection and assignment of carbohydrate structures. Carbohydr. Res., 339, 1015–1020.

  5. Imberty, A. and Perez, S. (2000) Structure, conformation, and dynamics of bioactive oligosaccharides: theoretical approaches and experimental validations. Chem. Rev., 100, 4567–4588.

  6. Frank, M., Bohne-Lang, A., Wetter, T., von der Lieth, C.W. (2002) Rapid generation of a representative ensemble of N-glycan conformations. In Silico Biol., 2, 427–439.

  7. Woods, R.J. (1998) Computational carbohydrate chemistry: what theoretical methods can tell us. Glycoconj. J., 15, 209–216.

  8. Imberty, A. and Perez, S. (1995) Stereochemistry of the N-glycosylation sites in glycoproteins. Protein Eng., 8, 699–709.

  9. Petrescu, A.J., Milac, A.L., Petrescu, S.M., Dwek, R.A., Wormald, M.R. (2004) Statistical analysis of the protein environment of N-glycosylation sites: implications for occupancy, structure, and folding. Glycobiology, 14, 103–114.

  10. Lütteke, T., Frank, M., von der Lieth, C.W. (2005) Carbohydrate structure suite (CSS): analysis of carbohydrate 3D structures derived from the PDB. Nucleic Acids Res., 33, D242–D246.

  11. Bohne, A., Lang, E., von der Lieth, C.W. (1998) W3-SWEET: carbohydrate modeling by internet. J. Mol. Model., 4, 33–43.

  12. Tsodikov, O.V., Record, M.T., Jr, Sergeev, Y.V. (2002) A novel computer program for fast exact calculation of accessible and molecular surface areas and average surface curvature. J. Comput. Chem., 23, 600–609.

  13. Gerngross, T.U. (2004) Advances in the production of human therapeutic proteins in yeasts and filamentous fungi. Nat. Biotechnol., 22, 1409–1414.

  14. White, C.L., Janakiraman, M.N., Laver, W.G., Philippon, C., Vasella, A., Air, G.M., Luo, M. (1995) A sialic acid-derived phosphonate analog inhibits different strains of influenza virus neuraminidase with different efficiencies. J. Mol. Biol., 245, 623–634.




