| Nucleic Acids Research | Pages |
PhosphoBase: a database of phosphorylation sites
ABSTRACT
INTRODUCTION
Phosphorylation of enzymes, receptors and other proteins is one of the most important signaling mechanisms in the regulation of cellular processes at the molecular level. Protein kinases catalyze the transfer of the γ-phosphate of a nucleoside triphosphate (usually ATP) to an acceptor residue, usually serine, threonine or tyrosine, in the substrate protein (1,2).
It has been estimated that the mammalian genome encodes >2000 different protein kinases (3). Experiments by 2D-gel electrophoresis indicate that as much as 30-50% of the proteins in a eukaryotic cell may be phosphorylatable (4). Since there may be ~10 000-30 000 different protein species in a eukaryotic cell, the average kinase probably has more than one target protein. Considering that only ~5% of the proteins in the SwissProt database (5) have phosphorylated residues in their annotation, it is obvious that only a small fraction of the active phosphorylation sites in proteins has been identified so far.
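
The arithmetic behind this estimate can be made explicit. The sketch below only combines the figures quoted above (30-50% phosphorylatable, 10 000-30 000 protein species, about 2000 kinases). Treating 2000 as an exact kinase count is an assumption made for illustration, since the text gives it as a lower bound, and all variable names are hypothetical.

```python
# Rough estimate of phosphorylatable proteins per kinase, using only the
# figures quoted in the text. The kinase count is treated as exactly 2000
# here (assumption); the text states ">2000".
N_KINASES = 2000
PROTEIN_SPECIES = (10_000, 30_000)         # protein species per eukaryotic cell
PHOSPHORYLATABLE_FRACTION = (0.30, 0.50)   # fraction that may be phosphorylatable


def targets_per_kinase(n_proteins: int, fraction: float,
                       n_kinases: int = N_KINASES) -> float:
    """Average number of phosphorylatable proteins per kinase."""
    return n_proteins * fraction / n_kinases


low = targets_per_kinase(PROTEIN_SPECIES[0], PHOSPHORYLATABLE_FRACTION[0])
high = targets_per_kinase(PROTEIN_SPECIES[1], PHOSPHORYLATABLE_FRACTION[1])
print(f"{low:.1f} to {high:.1f} phosphorylatable proteins per kinase")
```

Under these assumptions the ratio lies between roughly 1.5 and 7.5, which is consistent with the statement that the average kinase has more than one target protein.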
Researchers working on phosphorylation site determination use different approaches and levels of detail when reporting their results. Various experimental methods, e.g., direct sequencing or mass spectrometry, are employed. Reports usually state that residue k in protein X is phosphorylated, but provide no kinetic data or information about the kinase in question.
As the substrate specificity of protein kinases is determined mainly by the amino acid sequence in proximity to the phosphorylatable residue, many protein kinases are known to phosphorylate synthetic oligopeptides with kinetics comparable to those of the intact protein substrates (6). Peptide substrates are usually related to phosphoacceptor sites in the natural phosphoproteins. Kinetic assays with peptides of different lengths and with mutations are valuable tools for studying the substrate specificity of protein kinases. Unfortunately, much of the kinetic data derived from in vitro peptide studies is not available from a single source.
We have created a framework for submission of data on specific phosphorylation sites in proteins. These may be simple positional annotations or detailed kinetic parameters. We also take into consideration that several studies examine mutated motifs of a given phosphorylation site and thereby point to the key residues in the kinase-substrate interaction. The ability of PhosphoBase to incorporate detailed information about a given phosphorylation site in terms of interactions and kinetic parameters in an easily readable format makes it a unique tool. We anticipate that PhosphoBase will be valuable for molecular biologists in search of phosphorylation sites in proteins, for biochemists planning new studies in the field of substrate specificity of protein kinases, and for theoretical studies drawing conclusions and making computational predictions about the specificity of phosphorylation reactions.
PROTEIN KINASE A SEQUENCE MOTIFS
The main types of phosphorylation in eukaryotic cells are of tyrosine and serine/threonine residues. Since not all serine, threonine and tyrosine residues in a phosphoprotein are phosphorylated, kinases must display some degree of specificity. Many studies indicate that this specificity is determined by the primary sequence surrounding the phosphorylatable residue (7-9). However, most kinases studied are able to accept variations in the surrounding sequence to a smaller or larger degree and related kinases may display overlapping, yet different specificities.
To study the specificity of protein kinase A/cAMP-dependent protein kinase (PKA), we extracted all sequences from PhosphoBase annotated as being phosphorylated by PKA. Forty different sequences were aligned at the phosphorylated serine (see the sequence logo in Figure 1). The basophilic nature of PKA is readily seen, as arginine and lysine dominate positions -3 and -2. This is in agreement with the consensus pattern described in the Prosite database (10), pattern PDOC0004, which has the form `[R,K][R,K].S(p)', where [R,K] means arginine or lysine, `.' means any residue and S(p) is the phosphoserine. However, 9 of the 40 sequences (22.5%) did not match this pattern and would thus remain undetected in a search using only the Prosite pattern.
Furthermore, the PKA logo shows a tendency for basic residues at position -4 as well as hydrophobic residues (Ile, Leu, Val) at position +1. C-terminal to the phosphoserine, at positions +6 to +10, an abundance of acidic glutamate (E) residues is found. This region is most likely not recognized by the catalytic domain of PKA, but may still play a role in defining the phosphorylatable site, perhaps by making it more surface accessible. The general tendency of many kinases to display a broad range of specificity calls for more sophisticated methods for predicting the location of phosphorylation sites (N.Blom, S.Gammeltoft, J.Hansen and S.Brunak, manuscript in preparation). Analysis of different sites phosphorylated by a particular kinase is now easily possible using PhosphoBase. This will hopefully lead to prediction methods that can deal with the complex consensus patterns of phosphorylation sites.
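
To illustrate what a search using only the Prosite-style pattern looks like, the pattern `[R,K][R,K].S(p)' quoted above can be written as a regular expression. This is a minimal sketch, not part of PhosphoBase: the function name and the example sequences are invented for illustration, and the scan reports candidate serines only, not experimentally confirmed sites.

```python
import re

# Regex form of [R,K][R,K].S(p): Arg/Lys at positions -3 and -2, any residue
# at -1, then the candidate serine. The lookahead reports overlapping matches.
PKA_PATTERN = re.compile(r"(?=([RK][RK].S))")


def find_pka_candidates(sequence: str) -> list[int]:
    """Return 1-based positions of serines preceded by [RK][RK]x."""
    # m.start() is the 0-based index of the residue at -3; the serine sits
    # three residues later, hence +3 for the offset and +1 for 1-based output.
    return [m.start() + 4 for m in PKA_PATTERN.finditer(sequence.upper())]


# Invented sequences for illustration only (not taken from PhosphoBase).
print(find_pka_candidates("GLRRASLG"))   # serine 6 follows R-R-A
print(find_pka_candidates("GLRAASLG"))   # only one basic residue: no match
```

A scan of this kind would miss sites such as the 9 of 40 PKA sequences that do not fit the pattern, which is the limitation the text points out.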

DATA SOURCES
Data were collected from the SwissProt (5) and PIR (11) protein databases, literature studies and personal experiments (University of Tartu). Phosphorylation sites annotated in protein databases as `potential', `probable' or `by similarity' were not included.
DATABASE FORMAT
Overall description
Version 1.0 of PhosphoBase contains 156 entries and 398 experimentally determined phosphorylation sites.
PhosphoBase was designed to incorporate data on several levels of detail. The main part describes the latest revision dates, the name of the protein and species and cross-references to other databases. The positions of serine, threonine or tyrosine residues that have been described as phosphorylation sites are listed, followed by the actual sequence with a visual indication of the positions.
The second part of the entry presents detailed phosphorylation information. Literature references to the original phosphorylation site identification reports, information about the kinase catalyzing the phosphorylation reaction and kinetic data about peptides related to a particular phosphoprotein are listed.
All fields in the main part are required for each entry although they may be empty. In the second part of the entry most fields are optional as they may not apply to the phosphorylation site in question. The notation `//' marks the end of each entry.
Two entries are shown in Figure 2. Note that in the entry for pyruvate kinase (A005), several detailed studies on natural and mutated peptides are reported, whereas the entry for src kinase (A006) describes mainly the position of several phosphorylation sites.
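
Because each entry ends with `//', a flat file of entries can be split with a few lines of code. The sketch below assumes that every line starts with a field keyword followed by a space and the value; that line layout is an assumption, as are the function names. The two example records reuse only the accession codes and protein names mentioned above, and everything else about them is a placeholder.

```python
from collections import defaultdict


def split_entries(text: str) -> list[list[str]]:
    """Split a flat file into entries, each terminated by a '//' line."""
    entries, current = [], []
    for line in text.splitlines():
        if line.strip() == "//":
            if current:
                entries.append(current)
            current = []
        elif line.strip():
            current.append(line.rstrip())
    if current:
        entries.append(current)  # tolerate a missing final terminator
    return entries


def parse_fields(entry_lines: list[str]) -> dict[str, list[str]]:
    """Group values by leading keyword (assumed layout: KEYWORD<space>value)."""
    fields = defaultdict(list)
    for line in entry_lines:
        keyword, _, value = line.partition(" ")
        fields[keyword].append(value.strip())  # empty string if field is empty
    return dict(fields)


# Placeholder records; the real entry layout may differ.
example = """\
ACCESSION A005
PROT_ID pyruvate kinase
//
ACCESSION A006
PROT_ID src kinase
//
"""
records = [parse_fields(lines) for lines in split_entries(example)]
for record in records:
    print(record["ACCESSION"][0], record["PROT_ID"][0])
```

Values are kept as lists because fields in the second part of an entry, such as PEPTIDE or EXPERIMENT, may occur more than once.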
Description of fields
Main section
| ACCESSION | PhosphoBase accession code (single letter + 3 digits) |
| DATE | Dates for creation and updates |
| PROT_ID | Protein name |
| SPECIES | Species name (Latin and common) |
| DB_XREF | Database cross-reference to SwissProt, PIR or GenBank |
| SERINE | Positions of phosphorylated serines. The code in parentheses is the peptide accession code (see PEPTIDE below), which is constructed from the ACCESSION code plus a letter [A-Z] |
| THREONINE | Same as above, but for threonine |
| TYROSINE | Same as above, but for tyrosine |
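
The accession conventions in this table can be checked mechanically. The following is a sketch under stated assumptions: the single letter is taken to be uppercase (as in A005 and A006), the peptide code `A005B` is a made-up example, and the function names are hypothetical.

```python
import re

# ACCESSION: single letter + 3 digits, e.g. A005 (uppercase is an assumption).
ACCESSION_RE = re.compile(r"[A-Z][0-9]{3}")
# Peptide accession: ACCESSION plus one letter [A-Z], e.g. A005B (invented).
PEPTIDE_ACCESSION_RE = re.compile(r"([A-Z][0-9]{3})([A-Z])")


def is_accession(code: str) -> bool:
    """True if the code has the form letter + 3 digits."""
    return ACCESSION_RE.fullmatch(code) is not None


def parent_accession(peptide_code: str) -> str:
    """Return the entry ACCESSION that a peptide accession belongs to."""
    match = PEPTIDE_ACCESSION_RE.fullmatch(peptide_code)
    if match is None:
        raise ValueError(f"not a peptide accession: {peptide_code!r}")
    return match.group(1)


print(is_accession("A005"))        # True
print(is_accession("A05"))         # False: only two digits
print(parent_accession("A005B"))   # A005
```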
Sequence section
| SEQUENCE | Number of residues followed by the actual sequence in 80 residues per line. Then follows the assignment field, where S, T and Y denote phosphorylated serine, threonine and tyrosine, respectively. A `.' means that no information about this position is provided. It does not indicate that this position is never phosphorylated. |
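
Reading site positions back out of the assignment field is straightforward if the assignment line is aligned character for character with the sequence. That alignment is an assumption of this sketch, as are the function name and the invented 10-residue example.

```python
def phosphosites(sequence: str, assignment: str) -> list[tuple[int, str]]:
    """Return (1-based position, residue) for each marked phosphorylation site."""
    if len(sequence) != len(assignment):
        raise ValueError("sequence and assignment must have equal length")
    sites = []
    for pos, (residue, mark) in enumerate(zip(sequence, assignment), start=1):
        if mark == ".":
            continue  # no information; does not mean "never phosphorylated"
        if mark not in "STY" or mark != residue:
            raise ValueError(f"inconsistent mark {mark!r} at position {pos}")
        sites.append((pos, residue))
    return sites


# Invented example; not a PhosphoBase record.
seq = "MKRRASVTYG"
assign = ".....S..Y."
print(phosphosites(seq, assign))   # [(6, 'S'), (9, 'Y')]
```

The consistency check (the mark must equal the residue at that position) is a design choice of the sketch that catches misaligned lines early.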
Phosphorylation section
| PEPTIDE | The first field contains the peptide accession number, which is constructed from the ACCESSION number plus [A-Z]. The second field contains the beginning and ending positions in the natural protein, and the third field contains the actual peptide sequence. The third field instead contains the word `natural' if the protein was examined as a whole or if details about the peptide used are unknown. |
| MUTATION | Mutations from native protein sequence (optional). In case of studies on mutated peptides, this field indicates which residues were changed, e.g., 43(S->T), meaning that serine 43 was changed to threonine. |
| EXPERIMENT | The first field contains the number of the reference pertaining to this experiment, while the second field contains the type and position of the residue being described (e.g., S-43). |
| KM | Value of kinetic constant Km (optional) |
| VMAX | Value of kinetic constant Vmax (optional) |
| KCAT | Value of kinetic constant Kcat (optional) |
| KINASE | Protein kinase which phosphorylates the residue described in EXPERIMENT (optional). Common abbreviations are used. In case of ambiguity, refer to the Help section on the PhosphoBase WWW-pages (see below). |
| ASSAY | Conditions of kinetic experiments, e.g., pH, temperature, enzyme activity (optional) |
| INTERACTION | Possible interaction partners of the phosphorylated residue (e.g., SH2 or PTB domains, other kinases) (optional). |
| EXP_COMMENT | Experimental comment. Comments to indicate any other important details pertaining to EXPERIMENT (optional) |
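
The MUTATION notation `43(S->T)` and the EXPERIMENT site notation `S-43` are regular enough to parse directly. The sketch below assumes that mutation positions are numbered in the natural protein, so a peptide's start position (second PEPTIDE field) is needed to apply a mutation. The peptide, its start position and all function names are invented for illustration.

```python
import re

MUTATION_RE = re.compile(r"([0-9]+)\(([A-Z])->([A-Z])\)")   # e.g. 43(S->T)
SITE_RE = re.compile(r"([STY])-([0-9]+)")                    # e.g. S-43


def parse_mutation(text: str) -> tuple[int, str, str]:
    """Parse '43(S->T)' into (position, original residue, new residue)."""
    match = MUTATION_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized mutation: {text!r}")
    return int(match.group(1)), match.group(2), match.group(3)


def parse_site(text: str) -> tuple[str, int]:
    """Parse 'S-43' into (residue type, position)."""
    match = SITE_RE.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognized site: {text!r}")
    return match.group(1), int(match.group(2))


def apply_mutation(peptide: str, start: int, mutation: tuple[int, str, str]) -> str:
    """Apply a mutation to a peptide whose first residue is protein position `start`."""
    position, old, new = mutation
    index = position - start
    if not 0 <= index < len(peptide) or peptide[index] != old:
        raise ValueError(f"residue {old}{position} not found in peptide")
    return peptide[:index] + new + peptide[index + 1:]


# Invented peptide covering protein positions 39-45.
print(parse_site("S-43"))                                          # ('S', 43)
print(apply_mutation("LRRASLG", 39, parse_mutation("43(S->T)")))   # LRRATLG
```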
Reference section
| REFERENCE | [N] relates to the first field described for each EXPERIMENT section. |
| COMMENT | Overall comments to indicate any other important details (optional) |
FUTURE VERSIONS
At present, PhosphoBase includes data on phosphorylated serine, threonine or tyrosine residues in eukaryotic proteins. However, phosphorylation may also occur on histidine, lysine or arginine residues (12) or may occur in prokaryotic proteins. In any case, the format proposed here will easily handle these cases if needed.
Other relevant subjects which might be incorporated in future versions include information about pseudosubstrate peptides and data on phosphatase interactions at given sites.
ACCESS AND CITATION
PhosphoBase is publicly available on the WWW at http://www.cbs.dtu.dk/databases/PhosphoBase/ . PhosphoBase depends on the quality and use of the data provided. Therefore, we encourage people in the field of phosphorylation or related areas to submit any relevant updates, corrections or new information to PhosphoBase, which will be updated accordingly.
We encourage users of PhosphoBase to cite this paper.
ACKNOWLEDGEMENTS
We thank Kristoffer Rapacki, Hans Henrik Stærfeldt and Kristian de Lichtenberg for competent computer assistance, and Jaak Järv, Mart Loog and Katrin Sak for useful suggestions about the database format. This work was supported by the Danish National Research Foundation.
REFERENCES
Last modification: 17 Dec 1997
Copyright© Oxford University Press, 1998.