| Nucleic Acids Research | Pages |
MHCPEP, a database of MHC-binding peptides: update 1997
Introduction
Description
Entry ID
MHC molecule
Method
Activity
Binding
Source
DB reference
Anchor positions
References
Comment
Summary
Sequence
Accuracy And Completeness Of The Data
Database Access
Further Developments
Acknowledgement
References
ABSTRACT
INTRODUCTION
MHCPEP, a database of peptides that bind to MHC class I or II molecules, was established in 1994 (1). Peptide binding to an MHC binding site is a prerequisite for T-cell recognition of the peptide, although not all binding peptides function as T-cell epitopes. Allele-specific motifs for peptides binding MHC class I are now well documented (2). Although motifs for peptides binding MHC class II have been reported (2-5), they are generally less well defined. The comprehensive compilation of binding peptides in MHCPEP enables the analysis of sequence properties governing binding to MHC molecules and the identification of T-cell epitopes. It should also facilitate research on antigen processing and transport, the mechanisms of T-cell receptor activation and the development of specific approaches to immunotherapy. More than 3000 new entries have been added since the last report (6).
DESCRIPTION
Version 1.3 of MHCPEP has 13 423 entries (as of September 1997) compiled from published sources or directly submitted experimental data. A few peptides binding non-classical MHC molecules (e.g., mouse Qa-2a) are included. The majority of entries contain human or mouse MHC binding peptides. There are also a small number of entries containing peptides that bind rat, rhesus macaque, chimpanzee or goat MHC. Each report of a peptide sequence is assigned to a separate entry identified by a unique entry ID. Entries consist of 12 fields composed of the field name followed by a colon (:) delimiter and the field value. Field values may be textual, numeric or empty. Fields are written one to a row and delimited by a hash (#). Entries are delimited by ellipses (...). Representative entries are shown in Figure 1. A description of each entry field follows.
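The entry layout described above can be read with a few lines of code. The following is a minimal sketch only: it assumes each row has the form `NAME: value#` and that a row containing just `...` closes an entry. The field names and values in `SAMPLE` are hypothetical placeholders, not records copied from MHCPEP.

```python
from typing import Dict, List

# Hypothetical two-entry sample; names and values are placeholders only.
SAMPLE = """\
ENTRY ID: HUM10001#
MHC MOLECULE: HLA-A2#
COMMENT: #
...
ENTRY ID: MUS2000A#
MHC MOLECULE: H-2Kd#
COMMENT: #
...
"""


def parse_entries(text: str) -> List[Dict[str, str]]:
    """Split a flat file into entries, each a dict of field name -> value."""
    entries: List[Dict[str, str]] = []
    current: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line == "...":  # assumed entry delimiter row
            if current:
                entries.append(current)
                current = {}
            continue
        line = line.rstrip("#").rstrip()  # drop the trailing field delimiter
        name, _, value = line.partition(":")  # split at the first colon only
        current[name.strip()] = value.strip()  # empty values stay as ""
    if current:  # tolerate a file that does not end with "..."
        entries.append(current)
    return entries


entries = parse_entries(SAMPLE)
```

Splitting at the first colon only keeps any later colons inside the field value, and an empty field simply maps to an empty string.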
Figure 1. Representative MHCPEP entries.
Entry ID
A unique identifier exists for each entry. The format is: [>organism][class][xxxx]. The `organism' is designated by a three-letter code (e.g. HUM for human, MUS for mouse); `class' is a single-digit number; the right side is a unique four-digit hexadecimal number.
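As an illustration of the identifier format, the sketch below splits an entry ID into its three parts. It assumes the leading `>` shown in the format string is an optional marker; the identifiers used in the usage lines are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class EntryId(NamedTuple):
    organism: str   # three-letter code, e.g. HUM or MUS
    mhc_class: int  # single-digit class number
    serial: int     # integer value of the four-digit hexadecimal number


# Assumption: the ">" in the documented format is an optional prefix.
_ENTRY_ID = re.compile(r"^>?([A-Z]{3})(\d)([0-9A-Fa-f]{4})$")


def parse_entry_id(text: str) -> Optional[EntryId]:
    """Return the parsed identifier, or None if the text does not match."""
    match = _ENTRY_ID.match(text.strip())
    if match is None:
        return None
    organism, mhc_class, serial = match.groups()
    return EntryId(organism, int(mhc_class), int(serial, 16))


human = parse_entry_id("HUM10001")   # hypothetical identifier
mouse = parse_entry_id(">MUS2000A")  # hypothetical identifier
```

Converting the last four characters with base 16 makes identifiers sortable by their numeric value.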
MHC molecule
The designation of the MHC molecule, according to the nomenclature of Klein et al. (7), is followed by the specific allele (where known) within brackets. Human alleles are designated according to the DNA sequence nomenclature (8). This field also shows MHC class and host organism. The format is: [MHC molecule], [MHC class], [(host)].
Method
Peptide binding to MHC molecules can be determined indirectly by T-cell activity based assays or directly by biochemical methods. T-cell recognition of MHC class I-bound peptides is usually detected by cytotoxicity assay, and of MHC class II-bound peptides by proliferation assay. Biochemical methods include stabilization assays, competitive inhibition assays to purified MHC molecules or cells bearing MHC, or elution followed by sequencing.
Activity
The `activity' of a peptide is a semi-quantitative measure of its immunogenic `potency'. For an MHC class I-bound peptide, `activity' is a measure of the extent of lysis by cytotoxic T-cells of target cells displaying the MHC class I-peptide complexes. A peptide is considered immunogenic if it mediates killing of at least 15% of the cells that display it. The `activity' is expressed as the PD50 (PD, peptide dose), the concentration of peptide giving 50% of maximum specific lysis, and is given a descriptive value of none, little, moderate, high, immunogenic-not-quantified or unknown. A PD50 > 10 µM is considered non-immunogenic and is assigned `none' as the value of the field `ACTIVITY'. For an MHC class II-bound peptide, `activity' is a measure of the extent of T-cell proliferation induced by cells displaying the MHC class II-peptide complexes. Again, `activity' is expressed as PD50, now defined as the concentration of peptide giving 50% of maximum proliferation. The range of values of the field `ACTIVITY' is given in Table 1.
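The mapping from PD50 to the descriptive `ACTIVITY' value can be sketched as a small function using the ranges listed in Table 1. This is an illustration, not the curators' procedure: the table does not say which category a value exactly on a boundary belongs to, so the choice made here (boundaries go to the lower-potency side, except 10 µM itself) is an assumption, as are the function and parameter names.

```python
from typing import Optional


def activity_value(pd50_molar: Optional[float],
                   known_immunogenic: bool = False) -> str:
    """Map a PD50 in mol/L to a descriptive ACTIVITY value (Table 1 ranges)."""
    if pd50_molar is None:
        # No PD50 reported: only the immunogenic/unknown distinction remains.
        return "yes, ?" if known_immunogenic else "?"
    if pd50_molar > 10e-6:       # above 10 uM
        return "none"
    if pd50_molar >= 100e-9:     # 10 uM down to 100 nM (boundary: assumption)
        return "yes, little"
    if pd50_molar >= 1e-9:       # 100 nM down to 1 nM (boundary: assumption)
        return "yes, moderate"
    return "yes, high"           # below 1 nM


examples = [activity_value(50e-6), activity_value(1e-6),
            activity_value(5e-9), activity_value(0.5e-9)]
```

Passing the concentration in mol/L avoids mixing µM and nM units in the comparisons.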
Binding
It is assumed that all `active' peptides also bind; if no measure of binding is reported for an `active' peptide, the value `yes, ?' is assigned to the field `BINDING'. As several different methods exist for determining binding affinity, only a descriptive value is assigned to the data; the user should consult the original source for more specific details. The same scale as for `activity' is used (none, little, moderate, high, unknown).
Source
MHC binding peptides are fragments of larger proteins. This field indicates the parent protein with the start and end positions of the fragment. Synthetic peptides (e.g., those generated by mutations of a naturally occurring sequence) are designated by the word `homologue'.
DB reference
This field specifies the name of the source protein(s) as it appears in the major protein databases: SWISS-PROT (9) and PIR (10), versions 34.0 and 53.0, respectively. These databases have been searched for sequences that match the MHC peptide entry sequence. This field may spread over several lines of text. The continuation of the field is designated by an ampersand (&) as the first character in the line.
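The continuation rule can be illustrated by folding each row that starts with an ampersand into the row before it. This is a sketch under assumptions: the field name and protein names in `rows` are hypothetical, and joining the pieces with a single space is a choice made here, not something the format description specifies.

```python
from typing import List


def merge_continuations(rows: List[str]) -> List[str]:
    """Append every row starting with '&' to the row that precedes it."""
    merged: List[str] = []
    for row in rows:
        if row.startswith("&") and merged:
            # Drop the ampersand and join with one space (assumed separator).
            merged[-1] = merged[-1] + " " + row[1:].strip()
        else:
            merged.append(row)
    return merged


# Hypothetical rows; the names are placeholders, not database content.
rows = [
    "DB REFERENCE: PROTEIN_A (hypothetical name)",
    "& PROTEIN_B (hypothetical name)",
    "SUMMARY: placeholder",
]
merged_rows = merge_continuations(rows)
```

Running this step before splitting rows into name and value keeps a multi-line field as one logical row.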
Anchor positions
Presumed anchor residues are numbered relative to the N-terminus of the peptide sequence. The main criterion for determining the values for this field is conformance to proposed binding motifs.
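Because anchor positions are counted from the N-terminus, position 1 is the first residue of the peptide. The sketch below looks up the residues at given anchor positions; the nonamer and the positions in the usage line are illustrative values chosen here, not an entry taken from the database.

```python
from typing import Dict, Iterable


def anchor_residues(sequence: str, positions: Iterable[int]) -> Dict[int, str]:
    """Return {position: residue} for 1-based positions from the N-terminus."""
    peptide = sequence.rstrip("*")  # tolerate a trailing terminator asterisk
    result: Dict[int, str] = {}
    for pos in positions:
        if not 1 <= pos <= len(peptide):
            raise ValueError(f"position {pos} is outside the peptide")
        result[pos] = peptide[pos - 1]  # convert 1-based to 0-based index
    return result


# Illustrative nonamer with assumed anchors at positions 2 and 9.
anchors = anchor_residues("GILGFVFTL*", [2, 9])
```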
References
A separate list of references to the published sources of entry sequences is supplied with the database. The value of the `
Comment
This field is reserved for any relevant comments or observations.
Summary
The summary field is a one-line description of the main fields of an entry (MHC molecule, activity, binding and peptide sequence), which is useful for rapid indexing of the database.
Sequence
The sequence of the peptide is the one actually reported, not the minimum or optimum sequence. Therefore, a given T-cell epitope may be found within several entries representing different sequences which overlap or include it. The value for this field has an asterisk (*) following the C-terminal residue. The letter `X' is used to represent ambiguity or an unknown residue. In some instances it is not possible to distinguish certain amino acids, e.g., tandem mass spectrometry does not distinguish leucine (L) and isoleucine (I). In such cases, separate entries are created and the ambiguity is noted in the `COMMENT' field.
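A reader of the sequence field has to remove the terminating asterisk and take note of any `X' residues. The following sketch does both; it assumes one-letter codes for the 20 standard amino acids plus `X', and the sequences in the usage lines are placeholder strings rather than database entries.

```python
from typing import List, Tuple

STANDARD_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def read_sequence(field_value: str) -> Tuple[str, List[int]]:
    """Return the peptide and the 1-based positions of unknown (X) residues."""
    value = field_value.strip()
    if not value.endswith("*"):
        raise ValueError("sequence value should end with an asterisk")
    peptide = value[:-1]  # drop the asterisk after the C-terminal residue
    invalid = set(peptide) - STANDARD_RESIDUES - {"X"}
    if invalid:
        raise ValueError(f"unexpected characters: {sorted(invalid)}")
    unknown = [i + 1 for i, residue in enumerate(peptide) if residue == "X"]
    return peptide, unknown


clean = read_sequence("ACDEFGHIK*")  # placeholder sequence
ambiguous = read_sequence("AXKL*")   # placeholder with one unknown residue
```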

Table 1. Range of values of the field `ACTIVITY'

| PD50 | Value |
|---|---|
| > 10 µM | none |
| 10 µM-100 nM | yes, little |
| 100 nM-1 nM | yes, moderate |
| < 1 nM | yes, high |
| Immunogenic but unknown | yes, ? |
| Immunogenicity unknown | ? |
ACCURACY AND COMPLETENESS OF THE DATA
MHCPEP is largely compiled from published reports. However, numerous potential sources of error exist. Double-checking, comparison with original papers, comparison with other databases and multiple entry of the same sequence from different sources have been used to minimise errors. Some entries describing the same peptide may have different values in fields other than `SEQUENCE' when derived from independent sources. Differences between T-cell clones used in experiments have not been considered; in cases where an MHC-bound peptide is recognised by any of the clones it is entered into the database. Observations regarding clones which do or do not recognise the peptide are included in the comment field. The database has a degree of redundancy reflecting the variety of ways of detecting MHC-bound peptides. Earlier reports of MHC binding peptides were less specific: reported peptides were usually longer than the optimum size and the fine specificity of MHC molecules was not determined. The quality of data in recent reports has improved.
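The redundancy described above can be inspected by grouping entries that report the same peptide sequence. This is a minimal sketch, assuming entries have already been read into dictionaries; the keys `ENTRY ID` and `SEQUENCE` and all values shown are hypothetical placeholders.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_sequence(entries: List[Dict[str, str]]) -> Dict[str, List[str]]:
    """Return {sequence: [entry IDs]} for sequences reported more than once."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for entry in entries:
        sequence = entry["SEQUENCE"].rstrip("*")  # ignore the terminator
        groups[sequence].append(entry["ENTRY ID"])
    return {seq: ids for seq, ids in groups.items() if len(ids) > 1}


# Hypothetical entries; identifiers and sequences are placeholders.
sample_entries = [
    {"ENTRY ID": "HUM10001", "SEQUENCE": "ACDEFGHIK*"},
    {"ENTRY ID": "HUM10002", "SEQUENCE": "ACDEFGHIK*"},
    {"ENTRY ID": "MUS20001", "SEQUENCE": "LMNPQRSTV*"},
]
repeated = group_by_sequence(sample_entries)
```

Entries grouped this way can then be compared field by field to see where independent sources disagree.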
DATABASE ACCESS
MHCPEP is accessible via the Internet by WWW or FTP at the following WEHI addresses, respectively: http://wehih.wehi.edu.au/mhcpep/ and ftp.wehi.edu.au/pub/biology/mhcpep/. The Gopher access option is no longer available.
MHCPEP has been linked with SWISS-PROT and PIR databases and also with references file via the `DB
Figure
Authors who wish to cite MHCPEP should quote this paper as the reference. For queries and comments regarding the MHCPEP database contact Vladimir Brusic (preferably by electronic mail) at the address given at the start of this paper.

FURTHER DEVELOPMENTS
A summary of MHCPEP contents is given in Table 2. The growing number of peptides known to bind to a specific MHC molecule facilitates the building of predictive models for identifying novel T-cell epitope candidate peptides. Predictive models utilizing MHCPEP have been successfully applied in cancer (13) and autoimmunity (14,15) research.
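Table 2. Summary of MHCPEP contents (number of entries)

| HOST/allelic region | MHC Class I | MHC Class II |
|---|---|---|
| Human | 4617 | 5394 |
| /HLA-A | 2530 | - |
| /HLA-B | 1894 | - |
| /HLA-C | 64 | - |
| /HLA-DR | - | 4545 |
| /HLA-DP | - | 106 |
| /HLA-DQ | - | 550 |
| Mouse | 1145 | 2213 |
| /H-2K | 682 | - |
| /H-2D | 266 | - |
| /H-2L | 120 | - |
| /H-2A | - | 1252 |
| /H-2E | - | 666 |
| Rat | 15 | 8 |
| Chimpanzee | 13 | 2 |
| Rhesus Macaque | 9 | 5 |
| Goat | 0 | 2 |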
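As a quick consistency check, the host-level counts in Table 2 can be summed and compared with the 13 423 entries reported for version 1.3. The snippet below only restates the table's host rows; the variable names are chosen here for illustration.

```python
# Host-level rows of Table 2: (MHC class I entries, MHC class II entries).
TABLE_2 = {
    "Human": (4617, 5394),
    "Mouse": (1145, 2213),
    "Rat": (15, 8),
    "Chimpanzee": (13, 2),
    "Rhesus Macaque": (9, 5),
    "Goat": (0, 2),
}

class_i_total = sum(class_i for class_i, _ in TABLE_2.values())
class_ii_total = sum(class_ii for _, class_ii in TABLE_2.values())
grand_total = class_i_total + class_ii_total

print(class_i_total, class_ii_total, grand_total)  # prints 5799 7624 13423
```

The grand total matches the entry count given in the description of version 1.3.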
Knowledge of MHC-peptide interactions continues to expand rapidly, together with the number of methods for determining binding peptides. Combining data generated by diverse sources imposes additional standardization requirements on further development. The main considerations in the improvement of MHCPEP include: (i) retrieval of data, (ii) internal data cleansing, (iii) linkage to other databases containing MHC- or antigen-related data, and (iv) extraction of high-level relationships hidden within data. Optimizing these requires a more complex structure than the current MHCPEP. We are investigating the utility of a structure that integrates a knowledge base (KB), a WWW interface, and a set of computational tools similar to the RIBOWEB system (16). A KB comprising structured hierarchical representations of MHC data, methods and literature sources is currently being developed. This set of computational tools should facilitate further applications of the database.
ACKNOWLEDGEMENT
We thank Russ Altman of Stanford University for helpful suggestions and guidance about Knowledge Base development.
REFERENCES
Last modification: 17 Dec 1997
Copyright© Oxford University Press, 1998.