Nucleic Acids Research
The Protein Mutant Database
Introduction
Description
Viewing And Retrieving The System
Show mutated sequences
Show 3D structure
Sequence homology search
Summary of mutants at a certain site
Future Direction
Acknowledgements
References
ABSTRACT
INTRODUCTION
The Protein Mutant Database (PMD) is a compilation of protein mutant data, providing information on the functional and/or structural effects of amino acid mutations at specific positions of a protein (1). Among mutant databases, PMD is unique in two respects: (i) almost all proteins are included, except for natural mutants of the globin and immunoglobulin families; (ii) natural as well as artificial mutants are covered, including random and site-directed mutants. PMD data have been extracted from literature published from the 1970s up to the middle of 1995. More than 10 000 articles are now recorded, comprising about 81 000 protein mutants. When the project started in 1989, database construction, i.e., reading articles, extracting the necessary information and keying in the data, was carried out at the Protein Engineering Research Institute in Osaka. Since April 1997, all of these tasks have been carried out at the National Institute of Genetics in Mishima. Table 1 summarizes the proteins, categorized by PIR superfamily, that appear most frequently in PMD, showing the variety of proteins contained in our database.
The data compiled in PMD are based on published literature, not on proteins. That is, each entry in the database corresponds to one article, which may describe several or many protein mutants. Each database entry is identified by a serial number and is defined as either natural or artificial, depending on the type of mutation. For each entry the following items are recorded: `JOURNAL'; `TITLE'; `CROSS-
DESCRIPTION
Table 1. Protein superfamilies (PIR) appearing most frequently in PMD
| | Protein superfamily | Entry^a | Mutant |
|---|---|---|---|
| 1 | ras transforming protein | 162 | 767 |
| 2 | insulin | 155 | 628 |
| 3 | antithrombin III | 143 | 856 |
| 4 | cytochrome P450 | 131 | 691 |
| 5 | NGF receptor repeat homology | 116 | 760 |
| 6 | globin | 115 | 583 |
| 7 | vertebrate rhodopsin | 111 | 753 |
| 8 | insulin receptor | 109 | 393 |
| 9 | coagulation factor X | 107 | 691 |
| 10 | cytochrome c | 104 | 691 |
| 11 | leucine-rich α-2-glycoprotein repeat homology | 100 | 601 |
| 12 | apolipoprotein A-I | 99 | 428 |
| 13 | phage T4 lysozyme | 80 | 520 |
| 14 | poliovirus genome polyprotein | 73 | 485 |
| 15 | cellular tumor antigen p53 | 73 | 672 |
| 16 | acetylcholine receptor | 72 | 477 |
| 17 | pol polyprotein | 70 | 949 |
| 18 | lysozyme c | 70 | 358 |
| 19 | bacteriorhodopsin | 70 | 430 |
| 20 | cystic fibrosis transmembrane conductance regulator | 69 | 349 |
Figure 1. Schematic diagram of the viewing and retrieving system of PMD. Three databases (PMD, PIR and PDB) are integrated into the system.
Figure 2. Sample entries of PMD displayed in the viewing system. (a) A plain PMD entry. (b) The entry displayed after clicking `SHOW WITH SEQUENCE'. Mutation sites are displayed in red. (c) The 3D structure window opened by clicking `SHOW 3D STRUCTURE', with mutation sites displayed in yellow.
Since 1997, we have extended the `CHANGE' description in two ways: (i) the type of operation is added to the header `CHANGE', for example, `CHANGE-POINT' stands for a point mutation and `CHANGE-DELETE' for a sequence deletion; (ii) `CHANGE-CHIMERA' indicates a chimeric protein. In this case, the entire sequence of the chimera is shown explicitly, as it is difficult to describe the sequence in operational words. Details of the description are given on our web page (http://pmd.ddbj.nig.ac.jp).
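As a rough illustration of how a program might consume the extended header, the sketch below splits a header such as `CHANGE-POINT' into its operation type. Only the three operation names mentioned in the text are used. The function name `parse_change_header` and the handling of a bare `CHANGE' header are assumptions made for this example and are not part of the PMD format definition.

```python
# Operation types named in the text. The complete list is documented on the
# PMD web page and is NOT reproduced here.
KNOWN_OPERATIONS = {
    "POINT": "point mutation",
    "DELETE": "sequence deletion",
    "CHIMERA": "chimeric protein",
}


def parse_change_header(header):
    """Return (operation, description) for a header such as 'CHANGE-POINT'.

    Hypothetical helper: a bare 'CHANGE' header yields (None, 'unspecified');
    suffixes not listed above are kept but described as 'unknown'.
    """
    header = header.strip()
    if header == "CHANGE":
        return None, "unspecified"
    prefix, separator, operation = header.partition("-")
    if prefix != "CHANGE" or not separator or not operation:
        raise ValueError(f"not a CHANGE header: {header!r}")
    return operation, KNOWN_OPERATIONS.get(operation, "unknown")


print(parse_change_header("CHANGE-DELETE"))
```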
VIEWING AND RETRIEVING THE SYSTEM
Recently, we developed a viewing and retrieval system for PMD that is integrated with the sequence database PIR (2) and the tertiary structure database PDB (3), and that has a World Wide Web interface (http://pmd.ddbj.nig.ac.jp). The relationship between the three databases is shown schematically in Figure
Show mutated sequences
A PMD entry records altered sequences only in operational words, such as `Cys 117 Ser' or `Ser-Ala 84-94 AWEKDL' (Fig.
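Expanding an operational word into a full mutated sequence is a mechanical step. The following minimal sketch assumes that a point substitution such as `Cys 117 Ser' reads as wild-type residue, 1-based position, replacement residue. The function name `apply_point_mutation` is hypothetical, and the range form (`Ser-Ala 84-94 AWEKDL') is deliberately not handled because its exact semantics are not spelled out here.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}


def apply_point_mutation(wild_type, change):
    """Apply a change such as 'Cys 117 Ser' to a one-letter sequence.

    Assumed layout (not confirmed by the PMD format description):
    '<wild-type residue> <1-based position> <new residue>'.
    """
    parts = change.split()
    if len(parts) != 3:
        raise ValueError(f"unsupported change description: {change!r}")
    old_name, position_text, new_name = parts
    position = int(position_text)
    if not 1 <= position <= len(wild_type):
        raise ValueError(f"position {position} is outside the sequence")
    index = position - 1
    # Guard against applying the change to the wrong sequence or numbering.
    if wild_type[index] != THREE_TO_ONE[old_name]:
        raise ValueError(
            f"expected {old_name} at {position}, found {wild_type[index]}"
        )
    return wild_type[:index] + THREE_TO_ONE[new_name] + wild_type[index + 1:]


# Toy sequence, not a real protein.
print(apply_point_mutation("MKCA", "Cys 3 Ser"))
```

Checking the wild-type residue before substituting catches numbering mismatches between the article and the stored sequence, which is the most likely failure when changes are described only in words.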
Show 3D structure
If a tertiary structure of the wild-type sequence has been experimentally determined, the 3D structure is displayed with the mutation sites shown in a different color. The sequence is linked to any of the 3D structures in PDB that have more than 50% sequence identity with it. A structure is shown as a Cα wire-frame model, and the mutated sites are displayed in yellow. Various color schemes that highlight secondary structures or the solvent-accessible surface (6) are available. A sample of the 3D structure view is shown in Figure
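The 50% identity criterion can be illustrated with a short sketch. It assumes that pairwise alignments between the wild-type sequence and candidate structures are already available as gapped strings, and it defines identity as matches divided by aligned (non-gap) columns. That definition, the function names and the structure identifiers are assumptions for this example; the text does not state how the system computes identity.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity over columns where neither sequence has a gap ('-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" or b == "-":
            continue  # gap columns are ignored under this definition
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def linkable_structures(alignments, threshold=50.0):
    """Return identifiers whose identity is strictly above the threshold.

    alignments maps a (hypothetical) structure identifier to a pair
    (aligned wild-type sequence, aligned structure sequence).
    """
    return [
        structure_id
        for structure_id, (query, target) in alignments.items()
        if percent_identity(query, target) > threshold
    ]


candidates = {
    "structure_1": ("ACDE", "ACDF"),  # 3 of 4 columns identical
    "structure_2": ("ACDE", "WCDF"),  # 2 of 4 columns identical
}
print(linkable_structures(candidates))
```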
Sequence homology search
A sequence homology search against the PMD database can be carried out with any query sequence pasted into the input area of our web page. Using this function, it is easy to find entries that have related sequences. The search is performed against wild-type sequences, not against mutated sequences. The program for the sequence homology search was written by one of the authors (T.K.). The algorithm is based on the standard alignment technique (4) and ktup filtering (5). The user can choose a sequence identity threshold between 30 and 100%. The result of a search is displayed as a multiple alignment of wild-type sequences, with the mutated sites shown in a different color. A sample of the result is shown in Figure
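The general idea of k-tuple (ktup) filtering can be sketched as follows: sequences that share too few short words with the query are discarded before any full alignment is attempted. This is a generic illustration, not the authors' program. The word length `k`, the cutoff `min_shared`, and the function names are assumptions chosen for the example.

```python
def ktuples(sequence, k):
    """Return the set of all overlapping words of length k in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def ktup_filter(query, database, k=2, min_shared=2):
    """Keep database sequences sharing at least min_shared k-tuples with query.

    database maps a name to a wild-type sequence. Returns (name, shared count)
    pairs, best first. Survivors would then go to a full alignment step.
    """
    query_words = ktuples(query, k)
    hits = []
    for name, sequence in database.items():
        shared = len(query_words & ktuples(sequence, k))
        if shared >= min_shared:
            hits.append((name, shared))
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return hits


# Toy sequences, not real PMD entries.
toy_database = {"entry_a": "MKVLG", "entry_b": "GGGGG"}
print(ktup_filter("MKVLA", toy_database))
```

In a complete pipeline, each surviving sequence would be aligned to the query and retained only if its sequence identity meets the user-selected threshold (30 to 100%).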
Figure 3. Sample results of the sequence search against PMD. (a) The first page of the search results. Each entry is represented by its own wild-type sequence, whose mutated sites are displayed in red. (b) A summary of mutations at a specified site, which appears on clicking the bottom line of the first page.
Summary of mutants at a certain site
By clicking a site in the sequence homology search result, a summary of mutants at that site can be generated from all related PMD entries. An example is shown in Figure
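A per-site summary of this kind amounts to grouping mutant records by position. The sketch below assumes that each record is a tuple of entry identifier, position, wild-type residue and mutant residue, and that positions from different entries have already been mapped onto a common numbering through the alignment. The record layout and the function name `summarize_site` are hypothetical.

```python
from collections import Counter


def summarize_site(mutants, position):
    """Tally substitutions observed at one position across entries.

    mutants: iterable of (entry_id, position, wild_type, mutant) tuples,
    with positions assumed to share one coordinate system.
    Returns (Counter keyed by (wild_type, mutant), sorted list of entry ids).
    """
    substitutions = Counter()
    entries = set()
    for entry_id, site, wild_type, mutant in mutants:
        if site == position:
            substitutions[(wild_type, mutant)] += 1
            entries.add(entry_id)
    return substitutions, sorted(entries)


# Toy records, not taken from PMD.
records = [
    ("E1", 117, "C", "S"),
    ("E2", 117, "C", "A"),
    ("E2", 117, "C", "S"),
    ("E3", 84, "S", "A"),
]
counts, entry_ids = summarize_site(records, 117)
print(counts.most_common(), entry_ids)
```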
FUTURE DIRECTION
As the amount of literature concerning mutant proteins increases every year, the task of constructing the database is becoming more difficult. We are currently about three years behind, dealing with articles published in 1995. One way to overcome this problem is to limit the proteins that are reviewed. We are now planning to deal primarily with proteins of known structure, in order to cut down the number of articles to be handled. This would reduce the amount of data to one third.
Another problem is the complexity of mutation data. In the early stages of the site-directed mutagenesis technique, simple amino acid substitutions were the main form of protein mutation. Now, however, more complicated and/or larger-scale mutations are frequently introduced into natural proteins, and even de novo designed proteins are synthesized. This trend makes it technically difficult to express the alterations in a mutant protein relative to the wild-type protein.
In addition, the concept of a mutant protein itself is becoming less clear-cut, as de novo proteins illustrate. De novo designed proteins, for which a wild type cannot be uniquely defined, are excluded, since a mutant in PMD must be described relative to a natural protein. On the same principle, mutations introduced into chimeric proteins are excluded, although simple chimeras made of two natural proteins are included.
ACKNOWLEDGEMENTS
We are indebted to the PMD staff, Kimiko Mimura, Naoko Nakayama, Minako Kuromaru, Kayoko Yamamoto and Rika Kadowaki for constructing the database. The work was supported by a grant-in-aid from the Ministry of Education, Science, Sports and Culture, Japan.
REFERENCES
Copyright©Oxford University Press, 1998.