Nucleic Acids Research, 2003, Vol. 31, No. 13 3367-3369
© 2003 Oxford University Press
MATRAS: a program for protein 3D structure comparison
Graduate School of Information Science, Nara Institute of Science and Technology, Takayama 8916-5, Ikoma, Nara, Japan
*Tel: +81 743725396; Fax: +81 743725391; Email: takawaba{at}is.aist-nara.ac.jp
Received February 15, 2003; Revised and Accepted April 7, 2003
| ABSTRACT |
|---|
The recent accumulation of large amounts of 3D structural data warrants a sensitive and automatic method to compare and classify these structures. We developed a web server for comparing protein 3D structures using the program Matras (http://biunit.aist-nara.ac.jp/matras). An advantage of Matras is its structure similarity score, which is defined as the log-odds of the probabilities, similar to Dayhoff's substitution model of amino acids. This score is designed to detect evolutionarily related (homologous) structural similarities. Our web server has three main services. The first is a pairwise 3D alignment, which simply aligns two structures. A user can specify the structures either by entering PDB codes or by uploading PDB format files from the local machine. The second service is a multiple 3D alignment, which compares several protein structures. This program employs the progressive alignment algorithm, in which pairwise 3D alignments are assembled in the proper order. The third service is a 3D library search, which compares one query structure against a large number of library structures. We hope this server provides useful tools for gaining insights into protein 3D structures.
| INTRODUCTION |
|---|
The comparison of protein 3D structures is an important technique in structural biology. Because structural features are conserved in evolution, structural similarities provide biologically and evolutionarily interesting insights and help us to predict molecular functions from structures. Recently, the growth of the Protein Data Bank (PDB) has been accelerated by large-scale structure determination projects, called structural genomics (1), and thus automatic comparison of 3D structures has become more important for taking advantage of the huge amount of structural data.
We now report a new server for protein 3D structure comparisons (http://biunit.aist-nara.ac.jp/matras). Several automatic servers for protein structure comparisons are already available. Among them, the DALI server (http://www2.ebi.ac.uk/dali/) (2) is the most popular; structural biologists routinely use it after solving structures experimentally. Other servers, such as VAST (http://www.ncbi.nlm.nih.gov/Structure/VAST/vast.shtml) (3), SSAP (http://www.biochem.ucl.ac.uk/cgi-bin/cath/GetSsapRasmol.pl) (4) and CE (http://cl.sdsc.edu/ce.html) (5), are also available, each with its own unique features. We believe that our server has two advantages over the other sites. The first is its novel structural similarity score, which is defined as the log-odds of two probabilities (6), using a scheme similar to Dayhoff's amino acid substitution score (7). It is designed to detect homologous similarity sensitively. The second is that, besides the 3D library search, our server offers various other structure comparison methods, such as multiple 3D structure alignment and self 3D structure alignment.
| BASIC METHOD |
|---|
We will briefly explain the outline of our structure comparison method Matras (MArkov TRAnsition of protein Structure evolution) (6). Our structure similarity score is based on the following log-odds formula:
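As a reading aid, a log-odds score of this Dayhoff-like type generally takes the form sketched below. This is an illustration rather than a transcription of the authors' equation: the background probability $P(j)$ of observing state $j$ by chance, its use as the denominator, and the base of the logarithm are all assumptions that are not stated in this text.

```latex
S(i, j) = \log \frac{P(i \to j)}{P(j)}
```

Under this form, a positive score means that the change from state $i$ to state $j$ occurs more often between related structures than would be expected by chance, and a negative score means it occurs less often.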
[Equation 1]
where P(i → j) is the probability that state i changes to state j during evolution, which is obtained using the Markov transition model. The definition of our score is similar to Dayhoff's model of amino acid substitution (7). We used three kinds of scores, but the final evaluation is done by the distance score Sdis, which depends on the distance between the Cβ atoms. The alignment is done by a hierarchical alignment heuristic, in which the SSEs (secondary structure elements) are first aligned, and then a residue-based alignment is iteratively performed using the previous alignment. The details were described in our previous paper (6).
| PAIRWISE 3D ALIGNMENT |
|---|
Comparing and aligning two structures is the most basic procedure in structural comparison. Other structural comparisons, such as multiple alignments and library searches, were developed based on the pairwise 3D alignment. On our web page, a user can specify structures either by entering the PDB code or by uploading PDB format files from the user's computer. An alignment, the superimposed structures, and various measures of structural similarity, such as the raw score, RMSD and sequence identity, are shown. The Z-score is the most sensitive value for detecting homology (6), but when only two structures are aligned, statistical parameters from library searches are not available. To evaluate pairwise similarities, we introduced the following R-score (%):
[Equation 2]
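Two of the simpler similarity measures mentioned above, sequence identity and RMSD, can be illustrated with a short sketch. It assumes that the alignment is already available as a list of matched residue pairs and that the two coordinate sets have already been superimposed. The function names and toy data are hypothetical and are not part of Matras, and identity computed over aligned pairs is only one common convention.

```python
import math


def sequence_identity(aligned_pairs):
    """Percentage of aligned residue pairs that carry the same amino acid."""
    if not aligned_pairs:
        return 0.0
    same = sum(1 for a, b in aligned_pairs if a == b)
    return 100.0 * same / len(aligned_pairs)


def rmsd(coords_a, coords_b):
    """Root mean square deviation over matched atoms.

    Assumes the two structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Toy data (hypothetical): four aligned residue pairs, two matched atoms.
pairs = [("A", "A"), ("L", "I"), ("G", "G"), ("K", "R")]
atoms_a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0)]
atoms_b = [(0.0, 0.0, 1.0), (3.8, 0.0, -1.0)]

identity = sequence_identity(pairs)
deviation = rmsd(atoms_a, atoms_b)
```

In practice the superposition itself would be computed first (for example by a least-squares fit over the aligned residues); that step is deliberately left out of this sketch.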
| MULTIPLE 3D ALIGNMENT |
|---|
A multiple 3D alignment compares several structures belonging to the same superfamily, which provides important biological insights such as conserved sites or conserved structural features. However, it is well known that the multiple sequence alignment problem is difficult to solve exactly, and the corresponding problem for 3D structures must be much more difficult because of the multi-body properties of 3D structures. To solve the problem within a reasonable computational time, we used the progressive alignment algorithm, which is the most popular heuristic for multiple sequence alignment (8). The progressive alignment consists of the following three steps: (i) calculate pairwise 3D alignments and similarities for all of the protein pairs; (ii) construct a guide tree using the R-score (Eq. 2) by the UPGMA method; (iii) starting from the leaf nodes of the guide tree, progressively align all of the nodes, in order of decreasing similarity. To align one group with another, all of the protein pairs between the two groups are tried, and the best pairwise alignment determines the alignment of the two groups. In other words, our multiple 3D alignment is performed by assembling the results of the pairwise 3D alignments in the proper order. Using our web server, a user can compare up to 10 structures. The server also shows the superimposed structures and a dendrogram of structural similarities (Fig. 1).
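Steps (ii) and (iii) can be sketched as follows. This is a minimal illustration and not the Matras implementation: it assumes the pairwise similarities from step (i) are already available as percentages, builds the merge order of an average-linkage (UPGMA) tree, and assumes that the "best" pair between two groups is the one with the highest pairwise similarity. All names and the toy scores are hypothetical.

```python
from itertools import combinations


def upgma_merge_order(names, similarity):
    """Return the cluster merges of a UPGMA guide tree, most similar first.

    similarity maps frozenset({a, b}) to a pairwise score (higher = more similar).
    """
    clusters = [(name,) for name in names]
    merges = []

    def avg_sim(c1, c2):
        # UPGMA: unweighted mean over all cross-cluster member pairs.
        total = sum(similarity[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > 1:
        # Join the two clusters with the highest average similarity.
        c1, c2 = max(combinations(clusters, 2), key=lambda p: avg_sim(*p))
        merges.append((c1, c2, avg_sim(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


def best_bridging_pair(group1, group2, similarity):
    """Pick the cross-group protein pair with the highest pairwise similarity."""
    candidates = ((a, b) for a in group1 for b in group2)
    return max(candidates, key=lambda p: similarity[frozenset(p)])


# Toy pairwise similarities in percent (hypothetical values).
scores = {
    frozenset(("A", "B")): 90.0, frozenset(("C", "D")): 80.0,
    frozenset(("A", "C")): 40.0, frozenset(("A", "D")): 30.0,
    frozenset(("B", "C")): 50.0, frozenset(("B", "D")): 20.0,
}
merges = upgma_merge_order(["A", "B", "C", "D"], scores)
bridge = best_bridging_pair(("A", "B"), ("C", "D"), scores)
```

With these toy scores, A and B are joined first, then C and D, and the two groups are finally joined through the B-C pair, whose pairwise alignment would then fix the alignment between the groups.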
| 3D LIBRARY SEARCH |
|---|
This service searches a large number of library structures for those similar to a query structure. Among the services of our server, only this search returns the result by email, because it requires a long computational time (20–40 min). A user can upload a PDB file as a query structure, and the result will contain a list of similar library structures ranked by the Z-score, together with all of the pairwise alignments between the query and the similar library structures (shown in Fig. 2). Two kinds of library sets are available: the PDB representative list (updated weekly) and the SCOP (9) domain representative list. The latter list is useful when a user wants to know the domain configuration of a query structure.
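The ranking step can be illustrated with a small sketch. It uses the textbook definition of a Z-score, (score − mean) / standard deviation over the library scores; how Matras actually estimates these statistical parameters is described in reference (6) and is not reproduced here. The function name, library identifiers and scores are hypothetical.

```python
import statistics


def rank_by_z_score(raw_scores):
    """Rank library hits by Z-score = (score - mean) / standard deviation.

    raw_scores maps a library structure identifier to its raw similarity score.
    """
    values = list(raw_scores.values())
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # population standard deviation
    if sd == 0:
        raise ValueError("all scores are identical; Z-scores are undefined")
    z_scores = {name: (score - mean) / sd for name, score in raw_scores.items()}
    # Highest Z-score first, i.e. the most significant hit at the top.
    return sorted(z_scores.items(), key=lambda item: item[1], reverse=True)


# Toy raw scores of one query against four library structures (hypothetical).
library_scores = {"lib1": 10.0, "lib2": 20.0, "lib3": 30.0, "lib4": 100.0}
ranked = rank_by_z_score(library_scores)
```

A hit that scores far above the bulk of the library, such as `lib4` here, receives a high Z-score and appears at the top of the list.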
| OTHER SERVICES |
|---|
The Matras server offers two other services. The first is a self 3D alignment, which finds internal similarities within one protein structure and is useful for revealing repeated structures in proteins. The second is a standard sequence homology search against the PDB using the BLAST program (10), with a graphical representation of the aligned regions.
| FUTURE PLANS |
|---|
We are now preparing to distribute the Matras source code for users who wish to run it in a stand-alone environment. We also plan to develop a web database containing all the results of the automatic structure classifications and multiple 3D alignments calculated by Matras.
| ACKNOWLEDGEMENTS |
|---|
We thank Dr Nozomi Nagano, Dr Keiko Matsuda, Dr Kensuke Nakamura and two reviewers for their useful critical comments about the Matras server. This work was supported by the Special Coordination Funds Promoting Science and Technology, and the Grant-in-Aid for Scientific Research on Priority Area (C), Genome Information Science, from the MEXT (Ministry of Education, Culture, Sports, Science and Technology, Japan).
| REFERENCES |
|---|
1. Westbrook,J., Feng,Z., Li,C., Yang,H. and Berman,H.M. (2003) The Protein Data Bank and structural genomics. Nucleic Acids Res., 31, 489–491.
2. Holm,L. and Sander,C. (1993) Protein structure comparison by alignment of distance matrices. J. Mol. Biol., 233, 123–138.
3. Gibrat,J.F., Madej,T. and Bryant,S.H. (1996) Surprising similarities in structure comparison. Curr. Opin. Struct. Biol., 6, 377–385.
4. Orengo,C.A., Brown,N.P. and Taylor,W.R. (1992) Fast structure alignment for protein databank searching. Proteins, 14, 139–167.
5. Shindyalov,I.N. and Bourne,P.E. (1998) Protein structure alignment by incremental combinatorial extension (CE) of the optimal path. Protein Eng., 11, 739–747.
6. Kawabata,T. and Nishikawa,K. (2000) Protein structure comparison using the Markov transition of evolution. Proteins, 41, 108–122.
7. Dayhoff,M.O., Schwartz,R.M. and Orcutt,B.C. (1978) A model of evolutionary change in proteins. In Dayhoff,M.O. (ed), Atlas of Protein Sequence and Structure, Vol. 5, Suppl. 3. National Biomedical Research Foundation, Washington DC, pp. 345–352.
8. Feng,D.F. and Doolittle,R.F. (1987) Progressive sequence alignment as a prerequisite to correct phylogenetic trees. J. Mol. Evol., 25, 351–360.
9. Lo Conte,L., Brenner,S.E., Hubbard,T.J.P., Chothia,C. and Murzin,A.G. (2002) SCOP database in 2002: refinements accommodate structural genomics. Nucleic Acids Res., 30, 264–267.
10. Altschul,S.F., Madden,T.L., Schäffer,A.A., Zhang,J., Zhang,Z., Miller,W. and Lipman,D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.