Nucleic Acids Research Pages 368-369  


AAindex: Amino Acid Index Database
Introduction
Background
The Current Database
   AAindex1
   AAindex2
Availability
Acknowledgements
References


AAindex: Amino Acid Index Database

Shuichi Kawashima, Hiroyuki Ogata and Minoru Kanehisa*

Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan

Received September 8, 1998; Accepted October 15, 1998

ABSTRACT

AAindex is a database of numerical indices representing various physicochemical and biochemical properties of amino acids and pairs of amino acids. It consists of two sections: AAindex1 for the amino acid index of 20 numerical values and AAindex2 for the amino acid mutation matrix of 210 numerical values. Each entry of either AAindex1 or AAindex2 consists of the definition, the reference information, a list of related entries in terms of the correlation coefficient, and the actual data. The database may be accessed through the DBGET/LinkDB system at GenomeNet (http://www.genome.ad.jp/dbget/) or may be downloaded by anonymous FTP (ftp://ftp.genome.ad.jp/db/genomenet/aaindex/).

INTRODUCTION

The variety and specificity of protein three-dimensional structures and biological functions arise from combinations of the 20 different amino acids specified by the genetic code. Amino acids are the building blocks of proteins, and each has different characteristics such as shape, volume, and chemical reactivity. A large body of experimental and theoretical research has characterized the physicochemical and biochemical properties of individual amino acids. A derived property is often represented by a set of 20 numerical values, which is called an amino acid index.

In addition to the properties of individual amino acids, the relations between amino acids are also represented by numerical values in the analysis of protein sequences and structures. In particular, the amino acid mutation matrix, also called the amino acid similarity matrix, is the basis for optimization in protein sequence alignments and similarity searches. The amino acid mutation matrix is generally a set of 20 × 20 = 400 numerical values, which reduces to 210 distinct values because the matrix is usually symmetric. The AAindex database is a collection of published amino acid indices and mutation matrices.

BACKGROUND

In 1988 Nakai et al. collected 222 amino acid indices from research papers and investigated the relationships among them by hierarchical cluster analysis (1). They identified four major classes: α-helix and turn propensities, β-strand propensity, hydrophobicity (which can be further divided into subclasses), and other physicochemical properties such as the bulkiness of amino acid residues. In 1996 Tomii and Kanehisa (2) increased the size of the collection to 402 indices and repeated the clustering. The result was generally in good agreement with the previous work, but for the sake of convenience the collection was divided into six major classes: α and turn propensities, β propensity, amino acid composition, hydrophobicity, physicochemical properties, and other properties.

Tomii and Kanehisa (2) also collected 42 amino acid mutation matrices from the literature and conducted extensive analysis on the correlations among them and with the amino acid indices. The AAindex database was initiated by Nakai et al. (1), was expanded by Tomii and Kanehisa (2), and is continuously updated by the present authors.

THE CURRENT DATABASE

The AAindex database is a flat file database that consists of two sections: AAindex1 for the amino acid indices and AAindex2 for the amino acid mutation matrices. The format of the two sections is as follows.

AAindex1


Figure 1. An example of the amino acid index entry in the AAindex database (AAindex1). Each record of an entry is identified by a one-letter code: H, accession number; D, definition of the entry; R, LITDB (3) literature database identifier; A, author(s); T, title of the journal article; J, journal citation information; C, accession numbers of similar entries whose correlation coefficient is 0.8 or more, or -0.8 or less; I, actual data in the specified order; and *, optional comments.
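The record codes listed in the Figure 1 caption suggest a simple line-oriented parser. The following Python sketch is illustrative only: the sample entry is invented (accession numbers, text, and values are placeholders, not database content), and the layout details it relies on are assumptions, namely that continuation lines start with whitespace, that an entry ends with `//`, and that the I record begins with a header of paired one-letter codes giving the order of the 20 values. The function names are hypothetical.

```python
# Hypothetical entry: every accession, string, and number here is invented.
SAMPLE_ENTRY = """\
H XXXX000001
D Made-up example index (illustration only)
R LIT:0000000
A Author, A. and Author, B.
T A title that continues
  on a second line
J Example J. 1, 1-10 (1990)
C YYYY000002    0.912  ZZZZ000003   -0.845
I    A/L     R/K     N/M     D/F     C/P     Q/S     E/T     G/W     H/Y     I/V
     1.0     2.0     3.0     4.0     5.0     6.0     7.0     8.0     9.0    10.0
    11.0    12.0    13.0    14.0    15.0    16.0    17.0    18.0    19.0    20.0
//
"""


def parse_entry(text):
    """Group the lines of one entry by their one-letter record code."""
    records = {}
    code = None
    for line in text.splitlines():
        if not line.strip() or line.startswith("//"):
            continue  # skip blank lines and the assumed entry terminator
        if line[0].isspace():
            # Assumed convention: indented lines continue the previous record.
            if code is not None:
                records[code].append(line.strip())
        else:
            code = line[0]
            records.setdefault(code, []).append(line[1:].strip())
    return records


def index_values(records):
    """Map each amino acid code to its value, using the I-record header order."""
    header, *rows = records["I"]
    pairs = [token.split("/") for token in header.split()]
    # The first letters label the first row of numbers, the second letters the next row.
    order = [first for first, _ in pairs] + [second for _, second in pairs]
    numbers = [float(x) for row in rows for x in row.split()]
    return dict(zip(order, numbers))


def neighbors(records):
    """Read the C record as alternating accession / correlation pairs."""
    tokens = " ".join(records.get("C", [])).split()
    return {tokens[i]: float(tokens[i + 1]) for i in range(0, len(tokens) - 1, 2)}


if __name__ == "__main__":
    rec = parse_entry(SAMPLE_ENTRY)
    print(rec["H"][0], "-", rec["D"][0])
    print(index_values(rec))
    print(neighbors(rec))
```

Keeping each record as a list of lines leaves multi-line titles and data rows intact, so callers can join or split them as needed. The real file layout should be checked against the database documentation before relying on this sketch.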

The AAindex1 section currently contains 434 amino acid indices. A sample entry of AAindex1 is shown in Figure 1. Each entry consists of an accession number, a short description of the index, the reference information, and the numerical values of the property for the 20 amino acids. In addition, it contains neighbor information, namely cross-links to other entries whose correlation coefficient has an absolute value of 0.8 or larger. With these links the user can identify a set of entries describing similar properties. In some instances values are not reported for all 20 amino acids. When available, we adopt the estimates of Kidera et al. (4), who filled in missing values using statistical considerations. When such estimates were not available, the missing values were either replaced by the mean of the remaining values or simply set to zero.
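The neighbor criterion and the mean-value fallback described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' procedure: the text does not name the correlation measure, so the Pearson coefficient is assumed here, and the function names and the toy four-element vectors are invented for brevity (real indices have 20 values).

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def related_entries(target, indices, threshold=0.8):
    """Return {accession: r} for every other entry with |r| >= threshold."""
    hits = {}
    for name, values in indices.items():
        if name == target:
            continue
        r = pearson(indices[target], values)
        if abs(r) >= threshold:
            hits[name] = r
    return hits


def fill_missing_with_mean(values):
    """Replace None entries by the mean of the reported values."""
    known = [v for v in values if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in values]


if __name__ == "__main__":
    # Toy data with invented names; not taken from the database.
    toy = {
        "X": [1, 2, 3, 4],
        "Y": [2, 4, 6, 8],    # perfectly correlated with X
        "Z": [4, 3, 2, 1],    # perfectly anti-correlated with X
        "W": [1, -1, 1, -1],  # weakly correlated with X
    }
    print(related_entries("X", toy))
    print(fill_missing_with_mean([1.0, None, 3.0]))
```

Taking the absolute value means strongly anti-correlated indices are also reported as neighbors, which matches the 0.8 (-0.8) thresholds given in the Figure 1 caption.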

AAindex2

The AAindex2 section currently contains 66 amino acid mutation matrices: 47 symmetric matrices and 19 non-symmetric matrices. A sample entry of AAindex2 is shown in Figure 2. The format of the entry is almost the same as that of AAindex1 except that it contains 210 numerical values (20 diagonal and 20 × 19/2 off-diagonal elements) for a symmetric matrix and 400 or more numerical values for a non-symmetric matrix (some matrices include a gap or distinguish two states of cysteine).
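The element counts quoted above follow from simple arithmetic, which the Python sketch below makes explicit. It also shows one possible way to store a symmetric matrix as 210 values. The row-by-row lower-triangular ordering used here is an assumption made for illustration; the actual element order is defined in the database documentation file. All function names are hypothetical.

```python
def symmetric_count(n=20):
    """Distinct elements of a symmetric n x n matrix: n diagonal + n(n-1)/2 off-diagonal."""
    return n + n * (n - 1) // 2


def full_count(n=20):
    """Elements of a non-symmetric n x n matrix."""
    return n * n


def lower_index(i, j):
    """Position of element (i, j) in a row-by-row lower-triangular list (assumed order)."""
    if j > i:
        i, j = j, i  # symmetry: (i, j) and (j, i) share one stored value
    return i * (i + 1) // 2 + j


def pack_lower(matrix):
    """Flatten the lower triangle, including the diagonal, row by row."""
    return [matrix[i][j] for i in range(len(matrix)) for j in range(i + 1)]


def unpack_lower(values, n):
    """Rebuild the full symmetric matrix from its packed lower triangle."""
    matrix = [[0.0] * n for _ in range(n)]
    k = 0
    for i in range(n):
        for j in range(i + 1):
            matrix[i][j] = matrix[j][i] = values[k]
            k += 1
    return matrix


if __name__ == "__main__":
    print(symmetric_count(20))  # 20 + 190 elements
    print(full_count(20))       # all 20 x 20 elements
    print(full_count(21))       # e.g. 20 amino acids plus one gap symbol
    small = [[1.0, 2.0, 4.0], [2.0, 3.0, 5.0], [4.0, 5.0, 6.0]]
    packed = pack_lower(small)
    print(packed, unpack_lower(packed, 3) == small)
```

Storing only the lower triangle nearly halves the data for a symmetric matrix, while non-symmetric matrices need every element, which is why their entries hold 400 or more values.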

AVAILABILITY

The AAindex database can be retrieved through the DBGET/LinkDB system (5) of the Japanese GenomeNet service (6) at http://www.genome.ad.jp/dbget/

The DBGET/LinkDB system integrates most of the major molecular biology databases and is especially suited for using hyperlinks to related entries within the AAindex database as well as to the other databases.

Alternatively, the entire database may be copied and used locally. The URL for anonymous FTP is: ftp://ftp.genome.ad.jp/db/genomenet/aaindex/

Users are requested to cite this article when making use of the AAindex database.


Figure 2. An example of the amino acid mutation matrix entry in the AAindex database (AAindex2). The data format is the same as described in Figure 1. The order of the matrix elements may be computed by the equation or examined in the database documentation file.

ACKNOWLEDGEMENTS

We thank Drs Kenta Nakai and Kentaro Tomii for the initial development of the AAindex database. This work was supported in part by the Grant-in-Aid for Scientific Research on the Priority Area 'Genome Science' from the Ministry of Education, Science, Sports and Culture of Japan. The computation time was provided by the Supercomputer Laboratory, Institute for Chemical Research, Kyoto University.

REFERENCES

1. Nakai,K., Kidera,A. and Kanehisa,M. (1988) Protein Engng., 2, 93-100.

2. Tomii,K. and Kanehisa,M. (1996) Protein Engng., 9, 27-36.

3. Seto,Y., Ihara,S., Kohtsuki,S., Ooi,T. and Sakakibara,S. (1988) In Lesk,A.M. (ed.), Computational Molecular Biology. Oxford University Press, New York, pp. 27-37.

4. Kidera,A., Konishi,Y., Oka,M., Ooi,T. and Scheraga,H.A. (1985) J. Protein Chem., 4, 23-55.

5. Fujibuchi,W., Goto,S., Migimatsu,H., Uchiyama,I., Ogiwara,A., Akiyama,Y. and Kanehisa,M. (1998) Pacific Symp. Biocomput., 1998, 683-694.

6. Kanehisa,M. (1997) Trends Biochem. Sci., 22, 442-444.


*To whom correspondence should be addressed. Tel: +81 774 38 3270; Fax: +81 774 38 3269; Email: kanehisa@kuicr.kyoto-u.ac.jp


Copyright©Oxford University Press, 1998.



