| Nucleic Acids Research | Pages |
AAindex: Amino Acid Index Database
ABSTRACT
INTRODUCTION
The variety and specificity of protein three-dimensional structures and biological functions arise from combinations of the 20 amino acids specified by the genetic code. These amino acids are the building blocks of proteins, and each differs in properties such as shape, volume, and chemical reactivity. A large body of experimental and theoretical research has characterized the physicochemical and biochemical properties of individual amino acids. Each derived property is often represented by a set of 20 numerical values, called an amino acid index.
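To make the idea of an amino acid index concrete, the following minimal sketch stores one index as a mapping from one-letter residue codes to 20 numbers and averages it over a sequence. The values are the widely used Kyte-Doolittle hydropathy scale, chosen here only as a familiar illustration and not taken from this article; the function name `mean_index` is hypothetical.

```python
# One amino acid index: 20 numerical values, one per residue.
# Values are the Kyte-Doolittle hydropathy scale (an illustrative choice).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_index(sequence, index):
    """Average an amino acid index over a sequence of one-letter codes."""
    if not sequence:
        raise ValueError("sequence must not be empty")
    # A KeyError here means the sequence contains a non-standard residue.
    values = [index[residue] for residue in sequence.upper()]
    return sum(values) / len(values)


if __name__ == "__main__":
    print(mean_index("AIL", KYTE_DOOLITTLE))
```

Any other index with the same 20 keys can be passed to `mean_index` unchanged, which is the practical benefit of treating every property as a vector of 20 values.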
In addition to the properties of individual amino acids, the relations between amino acids are also represented by numerical values in the analysis of protein sequences and structures. In particular, the amino acid mutation matrix, also called the amino acid similarity matrix, is the basis for optimization in protein sequence alignments and similarity searches. The amino acid mutation matrix is generally a set of 20 × 20 = 400 numerical values, or 210 distinct values (20 × 21 / 2) when the matrix is symmetric, as it usually is. The AAindex database is a collection of published amino acid indices and mutation matrices.
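The count of 210 follows from keeping only the diagonal and one triangle of a symmetric 20 × 20 matrix. The sketch below shows one way to store and look up such a matrix. The residue order and the row-by-row lower-triangle layout are assumptions made for illustration, not necessarily the layout used in AAindex2, and all function names are hypothetical.

```python
# Assumed residue order, for illustration only.
AMINO_ACIDS = "ARNDCQEGHILKMFPSTWYV"


def triangle_size(n):
    """Number of distinct values in a symmetric n x n matrix."""
    return n * (n + 1) // 2


def packed_position(i, j):
    """Position of element (i, j) in a row-by-row lower-triangle listing."""
    if i < j:
        i, j = j, i  # symmetry: (i, j) and (j, i) share one stored value
    return i * (i + 1) // 2 + j


def score(packed, a, b):
    """Look up the value for residues a and b in a packed symmetric matrix."""
    i, j = AMINO_ACIDS.index(a), AMINO_ACIDS.index(b)
    return packed[packed_position(i, j)]


if __name__ == "__main__":
    print(triangle_size(len(AMINO_ACIDS)))
```

A non-symmetric matrix cannot be packed this way and needs all 400 values.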
BACKGROUND
In 1988 Nakai et al. collected 222 amino acid indices from research papers and investigated the relationships among them by hierarchical cluster analysis (1). They identified four major classes: α-helix and turn propensities, β-strand propensity, hydrophobicity (which can be further divided into subclasses), and other physicochemical properties such as the bulkiness of amino acid residues. In 1996 Tomii and Kanehisa (2) enlarged the collection to 402 indices and repeated the clustering. The result was generally in good agreement with the previous work, but for convenience the collection was divided into six major classes: α and turn propensities, β propensity, amino acid composition, hydrophobicity, physicochemical properties, and other properties.
Tomii and Kanehisa (2) also collected 42 amino acid mutation matrices from the literature and extensively analyzed the correlations among them and with the amino acid indices. The AAindex database was initiated by Nakai et al. (1), was expanded by Tomii and Kanehisa (2), and is continuously updated by the present authors.
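As a rough illustration of how the similarity between two indices can be quantified, the sketch below computes a Pearson correlation coefficient between two equal-length value lists. The text here does not state which correlation measure was used, so Pearson is an assumption; the function names are hypothetical, and the default cutoff of 0.8 simply mirrors the threshold quoted for the C record in the AAindex1 format description.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


def is_similar(x, y, threshold=0.8):
    """True if the indices are strongly correlated or anti-correlated."""
    return abs(pearson(x, y)) >= threshold
```

For two real indices, `x` and `y` would each hold 20 values listed in the same residue order.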
THE CURRENT DATABASE
The AAindex database is a flat file database that consists of two sections: AAindex1 for the amino acid indices and AAindex2 for the amino acid mutation matrices. The format of the two sections is as follows.
AAindex1
The AAindex1 section currently contains 434 amino acid indices. A sample entry of AAindex1 is shown in Figure 1.
Figure 1. An example of an amino acid index entry in the AAindex database (AAindex1). Each record of an entry is identified by a one-letter code: H, accession number; D, definition of the entry; R, LITDB (3) literature database identifier; A, author(s); T, title of the journal article; J, journal citation information; C, accession numbers of similar entries, with correlation coefficients of 0.8 or more (or -0.8 or less); I, actual data in the specified order; and *, optional comments.
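The record codes listed above lend themselves to a simple line-oriented parser. The following is a minimal sketch, not the database's official reader. It assumes that each record starts with its one-letter code in the first column, that continuation lines start with whitespace, that an entry ends with a `//` line, and that the I record consists of a header of paired residue labels followed by two rows of ten values. The sample entry, its accession numbers, and its values are hypothetical.

```python
# Hypothetical entry; accession numbers and values are made up.
SAMPLE = """\
H EXAM990101
D Hypothetical example index
  (not a real AAindex entry)
A Doe, J.
C EXAM990102    0.912
I    A/L     R/K     N/M     D/F     C/P     Q/S     E/T     G/W     H/Y     I/V
     0.1     0.2     0.3     0.4     0.5     0.6     0.7     0.8     0.9     1.0
     1.1     1.2     1.3     1.4     1.5     1.6     1.7     1.8     1.9     2.0
//
"""


def parse_entry(text):
    """Group the lines of one entry by their one-letter record code."""
    records = {}
    code = None
    for line in text.splitlines():
        if line.startswith("//"):  # assumed end-of-entry marker
            break
        if not line.strip():
            continue
        if line[0].isspace():  # continuation of the previous record
            if code is None:
                raise ValueError("continuation line before any record")
            records[code].append(line.strip())
        else:
            code = line[0]
            records.setdefault(code, []).append(line[1:].strip())
    return records


def index_values(records):
    """Turn the I record into a residue -> value mapping."""
    header, first_row, second_row = records["I"][:3]
    pairs = [label.split("/") for label in header.split()]
    upper = [float(v) for v in first_row.split()]
    lower = [float(v) for v in second_row.split()]
    values = {}
    for (top, bottom), a, b in zip(pairs, upper, lower):
        values[top] = a      # first row belongs to the label before "/"
        values[bottom] = b   # second row belongs to the label after "/"
    return values


if __name__ == "__main__":
    print(index_values(parse_entry(SAMPLE)))
```

The sketch does not handle missing values or multiple entries in one file; a real reader would split the flat file on the entry terminator first.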
AAindex2
The AAindex2 section currently contains 66 amino acid mutation matrices: 47 symmetric matrices and 19 non-symmetric matrices. A sample entry of AAindex2 is shown in Figure 2.
AVAILABILITY
The AAindex database can be retrieved through the DBGET/LinkDB system (5) of the Japanese GenomeNet service (6) at http://www.genome.ad.jp/dbget/
The DBGET/LinkDB system integrates most of the major molecular biology databases and is especially suited for using hyperlinks to related entries within the AAindex database as well as to the other databases.
Alternatively, the entire database may be copied and used locally. The URL for anonymous FTP is: ftp://ftp.genome.ad.jp/db/genomenet/aaindex/
Users are requested to cite this article when making use of the AAindex database.
Figure 2. An example of an amino acid mutation matrix entry in the AAindex database (AAindex2). The data format is the same as described in Figure 1. The order of the matrix elements may be computed by the equation or examined in the database documentation file.
ACKNOWLEDGEMENTS
We thank Drs Kenta Nakai and Kentaro Tomii for the initial development of the AAindex database. This work was supported in part by the Grant-in-Aid for Scientific Research on the Priority Area 'Genome Science' from the Ministry of Education, Science, Sports and Culture of Japan. The computation time was provided by the Supercomputer Laboratory, Institute for Chemical Research, Kyoto University.
REFERENCES
Last modification: 9 Dec 1998
Copyright©Oxford University Press, 1998.