Nucleic Acids Research, 2002, Vol. 30, No. 1 42-46
© 2002 Oxford University Press
The KEGG databases at GenomeNet
Bioinformatics Center, Institute for Chemical Research, Kyoto University, Uji, Kyoto 611-0011, Japan
Received September 19, 2001; Revised and Accepted September 26, 2001.
| ABSTRACT |
|---|
The Kyoto Encyclopedia of Genes and Genomes (KEGG) is the primary database resource of the Japanese GenomeNet service (http://www.genome.ad.jp/) for understanding higher order functional meanings and utilities of the cell or the organism from its genome information. KEGG consists of the PATHWAY database for the computerized knowledge on molecular interaction networks such as pathways and complexes, the GENES database for the information about genes and proteins generated by genome sequencing projects, and the LIGAND database for the information about chemical compounds and chemical reactions that are relevant to cellular processes. In addition to these three main databases, limited amounts of experimental data for microarray gene expression profiles and yeast two-hybrid systems are stored in the EXPRESSION and BRITE databases, respectively. Furthermore, a new database, named SSDB, is available for exploring the universe of all protein coding genes in the complete genomes and for identifying functional links and ortholog groups. The data objects in the KEGG databases are all represented as graphs and various computational methods are developed to detect graph features that can be related to biological functions. For example, the correlated clusters are graph similarities which can be used to predict a set of genes coding for a pathway or a complex, as summarized in the ortholog group tables, and the cliques in the SSDB graph are used to annotate genes. The KEGG databases are updated daily and made freely available (http://www.genome.ad.jp/kegg/).
| INTRODUCTION |
|---|
The GenomeNet (http://www.genome.ad.jp/) was established in September 1991 under the Human Genome Program of the then Ministry of Education, Science and Culture of Japan as a network of databases and computational services for genome research and related research areas in molecular and cellular biology (1). The GenomeNet is currently operated by the Bioinformatics Center of Kyoto University focusing more on functional genomics and proteomics but still supporting most of the major molecular biology databases. The primary resource of the GenomeNet is the Kyoto Encyclopedia of Genes and Genomes (KEGG) (2). The KEGG project was initiated in May 1995, aimed at understanding the basic principles, as well as practical utilities, of the relations between genomic information and higher order functional information.
While new high-throughput experimental technologies, such as DNA chips, are continuously developed and elaborated to decipher the genome, it is also extremely important to fully make use of the data and knowledge accumulated by traditional experiments in all areas of biomedical sciences. KEGG computerizes such data and knowledge not as text information to be read by humans but as graph information to be manipulated by machines. The KEGG/PATHWAY database contains reference diagrams for molecular pathways and complexes involving various cellular processes, which can readily be integrated with genomic information. A key to this integration is graph representation (3). Mathematically, a graph is a set of nodes and a set of edges. In KEGG the genome is a graph of genes that are one-dimensionally connected and the pathway is a graph of gene products (mostly proteins but including RNAs and complexes) with more complicated patterns of connections. By matching genes in the genome and gene products in the pathway, KEGG can be utilized to predict protein interaction networks and associated cellular functions.
From the perspective of graph representation, the chemical object of protein three-dimensional structure is a graph consisting of atoms (nodes) and atomic interactions (edges). The molecular biological object of protein sequence is a graph of one-dimensionally connected amino acids (nodes). The KEGG data object is a graph consisting of genes or proteins as its nodes and various types of relations or interactions as its edges, as summarized in Table 1. Thus, KEGG is a complementary resource to the existing databases on sequences and three-dimensional structures, focusing on higher level information about interactions and relations of genes or proteins.
| GENES DATABASE |
|---|
Gene annotation
As of 7 September 2001 the GENES database contains 240 943 gene entries derived from the complete genomes of 45 bacteria, 10 archaea and 4 eukaryotes (budding yeast, nematode, fruit fly and thale cress) as well as the genomes of human, mouse and fission yeast. This number is larger than that of any protein sequence database (SWISS-PROT, PIR or PRF), which contain sequence entries representing over 10 000 organisms, and it suggests the difficulty of annotating all known proteins. The complete genome sequence is deposited in the public databases of GenBank, EMBL and DDBJ, with the best annotations of individual genes at the time of publication. However, although new gene functions are continuously uncovered because of the availability of complete genome information, the annotations are not updated in the public databases except for a few well-maintained genomes. The KEGG/GENES database is a third-party annotation database attempting to incorporate the most up-to-date information and also to provide standardized annotation across species.
The function of the gene annotation in KEGG is to assign ortholog identifiers (4), which is done manually by the web-based annotation tool for the GENES relational database. The ortholog identifier is associated with standard definition of gene function. The ortholog identifier also represents a node of the protein interaction network (pathway or complex) in the PATHWAY database. In fact, the ortholog identifier was introduced as an extension of the EC number for the metabolic pathway in order to automatically generate organism-specific pathways by matching genes in the genome against gene products in the reference pathway.
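The matching step described above can be sketched as a simple lookup. This is only an illustration under stated assumptions, not the KEGG implementation: the function name `organism_pathway`, the ortholog identifiers (`ORTH_A`, ...) and the gene names are all hypothetical.

```python
# Minimal sketch: derive an organism-specific pathway from a reference pathway
# by matching ortholog identifiers. All identifiers and gene names are invented.

def organism_pathway(reference_nodes, gene_to_ortholog):
    """Map each reference-pathway node to the organism's genes that carry
    the same ortholog identifier (an empty list means no gene was matched)."""
    genes_by_ortholog = {}
    for gene, ortholog_id in gene_to_ortholog.items():
        genes_by_ortholog.setdefault(ortholog_id, []).append(gene)
    return {node: sorted(genes_by_ortholog.get(node, []))
            for node in reference_nodes}


# Nodes of a hypothetical reference pathway, named by ortholog identifier.
reference_nodes = ["ORTH_A", "ORTH_B", "ORTH_C"]

# Hypothetical annotation of one genome: gene -> assigned ortholog identifier.
gene_to_ortholog = {
    "gene1": "ORTH_A",
    "gene2": "ORTH_C",
    "gene3": "ORTH_A",
    "gene4": "ORTH_X",  # annotated, but not part of this reference pathway
}

pathway = organism_pathway(reference_nodes, gene_to_ortholog)
```

In this toy example the node `ORTH_B` has no matching gene, which is the kind of gap that would show up as a missing step in the reconstructed pathway.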
Access methods
The GENES database can be accessed by three methods, although there are numerous links leading to this database in the KEGG system. First, the text information describing GENES, which is stored in the accession number, gene names and definition fields, can be searched by the DBGET/LinkDB system (5). The search can be made against all organisms, individual organisms or groups of organisms as shown in the KEGG table of contents page (http://www.genome.ad.jp/kegg/kegg2.html). Secondly, the pathway information that is matched with GENES can be examined by the hierarchical text browser (get_htext program). The link specified by KEGG for each organism in the table of contents displays a functionally categorized gene catalog according to reconstructed organism-specific pathways. Thirdly, the positional information of GENES in the chromosome can be examined by the Java-based genome map browser (http://www.genome.ad.jp/kegg/java/launcher.html). Genes are color coded in both the whole view window and the zoom-up window according to the functional categorization.
Whenever available, the original version of the gene catalog is also maintained in KEGG so that it can be compared with the original authors' classification of genes. In this case the gene catalog and the genome map are linked to the original database rather than the GENES database.
| SSDB DATABASE |
|---|
Graph of best–best relations
The SSDB database is a new addition to the KEGG suite of databases. SSDB contains the information about amino acid sequence similarities among all protein-coding genes in the complete genomes, which is computationally generated from the GENES database. All possible pairwise genome comparisons are performed by the SSEARCH program (6), and the gene pairs with a Smith–Waterman similarity score of 100 or more are entered in SSDB. As of 7 September 2001 there are 41 745 353 similarity relations derived from 55 × 55 genome comparisons. In addition, SSDB contains the information about best hits and best–best hits (bidirectional best hits). The relationship between gene x in genome A and gene y in genome B is called a best–best hit when x is the best hit of query y against all genes in A and vice versa, and it is often used as an operational definition of ortholog (7). SSDB is a huge graph consisting of protein-coding genes as its nodes and similarity relations as its edges. We call this graph the protein gene universe, or simply the protein universe.
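The definition of a bidirectional best hit can be made concrete with a small sketch. It assumes that the pairwise scores are already available as dictionaries keyed by (query, target); the function names, gene names and scores are hypothetical, and ties are resolved by keeping the first hit encountered, which is an arbitrary choice made here for brevity.

```python
# Minimal sketch of best hits and bidirectional best hits between two genomes.
# Scores and gene names are invented; the threshold of 100 follows the text.

def best_hits(scores, threshold=100):
    """scores maps (query_gene, target_gene) -> similarity score.
    Return {query_gene: best_target_gene}, ignoring scores below threshold."""
    best = {}
    for (query, target), score in scores.items():
        if score < threshold:
            continue
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def best_best_hits(a_vs_b, b_vs_a, threshold=100):
    """Return sorted (x, y) pairs where y is the best hit of x in genome B
    and x is the best hit of y in genome A."""
    best_ab = best_hits(a_vs_b, threshold)
    best_ba = best_hits(b_vs_a, threshold)
    return sorted((x, y) for x, y in best_ab.items() if best_ba.get(y) == x)


# Hypothetical scores for genome A queried against B, and B against A.
a_vs_b = {("a1", "b1"): 350, ("a1", "b2"): 120,
          ("a2", "b1"): 200, ("a2", "b2"): 90, ("a3", "b3"): 80}
b_vs_a = {("b1", "a1"): 350, ("b1", "a2"): 200,
          ("b2", "a1"): 120, ("b3", "a3"): 80}

pairs = best_best_hits(a_vs_b, b_vs_a)
```

Here `a2` has `b1` as its best hit, but `b1` prefers `a1`, so only the pair (`a1`, `b1`) is bidirectional.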
When only the edges of best–best relations are considered, the graph becomes much simpler and can be used effectively to find functional links, especially groups of orthologous genes as partial cliques and possible connections among them (A.Nakaya and M.Kanehisa, manuscript in preparation). In comparison with standard sequence similarity searches by BLAST or FASTA, the search result of SSDB is easier to interpret because of the additional information about best–best hits (Fig. 1A).
On top of the SSDB graph, additional edges can be included to further identify various functional links. By incorporating the edges that represent adjacent genes on the chromosome, the gene clusters or operons that are conserved among multiple genomes can be identified (Fig. 1B). Other types of edges include common sequence motifs and common folds in the three-dimensional structures. As part of the SSDB database, sequence motifs in PROSITE (8) and Pfam (9) are precomputed for all proteins in the GENES database.
Access methods
The SSDB database is served by a separate server (http://ssdb.genome.ad.jp/). By specifying a gene of an organism, it is possible to search all neighbors of similar sequences above a given threshold, which is equivalent to usual sequence similarity searches, or to search selected neighbors including best–best neighbors, best neighbors and reverse best neighbors, which tends to produce functionally more meaningful results. SSDB can also be used to find conserved gene clusters (operons) as contiguous sets of best–best neighbors.
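The idea of a conserved gene cluster as a contiguous set of best–best neighbors can be sketched as follows. This is a simplified illustration rather than the SSDB algorithm: it assumes linear gene orders, requires strict adjacency with no gaps, and uses a hypothetical function name and invented gene names.

```python
# Minimal sketch: find runs of consecutive genes in genome A whose best-best
# partners are also adjacent in genome B. Assumes linear chromosomes and no gaps.

def conserved_clusters(order_a, order_b, best_best, min_size=2):
    """order_a, order_b: gene names in chromosomal order.
    best_best: {gene_in_A: partner_in_B}. Return a list of clusters (lists of A genes)."""
    position_b = {gene: index for index, gene in enumerate(order_b)}
    clusters, current = [], []
    for gene in order_a:
        partner = best_best.get(gene)
        if partner is None or partner not in position_b:
            # A gene without a partner interrupts the current run.
            if len(current) >= min_size:
                clusters.append(current)
            current = []
            continue
        previous_partner = best_best[current[-1]] if current else None
        if current and abs(position_b[partner] - position_b[previous_partner]) == 1:
            current.append(gene)
        else:
            if len(current) >= min_size:
                clusters.append(current)
            current = [gene]
    if len(current) >= min_size:
        clusters.append(current)
    return clusters


# Hypothetical gene orders and best-best relations.
order_a = ["a1", "a2", "a3", "a4", "a5"]
order_b = ["b9", "b3", "b2", "b1", "b7", "b5"]
best_best = {"a1": "b1", "a2": "b2", "a3": "b3", "a5": "b5"}

clusters = conserved_clusters(order_a, order_b, best_best)
```

In this example the genes `a1`, `a2` and `a3` form one cluster even though their partners appear in reverse order in genome B, since only adjacency is checked.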
The information on sequence motifs is not fully integrated with SSDB yet. Separate searches can be made for sequence motifs in a given sequence or a given set of sequences, or for sequences with given motifs. Because motifs are precomputed, the search for all proteins with a given Pfam motif, for example, can be performed instantaneously.
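The reason a precomputed motif assignment makes such a search fast can be illustrated with an inverted index. This is a sketch under assumptions: the function name, the protein names and the motif labels are hypothetical, and nothing here reflects how the GenomeNet server actually stores its data.

```python
# Minimal sketch: an inverted index from motif to proteins, built once from
# precomputed motif assignments so that later lookups need no sequence search.

def build_motif_index(protein_motifs):
    """protein_motifs: {protein: iterable of motif labels}.
    Return {motif: set of proteins carrying that motif}."""
    index = {}
    for protein, motifs in protein_motifs.items():
        for motif in motifs:
            index.setdefault(motif, set()).add(protein)
    return index


# Hypothetical precomputed assignments (labels are placeholders, not real accessions).
protein_motifs = {
    "protein1": ["motif_X", "motif_Y"],
    "protein2": ["motif_Y"],
    "protein3": [],
}

motif_index = build_motif_index(protein_motifs)
proteins_with_y = sorted(motif_index.get("motif_Y", set()))
```

Once the index exists, finding all proteins with a given motif is a single dictionary lookup.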
SSDB for improving gene annotations
The SSDB database is utilized in other parts of the KEGG system, such as the genome map comparison that displays a dot matrix of similar genes. SSDB is also critical to gene annotations in KEGG. When a new genome sequence is publicly released, it is incorporated into the KEGG/GENES database and the DBGET/LinkDB system usually within 1 or 2 days. However, in order to start assigning ortholog identifiers, the SSDB computation must be performed, which may take up to 1 week depending on the genome size. Then, the annotation of ortholog identifiers is performed manually using GFIT (7) and other tools. Thus, the reconstructed pathways and the resulting gene catalog can be made publicly available several weeks afterwards. In order to cope with the rapidly increasing number of complete genomes, the detection of SSDB cliques is being implemented to partly automate ortholog identifier assignments, as well as to identify missed annotations.
| PATHWAY DATABASE |
|---|
Generalized protein interaction network
The data object stored in the PATHWAY database is called the generalized protein interaction network (3,10), or simply the network, which is a network of gene products (nodes) with three types of interactions or relations (edges): enzyme–enzyme relations, which are two enzymes catalyzing successive reaction steps in the metabolic pathway; direct protein–protein interactions such as binding and phosphorylation; and gene expression relations involving transcription factors and target gene products. The generalized protein interaction network is drawn manually as a graphical pathway diagram (pathway map), and it is also stored as a set of binary relations. The set of binary relations is a computable form of the network information, but at the moment only the enzyme–enzyme relations are maintained (http://www.genome.ad.jp/brite/ECrel/ecrel.xl), where a relation consists of a pair of nodes (enzymes) and an edge (common compound) in between.
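An enzyme–enzyme relation of this kind can be derived from reaction data, as in the sketch below. It assumes that each reaction is given as a directed (EC number, substrates, products) triple and that cofactors such as ATP and ADP have been left out, since they would otherwise link unrelated enzymes. The function name is hypothetical, and the three glycolytic steps are used only as familiar example data.

```python
# Minimal sketch: derive enzyme-enzyme binary relations as
# (enzyme, common compound, enzyme) triples from directed reactions.

def enzyme_relations(reactions):
    """reactions: list of (ec_number, substrates, products).
    Two enzymes are related when a product of the first is a substrate of the second."""
    relations = set()
    for ec_first, _, products in reactions:
        for ec_second, substrates, _ in reactions:
            if ec_first == ec_second:
                continue
            for compound in set(products) & set(substrates):
                relations.add((ec_first, compound, ec_second))
    return relations


# First three steps of glycolysis, main compounds only (cofactors omitted).
reactions = [
    ("2.7.1.1", ["glucose"], ["glucose 6-phosphate"]),
    ("5.3.1.9", ["glucose 6-phosphate"], ["fructose 6-phosphate"]),
    ("2.7.1.11", ["fructose 6-phosphate"], ["fructose 1,6-bisphosphate"]),
]

relations = enzyme_relations(reactions)
```

Treating the reactions as directed is a simplification made for this sketch; a reversible reaction would need to be listed in both directions.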
As of 7 September 2001 the PATHWAY database contains 5761 entries including 201 reference pathway diagrams and 83 ortholog group tables, as well as 14 960 enzyme–enzyme relations. From the manually drawn reference pathways, many organism-specific pathways are automatically generated according to the ortholog identifier assignments in the GENES database. The total number of gene product nodes that appear on the KEGG pathways is approximately 6000, and roughly one-quarter to one-third of the genes in a bacterial or archaeal genome can be mapped to one or more pathway diagrams. The ortholog group tables contain the information about correlated clusters, which are common subgraphs among multiple graphs (11). In this case a correlated cluster represents a relationship between the positional correlation of genes in the genome and the functional correlation of gene products in the network, such as a set of genes in a conserved gene cluster (operon) forming a subpathway or a complex (Fig. 1B). The total number of genes in the KEGG ortholog group tables is approximately 26 000, which is about 10% of the total number of genes in the GENES database.
Access methods
The network information of the KEGG/PATHWAY database is hierarchically categorized into four levels. According to our view on the hierarchy and modularity of cellular functions, the top level is categorized into metabolism, genetic information processing, environmental information processing, and the rest named cellular processes. In addition, a new top category of human diseases is being introduced (see Table 2 for the top two levels). The third level corresponds to a pathway diagram and/or an ortholog group table, which is a collection of genes and proteins. The PATHWAY database can best be viewed by following this hierarchy top-down in the KEGG table of contents page (http://www.genome.ad.jp/kegg/kegg2.html), where the top level item of metabolism is designated by "Metabolic pathways" and the rest of the top level items are designated by "Regulatory pathways". Alternatively, the hierarchy may be used bottom-up starting from the KEGG gene catalogs for individual organisms. In addition, the text information describing PATHWAY entries can be searched by the DBGET/LinkDB system.
| OTHER DATABASES |
|---|
LIGAND
Originally, the LIGAND database (http://www.genome.ad.jp/ligand/) was developed as a value-added database (12) for the enzyme nomenclature of the International Union of Biochemistry and Molecular Biology (IUBMB). This portion is maintained as the ENZYME section of LIGAND, which is linked to and from the KEGG metabolic pathway. Currently, efforts are being made to add more data in the COMPOUND section and the REACTION section of LIGAND. The COMPOUND section contains chemical structures of metabolites and other chemical compounds, including drugs and xenobiotic chemicals, and the REACTION section contains chemical reactions, mostly enzymatic reactions, represented as conversions of chemical structures. As described elsewhere in this issue (13), a web-based chemical structure search is now available for the COMPOUND and REACTION sections, which are stored in the ISIS system.
EXPRESSION
Despite the efforts to establish data repositories for gene expression data, most useful data are dispersed on the World Wide Web, i.e. located at authors' FTP sites, possibly because there is no mandatory data submission requirement such as exists for sequence data. The KEGG/EXPRESSION database (http://www.genome.ad.jp/kegg/expression/) is not a data repository, but it collects gene expression data from many laboratories in Japan as part of our collaborative research projects. Currently, microarray gene expression data for Synechocystis and Bacillus subtilis are publicly available in this database.
The EXPRESSION database is handled with Java-based graphical viewers. Each experiment can be examined with the array image viewer and the scatter plot viewer, and a series of experiments can be examined by the cluster viewer once hierarchical cluster analysis is performed. All these viewers are tightly integrated with the PATHWAY and GENES databases, so that expression patterns and clusters can be mapped to the KEGG pathways or chromosomal positions, in order to make sense of the expression data.
BRITE
Biomolecular Relations in Information Transmission and Expression (BRITE; http://www.genome.ad.jp/brite/) is a database of binary relations for computation and comparison of graphs involving genes and proteins. It is not a fully developed database yet, but its purpose in KEGG is to expand the collection of the generalized protein interactions that underlie the KEGG pathway diagrams, especially direct protein–protein interactions obtained by systematic experiments such as yeast two-hybrid systems, and gene expression relations of transcription factors and transcribed gene products. BRITE will integrate the generalized protein interactions with other diverse sets of binary relations, including sequence similarity relations stored in the SSDB database, expression similarity relations obtained by cluster analysis of the EXPRESSION data, positional correlations in the GENES genome maps and cross-reference links between database entries in the LinkDB database, towards automating logical reasoning steps to understand functions.
| OTHER RESOURCES IN GenomeNet |
|---|
DBGET/LinkDB
DBGET/LinkDB (http://www.genome.ad.jp/dbget/dbget.links.html) is the backbone retrieval system for all GenomeNet databases including a number of molecular biology databases that are mirrored at the GenomeNet. DBGET is based on a flat-file view of molecular biology databases, where the database is considered as a collection of entries. Because cross-reference information is often provided pointing to related entries in other databases, the web of molecular biology databases is a graph consisting of entries (nodes) and cross-references (edges), much like the World Wide Web consisting of pages (nodes) and hyperlinks (edges). LinkDB is capable of searching this graph and identifying entries that are directly or indirectly related.
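Searching a cross-reference graph for directly and indirectly related entries can be sketched as a breadth-first traversal. This is an illustration only and does not describe how LinkDB is implemented: the function name, the `database:entry` naming and the links themselves are hypothetical.

```python
# Minimal sketch: breadth-first search over cross-references, returning each
# related entry with its number of link steps (1 = direct, 2+ = indirect).
from collections import deque


def related_entries(links, start, max_steps=2):
    """links: {entry: list of cross-referenced entries}.
    Return {entry: steps} for entries reachable within max_steps links."""
    distance = {start: 0}
    queue = deque([start])
    while queue:
        entry = queue.popleft()
        if distance[entry] == max_steps:
            continue  # do not expand beyond the step limit
        for neighbor in links.get(entry, ()):
            if neighbor not in distance:
                distance[neighbor] = distance[entry] + 1
                queue.append(neighbor)
    del distance[start]
    return distance


# Hypothetical cross-references between entries of different databases.
links = {
    "genes:g1": ["pathway:p1", "enzyme:e1"],
    "enzyme:e1": ["compound:c1"],
    "compound:c1": ["reaction:r1"],
}

related = related_entries(links, "genes:g1")
```

With the default limit of two steps, the reaction entry, which is three links away from the starting gene, is not reported.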
Computational tools
GenomeNet provides various computational services (http://www.genome.ad.jp/SIT/), including sequence similarity searches by BLAST and FASTA against all major sequence databases that are updated daily, and sequence motif search by MOTIF, which is an in-house-developed search system, against major motif libraries.
FTP site
All the KEGG data are freely available to academic users by anonymous FTP (http://www.genome.ad.jp/anonftp/).
| ACKNOWLEDGEMENTS |
|---|
The computational resource was provided by the Bioinformatics Center, Institute for Chemical Research, Kyoto University. This work was supported by grants from the Ministry of Education, Culture, Sports, Science and Technology of Japan, the Japan Society for the Promotion of Science, and the Japan Science and Technology Corporation.
| FOOTNOTES |
|---|
* To whom correspondence should be addressed. Tel: +81 774 38 3270; Fax: +81 774 38 3269; Email: kanehisa{at}kuicr.kyoto-u.ac.jp
| REFERENCES |
|---|
1 Kanehisa,M. (1997) Linking databases and organisms: GenomeNet resources in Japan. Trends Biochem. Sci., 22, 442–444.
2 Kanehisa,M. (1997) A database for post-genome analysis. Trends Genet., 13, 375–376.
3 Kanehisa,M. (2000) Post-genome Informatics. Oxford University Press, Oxford, UK.
4 Kanehisa,M. and Goto,S. (2000) KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res., 28, 27–30.
5 Fujibuchi,W., Goto,S., Migimatsu,H., Uchiyama,I., Ogiwara,A., Akiyama,Y. and Kanehisa,M. (1998) DBGET/LinkDB: an integrated database retrieval system. Pac. Symp. Biocomput., 683–694.
6 Pearson,W.R. (1996) Effective protein sequence comparison. Methods Enzymol., 266, 227–258.
7 Bono,H., Ogata,H., Goto,S. and Kanehisa,M. (1998) Genome Res., 8, 203–210.
8 Hofmann,K., Bucher,P., Falquet,L. and Bairoch,A. (1999) The PROSITE database, its status in 1999. Nucleic Acids Res., 27, 215–219. Updated article in this issue: Nucleic Acids Res. (2002), 30, 235–238.
9 Bateman,A., Birney,E., Durbin,R., Eddy,S.R., Howe,K.L. and Sonnhammer,E.L. (2000) The Pfam protein families database. Nucleic Acids Res., 28, 263–266. Updated article in this issue: Nucleic Acids Res. (2002), 30, 276–280.
10 Kanehisa,M. (2000) Pathway databases and higher order function. Adv. Protein Chem., 54, 381–408.
11 Ogata,H., Fujibuchi,W., Goto,S. and Kanehisa,M. (2000) A heuristic graph comparison algorithm and its application to detect functionally related enzyme clusters. Nucleic Acids Res., 28, 4021–4028.
12 Suyama,M., Ogiwara,A., Nishioka,T. and Oda,J. (1993) Searching for amino acid sequence motifs among enzymes: the Enzyme–Reaction Database. Comput. Appl. Biosci., 9, 9–15.
13 Goto,S., Okuno,Y., Hattori,M., Nishioka,T. and Kanehisa,M. (2002) LIGAND: database of chemical compounds and reactions in biological pathways. Nucleic Acids Res., 30, 402–404.
||||
![]() |
Y. Xie, L.-s. Chou, A. Cutler, and B. Weimer DNA Macroarray Profiling of Lactococcus lactis subsp. lactis IL1403 Gene Expression during Environmental Stresses Appl. Envir. Microbiol., November 1, 2004; 70(11): 6738 - 6747. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Boekhorst, R. J. Siezen, M.-C. Zwahlen, D. Vilanova, R. D. Pridmore, A. Mercenier, M. Kleerebezem, W. M. de Vos, H. Brussow, and F. Desiere The complete genomes of Lactobacillus plantarum and Lactobacillus johnsonii reveal extensive differences in chromosome organization and gene content Microbiology, November 1, 2004; 150(11): 3601 - 3611. [Abstract] [Full Text] [PDF] |
||||
![]() |
W. Li, H. Amri, H. Huang, C. Wu, and V. Papadopoulos Gene and Protein Profiling of the Response of MA-10 Leydig Tumor Cells to Human Chorionic Gonadotropin J Androl, November 1, 2004; 25(6): 900 - 913. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Igarashi, K. F. Aoki, H. Mamitsuka, K.-i. Kuma, and M. Kanehisa The Evolutionary Repertoires of the Eukaryotic-Type ABC Transporters in Terms of the Phylogeny of ATP-binding Domains in Eukaryotes and Prokaryotes Mol. Biol. Evol., November 1, 2004; 21(11): 2149 - 2160. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. M. Marion, A. Regev, E. Segal, Y. Barash, D. Koller, N. Friedman, and E. K. O'Shea Inaugural Article: Sfp1 is a stress- and nutrient-sensitive regulator of ribosomal protein gene expression PNAS, October 5, 2004; 101(40): 14315 - 14322. [Abstract] [Full Text] [PDF] |
||||
![]() |
F. Campagne, S. Neves, C.-w. Chang, L. Skrabanek, P. T. Ram, R. Iyengar, and H. Weinstein Quantitative Information Management for the Biochemical Computation of Cellular Networks Sci. Signal., August 31, 2004; 2004(248): pl11 - pl11. [Abstract] [Full Text] [PDF] |
||||
![]() |
Md. S. Kabir, T. Sagara, T. Oshima, Y. Kawagoe, H. Mori, R. Tsunedomi, and M. Yamada Effects of mutations in the rpoS gene on cell viability and global gene expression under nitrogen starvation in Escherichia coli Microbiology, August 1, 2004; 150(8): 2543 - 2553. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. A. Gutierrez, M. D. Larson, and C. Wilkerson The Plant-Specific Database. Classification of Arabidopsis Proteins Based on Their Phylogenetic Profile Plant Physiology, August 1, 2004; 135(4): 1888 - 1892. [Full Text] [PDF] |
||||
![]() |
N. Zamboni, E. Fischer, D. Laudert, S. Aymerich, H.-P. Hohmann, and U. Sauer The Bacillus subtilis yqjI Gene Encodes the NADP+-Dependent 6-P-Gluconate Dehydrogenase in the Pentose Phosphate Pathway J. Bacteriol., July 15, 2004; 186(14): 4528 - 4534. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Hamatani, T. Daikoku, H. Wang, H. Matsumoto, M. G. Carter, M. S. H. Ko, and S. K. Dey Global gene expression analysis identifies molecular pathways distinguishing blastocyst dormancy and activation PNAS, July 13, 2004; 101(28): 10326 - 10331. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. W. Georgantas III, V. Tanadve, M. Malehorn, S. Heimfeld, C. Chen, L. Carr, F. Martinez-Murillo, G. Riggins, J. Kowalski, and C. I. Civin Microarray and Serial Analysis of Gene Expression Analyses Identify Known and Novel Transcripts Overexpressed in Hematopoietic Stem Cells Cancer Res., July 1, 2004; 64(13): 4434 - 4441. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Khatri, P. Bhavsar, G. Bawa, and S. Draghici Onto-Tools: an ensemble of web-accessible, ontology-based tools for the functional design and interpretation of high-throughput gene expression experiments Nucleic Acids Res., July 1, 2004; 32(suppl_2): W449 - W456. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. J. Paredes, I. Rigoutsos, and E. T. Papoutsakis Transcriptional organization of the Clostridium acetobutylicum genome Nucleic Acids Res., April 1, 2004; 32(6): 1973 - 1981. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Kim, S. H. Yoshimura, K. Hizume, R. L. Ohniwa, A. Ishihama, and K. Takeyasu Fundamental structural units of the Escherichia coli nucleoid revealed by atomic force microscopy Nucleic Acids Res., April 1, 2004; 32(6): 1982 - 1992. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. R. Ueda, S. Hayashi, S. Matsuyama, T. Yomo, S. Hashimoto, S. A. Kay, J. B. Hogenesch, and M. Iino Universality and flexibility in gene expression from bacteria to human PNAS, March 16, 2004; 101(11): 3765 - 3769. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Arita The metabolic world of Escherichia coli is not small PNAS, February 10, 2004; 101(6): 1543 - 1547. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Kanehisa, S. Goto, S. Kawashima, Y. Okuno, and M. Hattori The KEGG resource for deciphering the genome Nucleic Acids Res., January 1, 2004; 32(90001): D277 - 280. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Barre, A. de Daruvar, and A. Blanchard MolliGen, a database dedicated to the comparative genomics of Mollicutes Nucleic Acids Res., January 1, 2004; 32(90001): D307 - 310. [Abstract] [Full Text] [PDF] |
||||
![]() |
G. Lelandais, S. Le Crom, F. Devaux, S. Vialette, G. M. Church, C. Jacq, and P. Marc yMGV: a cross-species expression data mining tool Nucleic Acids Res., January 1, 2004; 32(90001): D323 - 325. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. Hertz-Fowler, C. S. Peacock, V. Wood, M. Aslett, A. Kerhornou, P. Mooney, A. Tivey, M. Berriman, N. Hall, K. Rutherford, et al. GeneDB: a resource for prokaryotic and eukaryotic organisms Nucleic Acids Res., January 1, 2004; 32(90001): D339 - 343. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. Lemer, E. Antezana, F. Couche, F. Fays, X. Santolaria, R.'s Janky, Y. Deville, J. Richelle, and S. J. Wodak The aMAZE LightBench: a web interface to a relational database of cellular processes Nucleic Acids Res., January 1, 2004; 32(90001): D443 - 448. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. Akagi, T. Suzuki, R. M. Stephens, N. A. Jenkins, and N. G. Copeland RTCGD: retroviral tagged cancer gene database Nucleic Acids Res., January 1, 2004; 32(90001): D523 - 527. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Yamazaki, M. Kitajima, M. Arita, H. Takayama, H. Sudo, M. Yamazaki, N. Aimi, and K. Saito Biosynthesis of Camptothecin. In Silico and in Vivo Tracer Study from [1-13C]Glucose Plant Physiology, January 1, 2004; 134(1): 161 - 170. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. B. German, M.-A. Roberts, and S. M. Watkins Personal Metabolomics as a Next Generation Nutritional Assessment J. Nutr., December 1, 2003; 133(12): 4260 - 4266. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. Claudel-Renard, C. Chevalet, T. Faraut, and D. Kahn Enzyme-specific profiles for genome annotation: PRIAM Nucleic Acids Res., November 15, 2003; 31(22): 6633 - 6639. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Shannon, A. Markiel, O. Ozier, N. S. Baliga, J. T. Wang, D. Ramage, N. Amin, B. Schwikowski, and T. Ideker Cytoscape: A Software Environment for Integrated Models of Biomolecular Interaction Networks Genome Res., November 1, 2003; 13(11): 2498 - 2504. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Arita In Silico Atomic Tracing by Substrate-Product Relationships in Escherichia coli Intermediary Metabolism Genome Res., November 1, 2003; 13(11): 2455 - 2466. [Abstract] [Full Text] [PDF] |
||||
![]() |
V. Spirin and L. A. Mirny Protein complexes and functional modules in molecular networks PNAS, October 14, 2003; 100(21): 12123 - 12128. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. J. Kinsella, D. A. Fitzpatrick, C. J. Creevey, and J. O. McInerney Fatty acid biosynthesis in Mycobacterium tuberculosis: Lateral gene transfer, adaptive evolution, and gene duplication PNAS, September 2, 2003; 100(18): 10320 - 10325. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Xu and J. I. Gordon Inaugural Article: Honor thy symbionts PNAS, September 2, 2003; 100(18): 10452 - 10459. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. Raoult, H. Ogata, S. Audic, C. Robert, K. Suhre, M. Drancourt, and J.-M. Claverie Tropheryma whipplei Twist: A Human Pathogenic Actinobacteria With a Reduced Genome Genome Res., August 1, 2003; 13(8): 1800 - 1809. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. L. England, B. E. Shakhnovich, and E. I. Shakhnovich Natural selection of more designable folds: A mechanism for thermophilic adaptation PNAS, July 22, 2003; 100(15): 8727 - 8731. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Kurata, N. Matoba, and N. Shimizu CADLIVE for constructing a large-scale biochemical network based on a simulation-directed notation and its application to yeast cell cycle Nucleic Acids Res., July 15, 2003; 31(14): 4071 - 4084. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Knudsen, C. Workman, T. Sicheritz-Ponten, and C. Friis GenePublisher: automated analysis of DNA microarray data Nucleic Acids Res., July 1, 2003; 31(13): 3471 - 3476. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. E. Boardman, S. G. Oliver, and S. J. Hubbard SiteSeer: visualisation and analysis of transcription factor binding sites in nucleotide sequences Nucleic Acids Res., July 1, 2003; 31(13): 3572 - 3575. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. McDermott and R. Samudrala Bioverse: functional, structural and contextual annotation of proteins and proteomes Nucleic Acids Res., July 1, 2003; 31(13): 3736 - 3737. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. A. T. DOW and S. A. DAVIES Integrative Physiology and Functional Genomics of Epithelial Function in a Genetic Model Organism Physiol Rev, July 1, 2003; 83(3): 687 - 729. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. A. Borisy, P. J. Elliott, N. W. Hurst, M. S. Lee, J. Lehar, E. R. Price, G. Serbedzija, G. R. Zimmermann, M. A. Foley, B. R. Stockwell, et al. Systematic discovery of multicomponent therapeutics PNAS, June 24, 2003; 100(13): 7977 - 7982. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. Lu, A. K. Arakaki, H. Lu, and J. Skolnick Multimeric Threading-Based Prediction of Protein-Protein Interactions on a Genomic Scale: Application to the Saccharomyces cerevisiae Proteome Genome Res., June 1, 2003; 13(6): 1146 - 1154. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Kelso, J. Visagie, G. Theiler, A. Christoffels, S. Bardien, D. Smedley, D. Otgaar, G. Greyling, C. V. Jongeneel, M. I. McCarthy, et al. eVOC: A Controlled Vocabulary for Unifying Gene Expression Data Genome Res., June 1, 2003; 13(6): 1222 - 1230. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. Koike, Y. Kobayashi, and T. Takagi Kinase Pathway Database: An Integrated Protein-Kinase and NLP-Based Protein-Interaction Resource Genome Res., June 1, 2003; 13(6): 1231 - 1243. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Bono, I. Nikaido, T. Kasukawa, Y. Hayashizaki, and Y. Okazaki Comprehensive Analysis of the Mouse Metabolome Based on the Transcriptome Genome Res., June 1, 2003; 13(6): 1345 - 1349. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Kawai and Y. Hayashizaki DNA Book Genome Res., June 1, 2003; 13(6): 1488 - 1495. [Abstract] [Full Text] [PDF] |
||||
![]() |
I. Uchiyama MBGD: microbial genome database for comparative analysis Nucleic Acids Res., January 1, 2003; 31(1): 58 - 62. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. W. A. Buchan, S. C. G. Rison, J. E. Bray, D. Lee, F. Pearl, J. M. Thornton, and C. A. Orengo Gene3D: structural assignments for the biologist and bioinformaticist alike Nucleic Acids Res., January 1, 2003; 31(1): 469 - 473. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Lin, J. Qian, D. Greenbaum, P. Bertone, R. Das, N. Echols, A. Senes, B. Stenger, and M. Gerstein GeneCensus: genome comparisons in terms of metabolic pathway activity and protein family sharing Nucleic Acids Res., October 15, 2002; 30(20): 4574 - 4582. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. A. Afshari Perspective: Microarray Technology, Seeing More Than Spots Endocrinology, June 1, 2002; 143(6): 1983 - 1989. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Goto, Y. Okuno, M. Hattori, T. Nishioka, and M. Kanehisa LIGAND: database of chemical compounds and reactions in biological pathways Nucleic Acids Res., January 1, 2002; 30(1): 402 - 404. [Abstract] [Full Text] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||

























