The Rice Annotation Project Database (RAP-DB): hub for Oryza sativa ssp. japonica genome information
1 Center for Information Biology and DNA Data Bank of Japan, National Institute of Genetics, Research Organization of Information and Systems, 1111 Yata, Mishima, Shizuoka 411-8540, Japan
2 Tsukuba Division, Mitsubishi Space Software Co., Ltd, 1-6-1 Takezono, Tsukuba, Ibaraki 305-0032, Japan
3 Genome Research Department, National Institute of Agrobiological Sciences, 2-1-2 Kannondai, Tsukuba, Ibaraki 305-8602, Japan
4 Life Science Systems Division, Fujitsu Limited, 1-17-25 Shinkamata, Ota-ku, Tokyo 144-8588, Japan
5 Japan Biological Information Research Center, Japan Biological Informatics Consortium, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan
6 Biological Information Research Center, National Institute of Advanced Industrial Science and Technology, 2-42 Aomi, Koto-ku, Tokyo 135-0064, Japan
*To whom correspondence should be addressed. Tel: +81 29 838 7065; Fax: +81 29 838 7065; Email: taitoh{at}affrc.go.jp
Received August 15, 2005. Revised October 16, 2005. Accepted October 16, 2005.
| ABSTRACT |
|---|
With the completion of rice genome sequencing, a standardized annotation is necessary so that the information from the genome sequence can be fully utilized in understanding the biology of rice and other cereal crops. An annotation jamboree was held in Japan with the aim of annotating and manually curating all the genes in the rice genome. Here we present the Rice Annotation Project Database (RAP-DB), which has been developed to provide access to the annotation data. The RAP-DB offers two different types of annotation viewers, BLAST and BLAT searches, and other useful features. By connecting the annotations to other rice genomics data, such as full-length cDNAs and Tos17 mutant lines, the RAP-DB serves as a hub for rice genomics. All of the resources can be accessed through http://rapdb.lab.nig.ac.jp/.
| INTRODUCTION |
|---|
Rice is considered a model cereal plant because of its small genome size and high degree of chromosomal co-linearity with other major cereal crops such as maize, wheat, barley and sorghum (1,2). The International Rice Genome Sequencing Project (IRGSP), a consortium of publicly funded laboratories from 10 countries, initiated the sequencing of Oryza sativa ssp. japonica cultivar Nipponbare in 1998 using the clone-by-clone sequencing strategy (2). In 2004, the finished-quality sequence of the entire genome was completed and is now available in the public domain (3).
Annotation of the sequence is indispensable for understanding the overall structure and function of the rice genome. However, most of the existing annotations of the rice genome sequences were obtained by automated methods. Although this provides an overview of the genes that make up the genome, limitations in prediction programs often result in errors and artifacts among the predicted genes. Therefore, in conjunction with the completion of the rice genome sequence, the Rice Annotation Project (RAP) was organized in 2004 (T. Itoh et al., manuscript in preparation) with the aim of providing standardized and highly accurate annotations of the rice genome.
To facilitate efficient management of the annotation results and to establish a platform for integrating the data with other rice resources, an annotation database called the RAP Database (RAP-DB) was developed. The RAP-DB integrates the IRGSP genome sequence and the RAP annotations with other rice research data, and makes them available to the public through HTTP access.
| DATABASE CONTENTS |
|---|
The RAP-DB contains the IRGSP genome sequence (build 3 assembly) (3) and the RAP loci, each with a locus ID representing an annotated gene. Each locus has one or more variant transcripts as RAP annotated sequences. Predicted protein-coding regions were also included as RAP predicted loci. The TIGR transcripts derived from the annotations on the TIGR assembly (4) were added to the RAP-DB by mapping them to the IRGSP genome. Each RAP transcript is linked to Gene Ontology terms, motif domain information, full-length cDNA information (5) and other resources. Among these, full-length cDNAs are expected to be invaluable for rice research (Figure 1) because they provide evidence of physical clones and facilitate future experimental work. Hyperlinks to the Tos17-flanking sequence positions on the chromosomes (6) should be useful for clarifying gene functions (Figure 1). The RAP-DB also contains a repeat-masked version of the IRGSP genome sequence build 3 as the reference genome sequence for the annotations.
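The relationship described above (a locus with one or more transcripts, each carrying links to other resources) can be pictured with a small data model. The following Python sketch is illustrative only: the class names, field names, coordinates and all identifiers are hypothetical placeholders and do not reflect the actual RAP-DB schema or ID nomenclature.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Transcript:
    """One annotated sequence (a variant transcript of a locus)."""
    transcript_id: str
    # Cross-references keyed by resource name, e.g. "GO", "motif", "FL-cDNA".
    links: Dict[str, List[str]] = field(default_factory=dict)


@dataclass
class Locus:
    """A locus on the genome that holds one or more transcripts."""
    locus_id: str
    chromosome: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive
    transcripts: List[Transcript] = field(default_factory=list)

    def add_transcript(self, transcript: Transcript) -> None:
        self.transcripts.append(transcript)

    def linked_ids(self, resource: str) -> List[str]:
        """Collect the IDs of one linked resource across all transcripts,
        keeping first-seen order and dropping duplicates."""
        found: List[str] = []
        for transcript in self.transcripts:
            for ident in transcript.links.get(resource, []):
                if ident not in found:
                    found.append(ident)
        return found


# All identifiers below are made-up placeholders.
locus = Locus("LOCUS_A", "chr01", 1000, 5000)
locus.add_transcript(
    Transcript("LOCUS_A-01", {"FL-cDNA": ["cDNA_X"], "GO": ["GO_TERM_1"]})
)
locus.add_transcript(
    Transcript("LOCUS_A-02", {"FL-cDNA": ["cDNA_X", "cDNA_Y"]})
)
print(locus.linked_ids("FL-cDNA"))  # ['cDNA_X', 'cDNA_Y']
```

Keeping the links on the transcript rather than on the locus mirrors the statement that each transcript has its own set of links, while the locus-level helper shows how they might be aggregated.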
| SYSTEM ARCHITECTURE |
|---|
The RAP-DB was implemented on PC servers running RedHat Enterprise Linux ES Version 3, with the Apache web server, the MySQL database server and GBrowse (7). Other common UNIX utilities were installed on the servers as needed. The G-integra system was implemented using a modified version of the one provided by the H-Invitational Database (H-InvDB) (8). All of the RAP-DB resources are stored on the servers and are available through HTTP access.
| DATA ACCESS |
|---|
The primary concept of the RAP-DB is to provide simple access to the IRGSP genome sequence and the RAP annotations. Furthermore, the RAP-DB enables integrative access to other rice resources, establishing a hub for O. sativa ssp. japonica genomics (Figure 1). One entry point to the database is keyword search (http://rapdb.lab.nig.ac.jp/), which searches the descriptions and IDs (http://rapdb.lab.nig.ac.jp/note.html#nomenclature) of the annotations. The other entry points are sequence similarity searches (for details see below).
Annotation browser
All the descriptions of the functional annotations and other related information can be viewed through GBrowse (Figure 2A and B), which provides the main features of the RAP-DB and gives chromosome-oriented access (Figure 2A) to the genome sequence and the annotations. Results of keyword or sequence similarity searches are automatically hyperlinked to the corresponding annotations in GBrowse. GBrowse, the Generic Genome Browser originally developed by Stein et al. (7), combines a relational database with interactive web pages for manipulating and displaying annotations on genomes. An annotation table for each transcript is available by clicking on its glyph (Figure 2B). The table is composed of multiple rows that include Gene Ontology information, motif domain information and related items. Links are provided to other useful databases such as the full-length cDNAs (5) and Tos17 mutant lines (6), so that the RAP-DB functions as a hub for rice genome information. Moreover, SVG images can be generated so that the user can edit the graphics of the genomic view.
Genome viewer
Genome-scale views of the annotation and comparisons of transcripts with those of other species are available through the G-integra system (Figure 2C), which was originally developed as part of the H-InvDB (8). G-integra is implemented to allow parallel access to the RAP annotations and a number of tracks for other species (such as cDNAs and expressed sequence tags of representative monocots and Arabidopsis thaliana). G-integra and GBrowse are reciprocally hyperlinked, so the user can easily move between the two views.
Sequence similarity search
To facilitate access by sequence similarity, two alternative search methods are available (Figure 2). One is BLAT (9), which aligns a given DNA sequence against the genome. Hits reported by BLAT are automatically hyperlinked to the corresponding regions in GBrowse. The other is BLAST (10), which is used for searching transcripts and open reading frames. Hits reported by BLAST are automatically hyperlinked to the corresponding annotation tables in GBrowse.
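As a rough illustration of how a BLAT hit could be turned into a genome-browser region, the Python sketch below parses one line of BLAT's standard PSL output and formats the target span as a `ref:start..end` landmark, the region notation that GBrowse accepts. It is not the RAP-DB implementation: the helper names are hypothetical, the example PSL line is made up, and the actual link format used by the RAP-DB is not described in the text.

```python
from typing import NamedTuple


class BlatHit(NamedTuple):
    query: str
    target: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str
    matches: int


def parse_psl_line(line: str) -> BlatHit:
    """Parse one data line of PSL output (21 tab-separated columns)."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 21:
        raise ValueError(f"expected 21 PSL columns, got {len(cols)}")
    return BlatHit(
        query=cols[9],             # qName
        target=cols[13],           # tName
        start=int(cols[15]) + 1,   # tStart is 0-based in PSL
        end=int(cols[16]),         # tEnd is already the inclusive 1-based end
        strand=cols[8],
        matches=int(cols[0]),
    )


def region_string(hit: BlatHit) -> str:
    """Format the hit as a 'ref:start..end' landmark."""
    return f"{hit.target}:{hit.start}..{hit.end}"


# Made-up PSL line for illustration: a 300 bp query aligned in two blocks.
example = "\t".join([
    "300", "0", "0", "0", "0", "0", "1", "120", "+",
    "query1", "300", "0", "300",
    "chr01", "40000000", "9999", "10419",
    "2", "150,150,", "0,150,", "9999,10269,",
])
hit = parse_psl_line(example)
print(region_string(hit))  # chr01:10000..10419
```

The only subtle step is the coordinate conversion: PSL start positions are zero-based and half-open, whereas browser landmarks are conventionally one-based and inclusive, so only the start is shifted.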
Distributed annotation system (DAS)
Although we hope that the IRGSP genome and the RAP annotations will serve as the standard references for future rice genomics, the rice community will also benefit from being able to use them as a basis for third-party annotations. Therefore, we made them available through the DAS protocol (11). The URL for the IRGSP genome reference server is http://rapdb.lab.nig.ac.jp/cgi-bin/das/IRGSP.
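For readers unfamiliar with DAS, requests are plain HTTP URLs of the form `<source>/<command>?segment=ref:start,stop`, and feature responses are XML documents. The Python sketch below builds such URLs from the reference server address given above and parses a small DAS-style feature document. It makes no network request; the segment name `chr01`, the feature ID and the sample XML are assumptions made for illustration, and the segment names actually served by the RAP-DB are not stated in the text.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple
from urllib.parse import urlencode

# Reference server URL quoted in the text.
DAS_SOURCE = "http://rapdb.lab.nig.ac.jp/cgi-bin/das/IRGSP"


def das_url(command: str, ref: Optional[str] = None,
            start: Optional[int] = None, stop: Optional[int] = None) -> str:
    """Build a DAS 1.x request URL for a command such as 'features' or 'dna'."""
    url = f"{DAS_SOURCE}/{command}"
    if ref is None:
        return url
    segment = ref if start is None or stop is None else f"{ref}:{start},{stop}"
    # Keep ':' and ',' literal so the segment reads as in the DAS specification.
    return f"{url}?{urlencode({'segment': segment}, safe=':,')}"


def parse_features(xml_text: str) -> List[Tuple[str, str, int, int, str]]:
    """Extract (id, type, start, end, orientation) from a DASGFF document."""
    root = ET.fromstring(xml_text)
    rows = []
    for feat in root.iter("FEATURE"):
        rows.append((
            feat.get("id", ""),
            feat.findtext("TYPE", default="").strip(),
            int(feat.findtext("START")),
            int(feat.findtext("END")),
            feat.findtext("ORIENTATION", default=".").strip(),
        ))
    return rows


# Hand-written sample response; the IDs and coordinates are made up.
SAMPLE = """
<DASGFF>
  <GFF version="1.0" href="example">
    <SEGMENT id="chr01" start="1" stop="10000">
      <FEATURE id="f1" label="f1">
        <TYPE id="mRNA">mRNA</TYPE>
        <METHOD id="RAP">RAP</METHOD>
        <START>2000</START>
        <END>4000</END>
        <ORIENTATION>+</ORIENTATION>
      </FEATURE>
    </SEGMENT>
  </GFF>
</DASGFF>
"""

print(das_url("entry_points"))
print(das_url("features", "chr01", 1, 10000))
print(parse_features(SAMPLE))
```

A third-party annotation server would answer the same `features` request for its own data, using the reference server's segments as the shared coordinate system.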
| FUTURE DIRECTION |
|---|
The annotations of the rice genome sequence will be updated as the genome sequence and cDNA sequences are revised. The latest version of the high-quality rice genome sequence (build 4 assembly) has been released recently (T. Sasaki, personal communication). This assembly will be used to update the manual curation of the annotation in conjunction with the Second RAP Meeting (RAP2). The update is therefore expected to yield additional loci as well as modifications to previous annotations. In addition, we will increase the number of links to other valuable databases to provide multiple routes of access to genome information. The RAP-DB will be a bridge connecting rice genome informatics with experimental genomics, and an important hub for rice genomics.
| ACKNOWLEDGEMENTS |
|---|
We thank the IRGSP and RAP members for their support. This work was supported in part by a grant from the Special Coordination Funds for Promoting Science and Technology of the Ministry of Education, Culture, Sports, Science and Technology (MEXT) of Japan. Funding to pay the Open Access publication charges for this article was provided by a grant for the NIAS Genebank Project.
Conflict of interest statement. None declared.
| REFERENCES |
|---|
- Moore, G., Devos, K.M., Wang, Z., Gale, M.D. (1995) Cereal genome evolution. Grasses, line up and form a circle. Curr. Biol., 5, 737–739.
- Sasaki, T. and Burr, B. (2000) International Rice Genome Sequencing Project: the effort to completely sequence the rice genome. Curr. Opin. Plant Biol., 3, 138–141.
- International Rice Genome Sequencing Project. (2005) The map-based sequence of the rice genome. Nature, 436, 793–800.
- Yuan, Q., Ouyang, S., Wang, A., Zhu, W., Maiti, R., Lin, H., Hamilton, J., Haas, B., Sultana, R., Cheung, F., et al. (2005) The Institute for Genomic Research Osa1 rice genome annotation database. Plant Physiol., 138, 18–26.
- Kikuchi, S., Satoh, K., Nagata, T., Kawagashira, N., Doi, K., Kishimoto, N., Yazaki, J., Ishikawa, M., Yamada, H., Ooka, H., et al. (2003) Collection, mapping, and annotation of over 28 000 cDNA clones from japonica rice. Science, 301, 376–379.
- Miyao, A., Tanaka, K., Murata, K., Sawaki, H., Takeda, S., Abe, K., Shinozuka, Y., Onosato, K., Hirochika, H. (2003) Target site specificity of the Tos17 retrotransposon shows a preference for insertion within genes and against insertion in retrotransposon-rich regions of the genome. Plant Cell, 15, 1771–1780.
- Stein, L.D., Mungall, C., Shu, S., Caudy, M., Mangone, M., Day, A., Nickerson, E., Stajich, J.E., Harris, T.W., Arva, A., et al. (2002) The generic genome browser: a building block for a model organism system database. Genome Res., 12, 1599–1610.
- Imanishi, T., Itoh, T., Suzuki, Y., O'Donovan, C., Fukuchi, S., Koyanagi, K.O., Barrero, R.A., Tamura, T., Yamaguchi-Kabata, Y., Tanino, M., et al. (2004) Integrative annotation of 21 037 human genes validated by full-length cDNA clones. PLoS Biol., 2, 856–875.
- Kent, W.J. (2002) BLAT - the BLAST-like alignment tool. Genome Res., 12, 656–664.
- Altschul, S.F., Madden, T.L., Schäffer, A.A., Zhang, J., Zhang, Z., Miller, W., Lipman, D.J. (1997) Gapped BLAST and PSI-BLAST: a new generation of protein database search programs. Nucleic Acids Res., 25, 3389–3402.
- Dowell, R.D., Jokerst, R.M., Day, A., Eddy, S.R., Stein, L. (2001) The distributed annotation system. BMC Bioinformatics, 2, 7.