Nucleic Acids Research, 2000, Vol. 28, No. 1 77-80
© 2000 Oxford University Press
Integrating functional genomic information into the Saccharomyces Genome Database
Department of Genetics, School of Medicine, Stanford University, Stanford, CA 94305-5120, USA
Received October 1, 1999; Revised and Accepted October 7, 1999.
| ABSTRACT |
|---|
The Saccharomyces Genome Database (SGD) stores and organizes information about the nearly 6200 genes in the yeast genome. The information is organized around the locus page and directs users to the detailed information they seek. SGD is endeavoring to integrate the existing information about yeast genes with the large volume of data generated by functional analyses that are beginning to appear in the literature and on web sites. New features will include searches of systematic analyses and Gene Summary Paragraphs that succinctly review the literature for each gene. In addition to current information, such as gene product and phenotype descriptions, the new locus page will also describe a gene product's cellular process, function and localization using a controlled vocabulary developed in collaboration with two other model organism databases. We describe these developments in SGD through the newly reorganized locus page. SGD is accessible via the WWW at http://genome-www.stanford.edu/Saccharomyces/
| GENERAL FORMAT OF SGD's NEW LOCUS PAGE |
|---|
The diverse information in the Saccharomyces Genome Database (SGD) is organized around individual genes. The locus page that organizes the information about each gene is critically important to database users. Since the completion of the yeast genomic sequence (1), genome-scale experiments have become increasingly routine. To integrate the large datasets from these experiments with existing information about the 6200 yeast genes, SGD will introduce a redesigned locus page that presents all of the information in a format intuitive to biologists. The locus page will continue to serve as a central information source, from which users will be able to retrieve a large set of data about any gene with minimal navigation. The data on the locus page will also be accessible in tab-delimited and XML formats to facilitate automated data exchange. We present here an overview of the new information, concentrating on two new features, the Gene Summary Paragraph and the function, process and cellular component descriptions, beginning with the way in which these will be organized on the locus page.
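As a rough illustration of how a tab-delimited export of locus data might be consumed by a script, the sketch below parses a small in-memory sample. The column names, the pipe-separated alias field and the sample row are all hypothetical placeholders chosen for this example; the actual layout of SGD's files is not specified here.

```python
import csv
import io

# Hypothetical export: the column names and the single row below are
# placeholders for illustration, not SGD's actual file layout.
SAMPLE_EXPORT = (
    "standard_name\tsystematic_name\taliases\tgene_product\n"
    "GENE1\tYXX000W\tALT1|ALT2\tplaceholder description\n"
)


def read_locus_records(text):
    """Parse tab-delimited text into a list of dicts, one per locus."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    records = []
    for row in reader:
        # Assumed convention: aliases are pipe-separated; empty means none.
        row["aliases"] = row["aliases"].split("|") if row["aliases"] else []
        records.append(row)
    return records


records = read_locus_records(SAMPLE_EXPORT)
for record in records:
    print(record["standard_name"], record["aliases"])
```

Reading each row into a dictionary keyed by column name keeps the script independent of column order, which may matter if fields are added to the export over time.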
To illustrate how SGD plans to incorporate some of the new data that has resulted from the completion of the yeast genomic sequence (1), a prototype of the new locus page for MET16 is shown in Figure 1. The left side of Figure 1 lists the information most users want to know about a particular gene, while the blue box on the right side contains links to tools and resources similar to those currently available from SGD's Gene/Sequence Resources page (http://genome-www2.stanford.edu/cgi-bin/SGD/seqTools). Links to additional information and resources are displayed as orange buttons at the bottom of Figure 1.
| BASIC INFORMATION |
|---|
Information about a gene's standard name and aliases along with gene product and phenotype descriptions will continue to be displayed on SGD's new locus page. A controlled vocabulary to describe mutant phenotypes is being developed to facilitate quick and accurate searches for genes with similar phenotypes.
Three new descriptions will be added to the display on the locus page: function, process and cellular component. These descriptions will come from a controlled vocabulary created by a cross-species project to describe the biological roles of individual gene products. In an ongoing collaboration, FlyBase (2), the Mouse Genome Database (MGD) (3) and SGD (4) are developing a Gene Ontology (Ashburner et al., manuscript in preparation and http://genome-www.stanford.edu/GO), a common vocabulary with defined relationships between controlled vocabulary terms that describes a gene product's biological objective, function and localization. The first category in the Gene Ontology is process, which describes the biological role or general cellular objective of the gene product. Examples of biological processes are karyogamy and amino acid biosynthesis. The second category is function, which describes the elemental activity or task performed by a gene product. DNA binding, ATPase and microtubule motor are examples of gene function. The third category, cellular component, describes subcellular structures, locations and macromolecular complexes in which a gene product may be found. For instance, mitochondrial outer membrane and spliceosome are cellular components. Even at the level of a single gene, distinguishing between functions and processes enhances its annotation; for example, we can state that a gene encodes a protein kinase (a function) and is involved in cell cycle progression and cellular morphogenesis (two biological processes). This controlled vocabulary will provide a method to identify genes with similar processes, functions or localizations between species.
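The three categories can be pictured as three independent sets of controlled terms attached to each gene. The following is a minimal sketch of such a record and of a lookup by shared term; the class, the helper function and the gene names (`GENE_A` and so on) are hypothetical, and the pairing of terms with genes is invented for illustration. Only the term strings themselves are taken from the examples in the text.

```python
from dataclasses import dataclass, field
from typing import List, Set

CATEGORIES = ("process", "function", "component")


@dataclass
class GeneAnnotation:
    """One gene with a set of controlled terms in each of three categories."""
    gene: str
    process: Set[str] = field(default_factory=set)
    function: Set[str] = field(default_factory=set)
    component: Set[str] = field(default_factory=set)


def genes_with_term(annotations: List[GeneAnnotation], category: str, term: str) -> List[str]:
    """Return the names of genes annotated with `term` in `category`."""
    if category not in CATEGORIES:
        raise ValueError(f"unknown category: {category}")
    return [a.gene for a in annotations if term in getattr(a, category)]


# Hypothetical genes; the term-to-gene assignments are illustrative only.
annotations = [
    GeneAnnotation(
        "GENE_A",
        process={"cell cycle progression", "cellular morphogenesis"},
        function={"protein kinase"},
    ),
    GeneAnnotation("GENE_B", process={"karyogamy"}, function={"microtubule motor"}),
    GeneAnnotation("GENE_C", process={"cell cycle progression"}, function={"DNA binding"}),
]

same_process = genes_with_term(annotations, "process", "cell cycle progression")
same_function = genes_with_term(annotations, "function", "protein kinase")
print(same_process, same_function)
```

Because every term comes from a shared vocabulary, the lookup is an exact string match; free-text descriptions would require fuzzy matching to achieve the same result.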
The usefulness of the distinction between process and function becomes still more apparent when attempting to interpret the results of large-scale experiments. As an example, Figure 2 shows a cluster of co-expressed yeast genes (originally published in ref. 5) with process and function annotated for each gene in the cluster. It is immediately obvious that genes whose products participate in a common process (in Fig. 2, methionine metabolism) tend to be co-expressed under these conditions, while function correlates much less with gene expression. There are other analyses, such as sequence comparisons, for which the function description will better illuminate relationships between genes or gene products.
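One way to make this comparison concrete is to ask, for a cluster of genes, what fraction share the most common term in each category. The sketch below does this for an invented cluster; the gene names, the function labels and the counts are placeholders and do not reproduce the data in Figure 2.

```python
from collections import Counter


def most_shared_term(cluster, category):
    """Return (term, fraction of genes carrying it) for the commonest term."""
    if not cluster:
        raise ValueError("cluster is empty")
    # Terms are stored as sets, so each gene counts at most once per term.
    counts = Counter(term for gene in cluster for term in gene[category])
    term, n_genes = counts.most_common(1)[0]
    return term, n_genes / len(cluster)


# Invented cluster for illustration; not the genes shown in Figure 2.
cluster = [
    {"gene": "GENE_1", "process": {"methionine metabolism"}, "function": {"function A"}},
    {"gene": "GENE_2", "process": {"methionine metabolism"}, "function": {"function B"}},
    {"gene": "GENE_3", "process": {"methionine metabolism"}, "function": {"function C"}},
    {"gene": "GENE_4", "process": {"other process"}, "function": {"function A"}},
]

process_term, process_share = most_shared_term(cluster, "process")
function_term, function_share = most_shared_term(cluster, "function")
print(process_term, process_share)
print(function_term, function_share)
```

In this toy cluster the commonest process term covers a larger fraction of genes than the commonest function term, which is the pattern the text describes for co-expressed genes.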
| GENE RESOURCES AND TOOLS |
|---|
Shown in the blue box on the right side of Figure 1 are tools and resources to analyze a particular gene. Tools in the new locus page will be organized by topic and selected through pull-down menus. Two links to literature resources will be available: the Gene Info Literature Guide, a resource at SGD that organizes the literature by topic, and a hyperlink to PubMed (http://www.ncbi.nlm.nih.gov/entrez/query.fcgi) for papers that mention yeast MET16. All of our current DNA and protein sequence retrieval options will be available, including six-frame translations with restriction maps and the non-systematic sequences available from GenBank (6). In addition, sequence analysis tools (4), maps (7) and comparison resources (8) will still be provided. A valuable new resource will be the search of functional analysis data, such as DNA microarray analysis (5,9,10) and deletion studies (11), that will display the large-scale systematic results available for a particular gene.
SGD will continue to link to external databases such as YPD (12), MIPS (13), SWISS-PROT (14) and PIR (15,16) so that users can easily find complementary information from other sources. Additional information, such as mapping data and contact information for researchers who work on the selected gene, will be just a click away at the bottom of the page (Fig. 1).
| GENE SUMMARY PARAGRAPH |
|---|
Another new feature that will be incorporated at SGD is the Gene Summary Paragraph, which will be a concise summary of the major aspects of the gene's biology published in the scientific literature. An example illustrating the Gene Summary Paragraph for MET16 is shown in Figure 3. In addition to being a resource for yeast biologists unfamiliar with a particular gene, these summaries will serve as an introduction to the gene and its product for researchers who may be entering SGD or the area of yeast biology for the first time. As additional genomic sequencing projects are completed and new homologs in other species are identified, it will become increasingly important to convey such information to researchers who have limited expertise in yeast genetics and molecular biology. Written by PhD-level biologists at SGD, Gene Summary Paragraphs will be extensively referenced. The list of references used to write the paragraph will be provided at the bottom of the page, along with links to each reference's abstract, PubMed record and related PubMed papers. An exhaustive list of references for a given gene can be found in the Gene Info Literature Guide described above. Because SGD enjoys a high level of interaction with its users, we both invite and expect suggestions from the yeast research community to update and improve these gene summaries.
| CONCLUSION |
|---|
As biological research enters the genomic era, the types and amount of information available can be overwhelming. In anticipation of increasing data from large-scale functional analysis projects and the detection of new sequence homologs, SGD is consolidating and improving the presentation of gene-specific information. Specifically, SGD has entered a collaboration with FlyBase and the MGD to create the Gene Ontology, which will describe the important components and relationships that reflect our current understanding of cell biology. It is expected that this cross-species project will allow database users to easily identify gene products with similar biological roles, protein functions or cellular localizations, within or between species, as well as improve the annotations within each of the collaborating databases. Additionally, SGD curators are creating concise Gene Summary Paragraphs that describe the published highlights of each gene's biology. The gene summaries will provide researchers with a convenient introduction to an unfamiliar topic or model organism.
| ACKNOWLEDGEMENTS |
|---|
SGD is supported by a P41, National Resources, grant from the National Human Genome Research Institute at the US National Institutes of Health. C.R.S. was supported by Human Genome Training Grant post-doctoral fellowship #HG-00044.
| FOOTNOTES |
|---|
* To whom correspondence should be addressed. Tel: +1 650 723 7541; Fax: +1 650 723 7016; Email: cherry@genome.stanford.edu
| REFERENCES |
|---|
1 Goffeau,A. et al. (1997) Nature, 387, 5.
2 FlyBase Consortium (1999) Nucleic Acids Res., 27, 85-88.
3 Blake,J.A., Richardson,J.E., Davisson,M.T. and Eppig,J.T. (1999) Nucleic Acids Res., 27, 95-98. Updated article in this issue: Nucleic Acids Res. (2000), 28, 108-111.
4 Cherry,J.M., Adler,C., Ball,C., Chervitz,S.A., Dwight,S.S., Hester,E.T., Jia,Y., Juvik,G., Roe,T., Schroeder,M., Weng,S. and Botstein,D. (1998) Nucleic Acids Res., 26, 73-79.
5 Spellman,P.T., Sherlock,G., Zhang,M.Q., Iyer,V.R., Anders,K., Eisen,M.B., Brown,P.O., Botstein,D. and Futcher,B. (1998) Mol. Biol. Cell, 9, 3273-3297.
6 Benson,D.A., Boguski,M.S., Lipman,D.J., Ostell,J., Ouellette,B.F., Rapp,B.A. and Wheeler,D.L. (1999) Nucleic Acids Res., 27, 12-17. Updated article in this issue: Nucleic Acids Res. (2000), 28, 15-18.
7 Cherry,J.M., Ball,C., Weng,S., Juvik,G., Schmidt,R., Adler,C., Dunn,B., Dwight,S., Riles,L., Mortimer,R.K. and Botstein,D. (1997) Nature, 387, 67-73.
8 Chervitz,S.A., Hester,E.T., Ball,C.A., Dolinski,K., Dwight,S.S., Harris,M.A., Juvik,G., Malekian,A., Roberts,S., Roe,T., Scafe,C., Schroeder,M., Sherlock,G., Weng,S., Zhu,Y., Cherry,J.M. and Botstein,D. (1999) Nucleic Acids Res., 27, 74-78.
9 Eisen,M.B., Spellman,P.T., Brown,P.O. and Botstein,D. (1998) Proc. Natl Acad. Sci. USA, 95, 14863-14868.
10 Ferea,T.L., Botstein,D., Brown,P.O. and Rosenzweig,R.F. (1999) Proc. Natl Acad. Sci. USA, 96, 9721-9726.
11 Winzeler,E.A., Shoemaker,D.D., Astromoff,A., Liang,H., Anderson,K., Andre,B., Bangham,R., Benito,R., Boeke,J.D., Bussey,H., Chu,A.M., Connelly,C., Davis,K., Dietrich,F., Dow,S.W., El Bakkoury,M., Foury,F., Friend,S.H., Gentalen,E., Giaever,G., Hegemann,J.H., Jones,T., Laub,M., Liao,H., Davis,R.W. et al. (1999) Science, 285, 901-906.
12 Hodges,P.E., McKee,A.H., Davis,B.P., Payne,W.E. and Garrels,J.I. (1999) Nucleic Acids Res., 27, 69-73. Updated article in this issue: Nucleic Acids Res. (2000), 28, 73-76.
13 Mewes,H.W., Heumann,K., Kaps,A., Mayer,K., Pfeiffer,F., Stocker,S. and Frishman,D. (1999) Nucleic Acids Res., 27, 44-48. Updated article in this issue: Nucleic Acids Res. (2000), 28, 37-40.
14 Bairoch,A. and Apweiler,R. (1999) Nucleic Acids Res., 27, 49-54. Updated article in this issue: Nucleic Acids Res. (2000), 28, 45-48.
15 Srinivasarao,G.Y., Yeh,L.S., Marzec,C.R., Orcutt,B.C., Barker,W.C. and Pfeiffer,F. (1999) Nucleic Acids Res., 27, 284-285.
16 Srinivasarao,G.Y., Yeh,L.S., Marzec,C.R., Orcutt,B.C. and Barker,W.C. (1999) Bioinformatics, 15, 382-390.