| Nucleic Acids Research | Pages |
Using the Saccharomyces Genome Database (SGD) for analysis of protein similarities and structure
ABSTRACT
INTRODUCTION
The Saccharomyces Genome Database (SGD) provides the scientific community with access to the Saccharomyces cerevisiae sequence and the wealth of information associated with it. The database contains a variety of biological information, including the complete, annotated DNA and protein sequences, along with several tools for sequence analysis. Many of these features have been described recently (1,2). Here we focus on features of SGD that give users tools for comparing yeast protein sequences and examining protein structure. Sequence comparisons play a critical role in the initial process of determining the function of specific proteins and in interpreting new protein sequence data from large-scale genome sequencing projects. Of the several sequence comparison tools at SGD, we discuss the Genome-wide Protein Similarity View program, a powerful tool for examining protein similarities. As with sequence information, the amount of structural information is growing. Sacch3D is a feature of SGD that organizes and presents structural information about yeast proteins and their putative homologs. Familiarity with tools such as those described here will enable molecular biologists and geneticists to gain insight into the function and possible evolution of their protein of interest.
EXAMINING PROTEIN SIMILARITIES AT SGD
The Genome-wide Protein Similarity View (GPSV; http://genome-www.stanford.edu/cgi-bin/SGD/SWA/swaEntryForm.pl) displays, either graphically or in a table, all the ORFs in the S.cerevisiae genome that are similar to a given query ORF, based on a Smith-Waterman protein sequence comparison (3). The Smith-Waterman comparisons were run on a TimeLogic DeCypher II machine using the affine Smith-Waterman application (4). This system uses the pscorer program to calculate a P-value (5). More details of the Smith-Waterman alignment are available at the GPSV help page (URL in Table 1).
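For readers unfamiliar with the method, the following is a minimal sketch of a Smith-Waterman local alignment score with affine gap penalties (the Gotoh recurrences). It is not the DeCypher implementation and does not compute a P-value. The function name, the simple match/mismatch scoring, and the gap penalties are illustrative assumptions; a real protein comparison would use an amino acid substitution matrix.

```python
def smith_waterman_affine(a, b, match=2, mismatch=-1, gap_open=-3, gap_extend=-1):
    """Best local alignment score of strings a and b with affine gaps.

    Assumed convention: a gap of length k costs
    gap_open + (k - 1) * gap_extend (both values are negative).
    All scoring parameters are illustrative, not those used at SGD.
    """
    neg_inf = float("-inf")
    rows, cols = len(a) + 1, len(b) + 1
    # H[i][j]: best score of a local alignment ending at a[i-1], b[j-1]
    # E[i][j]: best score ending with a gap that consumes b[j-1]
    # F[i][j]: best score ending with a gap that consumes a[i-1]
    H = [[0] * cols for _ in range(rows)]
    E = [[neg_inf] * cols for _ in range(rows)]
    F = [[neg_inf] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            # Either open a new gap or extend an existing one.
            E[i][j] = max(H[i][j - 1] + gap_open, E[i][j - 1] + gap_extend)
            F[i][j] = max(H[i - 1][j] + gap_open, F[i - 1][j] + gap_extend)
            sub = match if a[i - 1] == b[j - 1] else mismatch
            # The floor at 0 is what makes the alignment local.
            H[i][j] = max(0, H[i - 1][j - 1] + sub, E[i][j], F[i][j])
            best = max(best, H[i][j])
    return best


if __name__ == "__main__":
    # Two-residue insertion: 8 matches (16) + one gap of length 2 (-3 - 1)
    print(smith_waterman_affine("AAAACCCC", "AAAATTCCCC"))
```

With these assumed parameters the two-residue insertion is scored as a single gap (opening plus one extension) rather than as two independent gaps, which is the point of the affine model.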
Table 1.
The GPSV graphic view, a Java applet, represents all 16 yeast nuclear chromosomes as horizontal black bars, with centromeres and positional coordinates indicated (Fig.
Figure 1. The Genome-wide Protein Similarity View page. Features are discussed in detail in the text. (A) The Genome-wide Protein Similarity View graphic view; (B) Genome-wide Protein Similarity View parameters.
Immediately below the graphic display are seven fields that contain additional information about the query and target ORFs (Fig.
Each similarity bar can be clicked to reach more information about the target ORF. Options include links to the SGD Locus and Gene/Sequence Resources pages for the target ORF, an alignment of the query and target amino acid sequences, the DNA Similarity View, which displays the alignment of the target and query ORF DNA sequences, and the Protein Similarity View, where the selected target ORF is used as the query ORF.
The protein similarity data can also be displayed as a table, which can be accessed from the graphic display page or from the ORF input form. The table lists target ORFs in order of decreasing similarity to the query ORF, as determined by P-value; the target ORFs can also be sorted by percent identity. For each target ORF, the table lists the same information and links as the graphic display.
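The two orderings described above can be sketched as follows. The record layout, the field names, and every value are made up for illustration and are not taken from SGD. Decreasing similarity by P-value corresponds to ascending P-value, while sorting by percent identity is descending.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    target_orf: str          # placeholder identifier, not a real ORF name
    p_value: float           # hypothetical value
    percent_identity: float  # hypothetical value


hits = [
    Hit("TARGET_1", 1e-12, 41.0),
    Hit("TARGET_2", 3e-40, 38.5),
    Hit("TARGET_3", 2e-05, 55.2),
]

# Most similar first: the smallest P-value is the most significant match.
by_p_value = sorted(hits, key=lambda h: h.p_value)

# Alternative ordering: highest percent identity first.
by_identity = sorted(hits, key=lambda h: h.percent_identity, reverse=True)

print([h.target_orf for h in by_p_value])
print([h.target_orf for h in by_identity])
```

The two orderings need not agree, since a long alignment of moderate identity can be more significant than a short alignment of high identity.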
PROTEIN STRUCTURE AT SGD
The Sacch3D feature at SGD (http://genome-www.stanford.edu/Sacch3D/) provides structural information for S.cerevisiae proteins by integrating data from SGD and structural databases and presenting it as up-to-date, concise summaries with links to structural resources. Sacch3D supplies researchers both within and outside the yeast community with insight into the structure and putative function of yeast proteins. Structural information for Sacch3D is obtained primarily by BLASTP analysis (6,7) of the Brookhaven Protein Data Bank (PDB) (8,9) to identify all PDB structures with significant sequence (and therefore likely structural) similarity to yeast proteins. Results are updated monthly to keep pace with the growth of the PDB. To reduce the redundancy in the PDB and thus simplify the BLAST analysis, all PDB protein sequences are first clustered into groups of closely related sequences before the BLAST is run (see the on-line help at the Sacch3D website for details; Table 1). As of September 1998, 18% of yeast proteins have either a known structure or a putative homolog in a clustered version of the PDB (Table 2). The Sacch3D search utility provides a structural information page for all ORFs in the yeast genome (example shown in Fig.
Figure 2. Structural information page at Sacch3D. Structural information page for yeast GTP-binding protein GPA2/YER020W.
Figure 3. (A) A screen shot of the Java Viewer (11) showing the structure of a human single-stranded DNA-binding domain of the RPA70 subunit bound to ssDNA (PDB structure 1JMC; yeast homolog RFA1/YAR009C). Yellow, region of similarity to yeast protein; white, ssDNA; gray, region without similarity to yeast protein; red, disulfide bond. (B) A screen shot of the RasMol viewer (10) showing the human TATA-binding protein complex with TATA element DNA (PDB structure 1TGH; yeast homolog SPT15/YER148W). Gold, β sheets; red, α helices; thin white ribbon, coil; wide gray ribbon, DNA.
For yeast proteins without a known structure but with significant sequence similarity to proteins with structures contained within PDB, links are available on the structural information page for homology-based models of the yeast protein structure. These models are accessed by links to the external resources ModBase (17) and Swiss-Model (20). Even for yeast proteins that lack significant similarities in the PDB, a variety of useful links are presented. These include links to secondary structure predictions, several pre-computed BLAST reports, and the Emotif (21) and Pfam (22) sequence search programs. Links to Swiss-Prot, Entrez and the NCBI COGs site for the yeast protein are also included.
Table 2.
| | N | Percent |
| Total yeast proteins | 6215 | 100 |
| Identified genetically | 3086 | 50 |
| With known 3D structure(s)a | 52 | 0.8 |
| With PDB homolog(s) | 1110 | 18 |
| With PDB homolog and identified genetically | 848 | 13 |
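As a quick arithmetic check, the percentages in Table 2 can be recomputed from the counts and the total of 6215 yeast proteins. This is only a sketch: the dictionary keys are shortened labels chosen here, and the one-decimal rounding is an assumption (the table reports mostly whole percentages).

```python
TOTAL_YEAST_PROTEINS = 6215  # "Total yeast proteins" row of Table 2

# Counts copied from Table 2; the labels are abbreviated for this sketch.
counts = {
    "identified_genetically": 3086,
    "known_3d_structure": 52,
    "pdb_homolog": 1110,
    "pdb_homolog_and_genetic": 848,
}

# Percentage of all yeast proteins, rounded to one decimal place.
percent = {
    label: round(100 * n / TOTAL_YEAST_PROTEINS, 1)
    for label, n in counts.items()
}

for label, value in percent.items():
    print(f"{label}: {value}%")
```

Computed this way, the first three rows agree with the table after rounding to the precision shown, while the last row comes to about 13.6%, which the table lists as 13.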
Other Sacch3D features include: (i) flexible search options using yeast gene or ORF name, PDB identifier, Swiss-Prot identifier/accession, or text; (ii) a special page devoted to S.cerevisiae structures in PDB, showing the number of different structures for each yeast protein with links to SGD and Sacch3D; (iii) a structural URLs page, a maintained list of web sites relevant to the analysis of protein structure and/or function, including structural biology resources, 3D viewers, genome-analysis web sites, journals and research groups; (iv) a domains page providing access to yeast proteins based on their SCOP-classified domains, searchable by yeast gene/ORF name, SCOP class number, or SCOP fold number, with links to the WebMol Java viewer (11) that illustrate the location of the domain within the context of the 3D structure; (v) an analysis page that performs an electronic version of a Southern blot using the yeast genomic sequence; (vi) a What's New page that lists new features in Sacch3D as well as new yeast protein structures and new protein structures with homology to yeast proteins.
FUTURE DIRECTIONS
Protein sequence analysis and structure prediction will continue to be updated. However, in the future the major new analysis features of SGD will be associated with the many types of functional genomic results that are beginning to be released.
CITING SGD
When referring to Sacch3D and the Genome-wide Protein Similarity View at SGD, please cite this publication.
REFERENCES
This page is run by Oxford University Press, Great Clarendon Street, Oxford OX2 6DP, as part of the OUP Journals
Last modification: 9 Dec 1998
Copyright©Oxford University Press, 1998.