

Nucleic Acids Research Advance Access originally published online on October 16, 2007
Nucleic Acids Research 2008 36(Database issue):D854-D859; doi:10.1093/nar/gkm729

Nucleic Acids Research, 2008, Vol. 36, Database issue D854-D859
© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article appears in the following Nucleic Acids Research issue: Database issue

Articles

Cyclebase.org—a comprehensive multi-organism online database of cell-cycle experiments

Nicholas Paul Gauthier1, Malene Erup Larsen1, Rasmus Wernersson1, Ulrik de Lichtenberg1, Lars Juhl Jensen2, Søren Brunak1,* and Thomas Skøt Jensen1

1Center for Biological Sequence Analysis, BioCentrum-DTU, Technical University of Denmark, Building 208, DK-2800 Lyngby, Denmark and 2European Molecular Biology Laboratory, Meyerhofstrasse 1, D-69117 Heidelberg, Germany

*To whom correspondence should be addressed. Tel: +011 45 45 25 24 77; Fax: +011 45 45 93 1585; Email: brunak{at}cbs.dtu.dk

Received August 15, 2007. Accepted September 1, 2007.


    ABSTRACT
 
The past decade has seen the publication of a large number of cell-cycle microarray studies and many more are in the pipeline. However, data from these experiments are not easy to access, combine and evaluate. We have developed a centralized database with an easy-to-use interface, Cyclebase.org, for viewing and downloading these data. The user interface facilitates searches for genes of interest as well as downloads of genome-wide results. Individual genes are displayed with graphs of expression profiles throughout the cell cycle from all available experiments. These expression profiles are normalized to a common timescale to enable inspection of the combined experimental evidence. Furthermore, state-of-the-art computational analyses provide key information on both individual experiments and combined datasets such as whether or not a gene is periodically expressed and, if so, the time of peak expression. Cyclebase is available at http://www.cyclebase.org.


    INTRODUCTION
 
The cell division cycle is one of the most fundamental processes of life, allowing cells to multiply and faithfully pass on their genetic information to future generations. The full complexity of this process became apparent a decade ago with the first genome-wide microarray studies of the mitotic cell cycle of budding yeast (1,2). Since then, numerous other microarray studies have been published on the cell cycle of the budding yeast Saccharomyces cerevisiae (3,4), the fission yeast Schizosaccharomyces pombe (5–7), human (8) and the plant Arabidopsis thaliana (9).

Accessing, analyzing and comparing these many datasets has unfortunately remained difficult for a variety of reasons. First, there is no single database from which one can download all the datasets in a unified file format. The expression profiles for each experiment are often stored on individual websites. Second, the same gene identifiers are not used across datasets, making it difficult to compare expression profiles from different studies on the same organism. Third, a variety of different methods have, with varying success, been used for identifying the significantly regulated genes (1–28). The use of many different algorithms has introduced uncertainty as to which is the correct set of cell-cycle regulated genes. Fourth, new experimental studies tend to disregard already existing expression data, and thus only evaluate cell-cycle regulation based on their own experiments. Finally, general microarray repositories, analysis methods and visualization tools have by nature not been designed to meet the specific needs of the cell-cycle community.

Here, we present Cyclebase.org, a database and web resource of cell-cycle microarray expression datasets (see Table 1 for an overview of the datasets included in Cyclebase). These datasets have been mapped to common gene identifiers and normalized onto a common timescale, facilitating direct comparison of expression profiles between all experiments within an organism. The web interface provides a good visual overview of all available expression data on a given gene, as well as the results from state-of-the-art computational analyses. This interface aids the user in interpreting the combined evidence on the cell-cycle regulation of a given gene.



 
Table 1. Summary of cell-cycle microarray experiments in Cyclebase

 

    PRESENTING CYCLEBASE
 
The interface of Cyclebase is designed to make it as simple as possible for users to find and browse genes of interest. Searching for key terms such as standard gene names (e.g. HTA2), systematic names (e.g. YBL003C) or descriptions (e.g. histone) will produce a list of candidate genes for inspection. Genes in this list are initially sorted by their match to the search criteria and then in ascending order on the cell-cycle rank score (most periodic genes at the top). The list can be sorted on any of the other columns simply by clicking them. In addition, an advanced search page allows the user to browse for genes that match certain criteria; for example, it allows researchers to find, among the 100 most periodic human genes, those that peak in S-phase.
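The default ordering of search results described above (match quality first, then cell-cycle rank) can be sketched in a few lines of Python. This is only an illustration: the `GeneHit` record, the `match_quality` tiers and the rank values in the usage example are assumptions made for the sketch, not the actual Cyclebase implementation.

```python
from dataclasses import dataclass


@dataclass
class GeneHit:
    """Hypothetical search record; field names are illustrative only."""
    name: str         # standard gene name, e.g. "HTA2"
    systematic: str   # systematic name, e.g. "YBL003C"
    description: str  # free-text description, e.g. "histone H2A"
    rank: int         # cell-cycle rank, 1 = most periodic


def match_quality(hit, query):
    """Assumed tiers: 0 exact name, 1 name prefix, 2 description, 3 no match."""
    q = query.lower()
    names = (hit.name.lower(), hit.systematic.lower())
    if q in names:
        return 0
    if any(n.startswith(q) for n in names):
        return 1
    if q in hit.description.lower():
        return 2
    return 3


def search(hits, query):
    """Return matching genes sorted by match quality, then by ascending rank."""
    matched = [h for h in hits if match_quality(h, query) < 3]
    return sorted(matched, key=lambda h: (match_quality(h, query), h.rank))
```

With this ordering, a query such as "histone" lists all description matches with the most periodic gene first, whereas a query for an exact gene name places that gene at the top regardless of its rank.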

When a gene of interest has been selected, or if a query is entered that matches only a single gene, the user is taken to the Gene Details page (Figure 1). This page is the primary interface for viewing expression profiles, key results from statistical analyses and general information about the gene in question. By default, the statistical results are based on all available experiments. Expression profiles and analysis results for the individual experiments can be accessed by clicking on a single experiment in the experiments list (Figure 1A).


 
Figure 1. Screenshot for budding yeast CLB1. The figure shows the Gene Details Page for the gene CLB1 (a cyclin). (A) The list of experiments in which the gene is measured. Clicking any of these takes the user to another Gene Details Page with only data from that particular experiment. (B) Expression profile chart. The experiments are normalized and aligned onto a common time-scale (in percent of the cell cycle). The individual phases are marked along the time axis and the computationally determined peaktime is marked by a red dot. (C) Summary of the computational analysis based on all data available for this gene in Cyclebase. ‘Rank’ signifies that this is the 78th most periodic gene in budding yeast, ‘P(per)’ and ‘P(reg)’ are P-values that quantify the significance of periodicity and regulation, respectively, and ‘peaktime’ estimates how far into the cell cycle (from M/G1) the gene is maximally expressed. (D) Schematic illustration of the peaktime (red dot) and phase duration. The gene CLB1 peaks 63% into the cell cycle, corresponding to the middle of G2 phase in budding yeast. (E) Gene aliases and description. (F) Download of data in various formats. (G) Database documentation and download.

 
To allow for inspection of the accumulated evidence for transcriptional regulation during the cell cycle, all available expression data for a gene of interest are depicted in the expression profile chart (Figure 1B). Easy comparison of different experiments is obtained by placing each profile onto a common time scale, which we have chosen to be in percent of the cell division cycle with zero corresponding to cytokinesis (M/G1-transition) (16,29,30). Such normalization is necessary as the individual experiments vary greatly in their absolute interdivision times, depending on the experimental conditions. Subsequent alignment of the timescales is also necessary, because different experiments release the cells from different points in the cell cycle. Finally, the expression values have been normalized to a standard deviation of one over the entire experiment to further aid comparison across experiments.
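The three normalization steps described above (rescaling time by the interdivision time, aligning for the release point, and scaling expression values to unit standard deviation) can be illustrated with a minimal Python sketch. The function names, the `offset_percent` parameter and the choice to scale each profile separately without mean-centering are assumptions made for illustration; the procedure actually used in Cyclebase is described in the cited references.

```python
import statistics


def to_percent_of_cycle(times_min, interdivision_min, offset_percent):
    """Map sampling times (minutes after release) to percent of the cell cycle.

    interdivision_min: length of one cell cycle in this experiment (minutes).
    offset_percent: assumed position of the cells at release, in percent of
    the cycle, with 0 corresponding to cytokinesis (M/G1 transition).
    Values above 100 indicate a second or later cycle.
    """
    if interdivision_min <= 0:
        raise ValueError("interdivision time must be positive")
    return [offset_percent + 100.0 * t / interdivision_min for t in times_min]


def scale_to_unit_sd(values):
    """Divide a profile by its standard deviation (no mean-centering)."""
    sd = statistics.pstdev(values)
    if sd == 0:
        return list(values)  # flat profile: nothing to scale
    return [v / sd for v in values]
```

For example, an experiment with an 80-minute interdivision time whose cells are released 25% into the cycle maps the samples taken at 0, 40 and 80 minutes to 25%, 75% and 125% of the cell cycle.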

To provide an unbiased and comparable assessment of the expression data, a common computational analysis framework has been applied to all datasets in the database. For every expression profile, two P-values are calculated that assess the significance of periodicity and regulation (16). The P-values are summarized across all experiments in an organism and combined into a final score, which is used to rank all genes in the genome (16) (Figure 1C). A brief explanation of the algorithms is provided in the Methods section of Cyclebase.
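To make the idea of combining per-experiment P-values into a genome-wide ranking concrete, the following Python sketch sums the log-transformed P-values for periodicity and regulation over all experiments and ranks genes by the total. The simple product of P-values used here is an assumption chosen for brevity and is not necessarily the scoring scheme of (16); the function names and the input layout are likewise hypothetical.

```python
import math


def combine(pvalues):
    """Sum of log10 P-values (the log of their product); lower is stronger."""
    # Clamp at a tiny positive value so that log10(0) cannot occur.
    return sum(math.log10(max(p, 1e-300)) for p in pvalues)


def rank_genes(evidence):
    """Rank genes by combined evidence.

    evidence: dict mapping gene -> list of (p_per, p_reg) tuples,
    one tuple per experiment in which the gene was measured.
    Returns a dict mapping gene -> rank, with 1 the most periodic gene.
    """
    scores = {}
    for gene, pairs in evidence.items():
        p_per = [per for per, _ in pairs]
        p_reg = [reg for _, reg in pairs]
        scores[gene] = combine(p_per) + combine(p_reg)
    ordered = sorted(scores, key=scores.get)
    return {gene: i + 1 for i, gene in enumerate(ordered)}
```

Note that a plain product favors genes measured in many experiments, so a production scheme would need to account for missing measurements; this sketch deliberately ignores that issue.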

Based on independent benchmarking, this methodology has previously been shown to be as good as or superior to all other published methods for identifying periodically expressed genes (16,29,30). We have expanded this benchmark to also include recent methods (1,2,5–28) and experiments (Figure 2). Benchmark sets were compiled that are enriched in cell-cycle regulated genes from targets of known cell-cycle transcription factors (16,29,30). We benchmarked each method's ability to retrieve genes in these sets. Figure 2 displays the benchmarking results, which show that the method used in Cyclebase provides clear improvements over other methods and that combining all data for an organism is, not surprisingly, superior to any single dataset analyzed on its own. Based on the benchmarks, we have selected a set of significantly periodically expressed genes within each organism (labeled with a small ‘Periodic’ icon). We found 600 periodic genes in budding yeast, 500 in fission yeast, 600 in human and 400 in the plant A. thaliana. For these periodic genes, we compute the ‘peaktime’ based on all available expression profiles (16).


 
Figure 2. Benchmark of methods for identifying cell-cycle regulated genes. For each of the four organisms, a benchmark set was compiled of genes whose promoters are bound by known cell-cycle transcription factors (16,29,30), under the assumption that these genes should be highly overlapping with those that display cell-cycle regulation at the transcriptional level (i.e. periodic expression). The panels show the fraction of a benchmark set retrieved as a function of the number of genes suggested for each individual method (1,2,5–28). Better methods should therefore be towards the upper left corner of the plot. Methods that provide a ranked list of genes are displayed as a line, whereas those that only supply an unranked set of genes appear in the plots as a cross or plus sign. The black dotted line corresponds to picking genes randomly. In all four organisms, the combined analysis of all data within an organism presented by Cyclebase outperforms all existing methods or suggested sets of periodically expressed genes. In all organisms, the curves eventually display the same slope as the random performance curve (black dotted), indicating that including more genes from this point on yields no enrichment in genes from the benchmark set.
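The curves in Figure 2 plot the fraction of the benchmark set retrieved against the number of genes suggested. A minimal Python sketch of how such a curve and its random baseline could be computed is given below; the function names are hypothetical and the sketch is not the benchmarking code used for the figure.

```python
def recall_curve(ranked_genes, benchmark):
    """Fraction of the benchmark set retrieved after each suggested gene.

    ranked_genes: genes ordered from most to least periodic by some method.
    benchmark: collection of genes assumed to be cell-cycle regulated.
    """
    benchmark = set(benchmark)
    if not benchmark:
        raise ValueError("benchmark set must not be empty")
    found = 0
    curve = []
    for gene in ranked_genes:
        if gene in benchmark:
            found += 1
        curve.append(found / len(benchmark))
    return curve


def random_expectation(n_selected, genome_size):
    """Expected benchmark fraction retrieved when picking genes at random."""
    return n_selected / genome_size
```

A method that only supplies an unranked set of genes corresponds to a single point: the set size on the horizontal axis and the fraction of the benchmark set it contains on the vertical axis.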

 
The peaktime is a measure of when in the cell cycle a given gene is maximally expressed, and represents a summary of all the expression data (16). The peaktime is given as percent into the cell cycle (from when the new cell is born in cytokinesis) and is depicted as a red dot in both the expression profile chart (Figure 1B) and the peaktime chart (Figure 1D). The phase length can vary widely from organism to organism (e.g. G2-phase occupies ~60–70% of the cell cycle in fission yeast versus only ~25% in budding yeast), and the peaktime chart is therefore drawn differently for each species. Consequently, the peaktime values cannot be directly compared across organisms, since a specific percent (e.g. 60%) into the cell cycle may correspond to different phases in different organisms. The peaktime is only computed for genes that display periodicity and the remaining genes are labeled with ‘uncertain’ for the peaktime value. This label is also used if the different experiments disagree too much for a peaktime to be reliably assigned (16).
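Because the peaktime is a position on a cycle, peak estimates of 95% and 5% from two experiments agree closely even though their arithmetic mean (50%) would be misleading. The Python sketch below therefore averages peak times on a circle and reports ‘uncertain’ when the experiments disagree too much. The circular mean and the `min_agreement` threshold are assumptions made for illustration; the procedure actually used to compute the peaktime is the one described in (16).

```python
import math


def combine_peaktimes(peaks_percent, min_agreement=0.5):
    """Combine per-experiment peak times given in percent of the cell cycle.

    Returns the circular mean in percent (0 <= value < 100), or the string
    "uncertain" if there are no estimates or they are too scattered.
    min_agreement is a hypothetical threshold on the mean resultant length
    (1 = all experiments identical, 0 = evenly scattered around the cycle).
    """
    if not peaks_percent:
        return "uncertain"
    angles = [2.0 * math.pi * p / 100.0 for p in peaks_percent]
    x = sum(math.cos(a) for a in angles) / len(angles)
    y = sum(math.sin(a) for a in angles) / len(angles)
    agreement = math.hypot(x, y)
    if agreement < min_agreement:
        return "uncertain"
    # atan2 gives an angle in (-pi, pi]; convert to percent and wrap.
    return (math.atan2(y, x) / (2.0 * math.pi) * 100.0) % 100.0
```

Two experiments peaking at 60% and 66% combine to 63%, whereas estimates at 0% and 50% point in opposite directions on the cycle and are reported as ‘uncertain’.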

When comparing expression data across experiments, one issue is that different gene names for the same gene have been used in the different experiments. We have solved this problem by combining expression data and key results based on systematic gene identifiers. When they exist, a list of aliases is provided in the Gene Details page (Figure 1E), allowing the user to relate to the original experiment and to crosslink to external databases. The Gene Details page also contains a functional description (Figure 1E) populated from external databases (31–35) and is therefore not available for all genes.
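The mapping of experiment-specific gene names onto systematic identifiers can be sketched as a simple lookup followed by a merge. The data layout and function names below are hypothetical and chosen only for illustration; only the HTA2/YBL003C pair is taken from the text above.

```python
from collections import defaultdict


def build_alias_index(alias_table):
    """Build a case-insensitive lookup from any known name to a systematic ID.

    alias_table: dict mapping systematic ID -> iterable of aliases.
    """
    index = {}
    for systematic, aliases in alias_table.items():
        for name in (systematic, *aliases):
            index[name.upper()] = systematic
    return index


def merge_by_systematic_id(datasets, index):
    """Group expression profiles from all experiments under systematic IDs.

    datasets: dict mapping experiment name -> {gene name: profile}.
    Names that cannot be mapped are skipped rather than guessed.
    """
    merged = defaultdict(dict)
    for experiment, profiles in datasets.items():
        for name, profile in profiles.items():
            systematic = index.get(name.upper())
            if systematic is None:
                continue
            merged[systematic][experiment] = profile
    return dict(merged)
```

In this sketch an alias shared by two genes would silently resolve to whichever gene was indexed last, so a real mapping would have to detect and handle such ambiguous names explicitly.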

All Cyclebase analysis results are available for download, both as values for individual genes and as whole-experiment datasets. XML and tab-delimited formats are available, both of which are fully documented on the website. Furthermore, where permission has been granted from the original authors, expression profile datasets are also available for download. Every page in Cyclebase also contains links to information about the database (FAQ and Methods), information about the individual experiments, and a link to the datasets available for download (Figure 1G).


    OUTLOOK
 
Many more cell-cycle experiments may be performed in the future, and we encourage researchers to contact us so that new cell-cycle experiments can be analyzed consistently and included in Cyclebase. As other types of large-scale experiments (e.g. metabolite information, kinase activity or protein expression) become available, it will become imperative that researchers integrate and analyze these data together with existing datasets. Cyclebase has been designed to store diverse data types from time-series experiments and we intend for Cyclebase to become a standard interface and tool for combining cell-cycle datasets beyond transcriptional regulation. This would give researchers a one-stop shop for visualizing and downloading time-series events from the cell cycle.


    ACKNOWLEDGEMENTS
 
The authors wish to thank Hans-Henrik Stæfeldt, Kristoffer Rapacki and Peter W. Sacket for technical help with the database. This work was supported by grants from the Villum Kahn Rasmussen Foundation, the Danish Technical Research Council, as well as the BioSapiens Network of Excellence (LSHG-CT-2003-503265) funded by the European Commission FP6 Programme. Funding to pay the Open Access publication charges for this article was provided by the Villum Kahn Rasmussen Foundation.

Conflict of interest statement. None declared.


    Footnotes
 
Present address: Ulrik de Lichtenberg, LEO Pharma, Industriparken 55, DK-2750 Ballerup, Denmark.


    REFERENCES
 

  1. Cho RJ, Campbell MJ, Winzeler EA, Steinmetz L, Conway A, Wodicka L, Wolfsberg TG, Gabrielian AE, et al. A genome-wide transcriptional analysis of the mitotic cell cycle. Mol. Cell (1998) 2:65–73.

  2. Spellman PT, Sherlock G, Zhang MQ, Iyer VR, Anders K, Eisen MB, Brown PO, Botstein D, Futcher B. Comprehensive identification of cell cycle-regulated genes of the yeast S. cerevisiae by microarray hybridization. Mol. Biol. Cell (1998) 9:3273–3297.

  3. de Lichtenberg U, Wernersson R, Jensen TS, Nielsen HB, Fausbøll A, Schmidt P, Hansen FB, Knudsen S, Brunak S. New weakly expressed cell cycle-regulated genes in yeast. Yeast (2005) 22:1191–1201.

  4. Pramila T, Wu W, Miles S, Noble WS, Breeden LL. The forkhead transcription factor Hcm1 regulates chromosome segregation genes and fills the S-phase gap in the transcriptional circuitry of the cell cycle. Genes Dev. (2006) 20:2266–2278.

  5. Rustici G, Mata J, Kivinen K, Lió P, Penkett CJ, Burns G, Hayles J, Nurse P, et al. Periodic gene expression program of the fission yeast cell cycle. Nature Genet. (2004) 36:809–817.

  6. Peng X, Karuturi RK, Miller LD, Lin K, Jia Y, Kondu P, Wang L, Wong L, Liu ET, et al. Identification of cell cycle-regulated genes in fission yeast. Mol. Biol. Cell (2005) 16:1026–1042.

  7. Oliva A, Rosebrock A, Ferrezuelo F, Pyne S, Chen H, Skiena S, Futcher B, Leatherwood J. The cell cycle-regulated genes of Schizosaccharomyces pombe. PLoS Biol. (2005) 3:e225.

  8. Whitfield ML, Sherlock G, Saldanha AJ, Murray JI, Ball CA, Alexander KE, Matese JC, Perou C, Hurt MM, et al. Identification of genes periodically expressed in the human cell cycle and their expression in tumors. Mol. Biol. Cell (2002) 13:1977–2000.

  9. Menges M, Hennig L, Gruissem W, Murray JAH. Genome-wide gene expression in an Arabidopsis cell suspension. Plant Mol. Biol. (2003) 53:423–442.

  10. Zhao LP, Prentice R, Breeden L. Statistical modeling of large microarray data sets to identify stimulus-response profiles. Proc. Natl Acad. Sci. USA (2001) 98:5631–5636.

  11. Johansson D, Lindgren P, Berglund A. A multivariate approach applied to microarray data for identification of genes with cell cycle-coupled transcription. Bioinformatics (2003) 19:467–473.

  12. Langmead C, Yan T, McClung CR, Donald BR. Phase-independent rhythmic analysis of genome-wide expression patterns. J. Comput. Biol. (2003) 10:521–536.

  13. Lu X, Zhang W, Qin ZS, Kwast KE, Liu JS. Statistical resynchronization and Bayesian detection of periodically expressed genes. Nucleic Acids Res. (2004) 32:447–455.

  14. Wichert S, Fokianos K, Strimmer K. Identifying periodically expressed transcripts in microarray time series data. Bioinformatics (2004) 20:5–20.

  15. Luan Y, Li H. Model-based methods for identifying periodically expressed genes based on time course microarray gene expression data. Bioinformatics (2004) 20:332–339.

  16. de Lichtenberg U, Jensen LJ, Fausbøll A, Jensen TS, Bork P, Brunak S. Comparison of computational methods for the identification of cell cycle regulated genes. Bioinformatics (2005) 21:1164–1171.

  17. Ahdesmaki M, Lahdesmaki H, Pearson R, Huttunen H, Yli-Harja O. Robust detection of periodic time series measured from biological systems. BMC Bioinformatics (2005) 6:117.

  18. Chen J. Identification of significant periodic genes in microarray gene expression data. BMC Bioinformatics (2005) 6:286.

  19. Willbrand K, Radvanyi F, Nadal J-P, Thiery J-P, Fink TMA. Identifying genes from up-down properties of microarray expression series. Bioinformatics (2005) 21:3859–3864.

  20. Qiu P, Wang ZJ, Liu KJR. Tracking the herd: resynchronization analysis of cell-cycle gene expression data in Saccharomyces cerevisiae. Conf. Proc. IEEE Eng. Med. Biol. Soc. (2005) 5:4826–4829.

  21. Qiu P, Wang ZJ, Liu KJR. Polynomial model approach for resynchronization analysis of cell-cycle gene expression data. Bioinformatics (2006) 22:959–966.

  22. Ahnert SE, Willbrand K, Brown FCS, Fink TMA. Unbiased pattern detection in microarray data series. Bioinformatics (2006) 22:1471–1476.

  23. Andersson CR, Isaksson A, Gustafsson MG. Bayesian detection of periodic mRNA time profiles without use of training examples. BMC Bioinformatics (2006) 7:63.

  24. Glynn EF, Chen J, Mushegian AR. Detecting periodic patterns in unevenly spaced gene expression time series using Lomb-Scargle periodograms. Bioinformatics (2006) 22:310–316.

  25. Lu Y, Rosenfeld R, Bar-Joseph Z. Identifying cycling genes by combining sequence homology and expression data. Bioinformatics (2006) 22:e314–e322.

  26. Xu H, Sung W-K, Feng L. PEM: a general statistical approach for identifying differentially expressed genes in time-course cDNA microarray experiment without replicates. Proc. IEEE Computer Society Bioinformatics Conference (2006) 123–132.

  27. Liew AW-C, Xian J, Wu S, Smith D, Yan H. Spectral estimation in unevenly sampled space of periodically expressed microarray time series data. BMC Bioinformatics (2007) 8:137.

  28. Lu Y, Mahony S, Benos PV, Rosenfeld R, Simon I, Breeden LL, Bar-Joseph Z. Combined analysis reveals a core set of cycling genes. Genome Biol. (2007) 8:R146.

  29. Marguerat S, Jensen TS, de Lichtenberg U, Wilhelm BT, Jensen LJ, Bähler J. The more the merrier: comparative analysis of microarray studies on cell cycle-regulated genes in fission yeast. Yeast (2006) 23:261–277.

  30. Jensen LJ, Jensen TS, de Lichtenberg U, Brunak S, Bork P. Coevolution of transcriptional and post-translational cell-cycle regulation. Nature (2006) 443:594–597.

  31. Nash R, Weng S, Hitz B, Balakrishnan R, Christie KR, Costanzo MC, Dwight SS, Engel S, Fisk DG, et al. Saccharomyces Genome Database (SGD) provides tools to identify and analyze sequences from Saccharomyces cerevisiae and related sequences from other organisms. Nucleic Acids Res. (2007) 35:D468–D471.

  32. Hertz-Fowler C, Peacock CS, Wood V, Aslett M, Kerhornou A, Mooney P, Tivey A, Hall N, et al. GeneDB: a resource for prokaryotic and eukaryotic organisms. Nucleic Acids Res. (2004) 32:D339–D343.

  33. Hubbard TJP, Aken BL, Beal K, Ballester B, Caccamo M, Chen Y, Clarke L, Cunningham F, et al. Ensembl 2007. Nucleic Acids Res. (2007) 35:D610–D617.

  34. Rhee SY, Beavis W, Berardini TZ, Chen G, Dixon D, Doyle A, Garcia-Hernandez M, Lander G, et al. The Arabidopsis Information Resource (TAIR): a model organism database providing a centralized, curated gateway to Arabidopsis biology, research materials and community. Nucleic Acids Res. (2003) 31:224–228.

  35. The UniProt Consortium. The universal protein resource (UniProt). Nucleic Acids Res. (2007) 35:D193–D197.





