Nucleic Acids Research Advance Access originally published online on November 11, 2006
Nucleic Acids Research 2007 35(Database issue):D760-D765; doi:10.1093/nar/gkl887
Nucleic Acids Research, 2007, Vol. 35, Database issue D760-D765
Published by Oxford University Press 2006
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
NCBI GEO: mining tens of millions of expression profiles - database and tools update
National Center for Biotechnology Information, National Library of Medicine, National Institutes of Health, 45 Center Drive, Bethesda, MD 20892, USA
*To whom correspondence should be addressed. Tel: +1 301 402 4057; Fax: +1 301 480 0109; Email: barrett{at}ncbi.nlm.nih.gov
Received September 15, 2006. Accepted October 9, 2006.
| ABSTRACT |
|---|
The Gene Expression Omnibus (GEO) repository at the National Center for Biotechnology Information (NCBI) archives and freely disseminates microarray and other forms of high-throughput data generated by the scientific community. The database has a minimum information about a microarray experiment (MIAME)-compliant infrastructure that captures fully annotated raw and processed data. Several data deposit options and formats are supported, including web forms, spreadsheets, XML and Simple Omnibus Format in Text (SOFT). In addition to data storage, a collection of user-friendly web-based interfaces and applications is available to help users effectively explore, visualize and download the thousands of experiments and tens of millions of gene expression patterns stored in GEO. This paper provides a summary of the GEO database structure and user facilities, and describes recent enhancements to database design, performance, submission format options, data query and retrieval utilities. GEO is accessible at http://www.ncbi.nlm.nih.gov/geo/
| INTRODUCTION |
|---|
Microarray and other high-throughput technologies have led to an explosion in the rate of molecular abundance data generated in the last decade. For the last seven years the Gene Expression Omnibus (GEO) database has served as a central hub for these data, operating primarily as a public archive and distribution center, but also providing flexible mining tools that enable users to easily query, filter, inspect and download data in the context of their specific interests (1,2).
GEO is currently the largest fully public gene expression resource. Since its inception, the database has grown exponentially each year. As of September 2006, the database holds over 120 000 samples, representing over 3.2 billion individual measurements, spanning over 200 organisms, and addressing a wide variety of biological phenomena. These data have been deposited by >2000 laboratories from around the world. All data are freely available online and via bulk FTP download.
GEO supports minimum information about a microarray experiment (MIAME)-compliant data submissions. MIAME is a data content standard developed by the microarray gene expression data (MGED) society to outline what information should be provided when describing a microarray experiment (3). Making microarray data public in a MIAME-compliant manner has become a precondition for publication for many journals. Publishing original data and protocols facilitates independent evaluation of results and reanalysis, and is in keeping with the spirit of open-access (4). Consequently, most of the data in GEO have been submitted by the research community in fulfillment of journal requirements.
| DATABASE STRUCTURE AND DATA FLOW |
|---|
The GEO database architecture is designed for the efficient capture, storage and retrieval of large-scale functional genomic data. The diverse and complex nature of such data presents considerable challenges in data handling and querying. There are many different types of high-throughput methodologies, and researchers use a wide variety of hardware and software to generate and process data. Thus, data come in many different formats and comprise varying content. Furthermore, technologies and processing strategies continue to evolve rapidly. In light of these considerations, GEO was designed with a flexible structure that can accommodate diverse styles of data. This flexibility is largely attributable to the fact that tabular data are not fully granulated in the core database but instead are treated as plain text, tab-delimited tables that may contain any number of rows or columns. Although the primary database has no knowledge of, and applies no restrictions on, these tab-delimited tables, some columns are reserved for special meanings, and data from selected fields are extracted to secondary databases and used in downstream query and analysis applications. Accompanying supplementary and native file types are linked from each record and stored on an FTP server.
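The idea of storing tables as unconstrained tab-delimited text while extracting only a few reserved columns can be illustrated with a minimal Python sketch. The column names `ID_REF` and `VALUE`, the function names and the example rows are assumptions chosen for illustration; they are not taken from this paper and do not represent GEO's internal implementation.

```python
import csv
import io

# Hypothetical names of columns that carry a reserved meaning (an assumption).
RESERVED_COLUMNS = ("ID_REF", "VALUE")


def parse_table(text):
    """Parse tab-delimited text into a header and a list of row dictionaries.

    Every cell is kept as a plain string; no column is required or typed.
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    rows = [row for row in reader if row]  # skip empty lines
    header, body = rows[0], rows[1:]
    return header, [dict(zip(header, row)) for row in body]


def extract_reserved(header, records, reserved=RESERVED_COLUMNS):
    """Keep only the reserved columns that are actually present in the table."""
    present = [name for name in reserved if name in header]
    return [{name: record.get(name, "") for name in present} for record in records]


# Made-up example table with one extra, unreserved column.
example = "ID_REF\tVALUE\tDETECTION_CALL\n1007_s_at\t512.3\tP\n1053_at\t88.1\tA\n"
header, records = parse_table(example)
secondary = extract_reserved(header, records)
```

Here `records` retains every submitted column unchanged, while `secondary` holds only the fields that a downstream query application would index.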
Expression data can be rendered meaningless unless accompanied by the contextual biological and processing details under which they were generated. To address this, GEO has a MIAME-compliant infrastructure that supports fully annotated records. Biological and other descriptive metadata are stored in designated fields with proper relations or restrictions within database tables.
Submitter-supplied data
The overall structure of the core GEO database remains as described previously (1,2). Briefly, data submitted to GEO are stored in a relational MSSQL database partitioned into three entity types:
Platform
Includes a summary description of the array and a data table defining the array template. Each row in the table corresponds to a single feature, and includes sequence annotation and tracking information as provided by the submitter. The table may contain any number of columns allowing thorough annotation of the array.
Sample
Includes a description of the biological material and the experimental protocols to which it was subjected, and a data table containing hybridization measurements for each feature on the corresponding platform. The table may contain any number of columns in which to comprehensively present hybridization results. The metadata fields may hold very large volumes of text to allow elaborate descriptions of the biological source and protocols.
Series
Defines a set of related samples considered to be part of a study, and describes the overall study aim and design. Series may also incorporate summary tables pertaining to the experiment as a whole.
Each of these objects is essentially under the submitter's editorial control and is assigned a stable and unique accession number that may be used to cite and retrieve the records. The accession consists of a number and a letter prefix indicating whether the record is a GEO Platform (GPL), GEO Sample (GSM), or GEO Series (GSE).
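As a small illustration of the accession scheme, the following Python sketch maps an accession to its entity type using the three prefixes named above. The function name, the regular expression and the example accession numbers are hypothetical and are used only to show the prefix-plus-number structure.

```python
import re

# Prefixes and entity types as described in the text.
PREFIX_TO_ENTITY = {"GPL": "Platform", "GSM": "Sample", "GSE": "Series"}

# A letter prefix followed by a number; the exact pattern is an assumption.
ACCESSION_RE = re.compile(r"^(GPL|GSM|GSE)(\d+)$")


def classify_accession(accession):
    """Return (entity type, number) for a GPL, GSM or GSE accession."""
    match = ACCESSION_RE.match(accession.strip().upper())
    if match is None:
        raise ValueError(f"not a recognized accession: {accession!r}")
    prefix, number = match.groups()
    return PREFIX_TO_ENTITY[prefix], int(number)


# Made-up accession numbers for illustration.
kinds = [classify_accession(a)[0] for a in ("GPL96", "GSM1234", "GSE2553")]
```

Rejecting anything that does not match the pattern keeps malformed identifiers from being silently treated as one of the three record types.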
In addition to the user-submitted objects described above, GEO defines and creates a number of related data objects to facilitate data mining, visual rendering and transposition of submitted data into alternative structures. The principal object used for this purpose is the DataSet object.
GEO DataSets
Despite the variety of style and content of the data received, submissions have a common core set of elements:
- sequence identity tracking information of each feature on the array
- normalized hybridization measurements
- a description of the biological source used in each hybridization
DataSets provide two discrete renderings of the data (Figure 1):
- An experiment-centered representation that encapsulates the entire study. This information is presented as a DataSet record which comprises a synopsis of the experiment, a breakdown of the experimental variables, access to auxiliary objects, several data display and analysis tools, and download options.
- A gene-centered representation that presents quantitative gene expression measurements for one gene across a DataSet. This information is presented as a GEO Profile which comprises gene identity annotation, DataSet title, links to auxiliary information and a chart depicting the expression level and rank of that gene across each sample in the DataSet. Gene annotation is derived from querying sequence identifiers (e.g. GenBank accessions, clone IDs) with the latest Entrez Gene and UniGene databases, an important point given the dynamic nature of gene annotation.
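The gene-centered rendering can be sketched in Python as follows. This is an illustration only: it assumes that "rank" means the percentile position of the gene's value among all values measured in the same sample, which this paper does not define precisely, and the data structure, sample names and values are made up.

```python
def percentile_rank(values, target):
    """Percentage of values in one sample that are less than or equal to target."""
    at_or_below = sum(1 for v in values if v <= target)
    return 100.0 * at_or_below / len(values)


def gene_profile(dataset, gene_id):
    """Build (sample, value, rank) tuples for one gene across a DataSet.

    `dataset` maps a sample name to a {feature id: value} dictionary.
    """
    profile = []
    for sample, measurements in dataset.items():
        value = measurements[gene_id]
        rank = percentile_rank(list(measurements.values()), value)
        profile.append((sample, value, rank))
    return profile


# Made-up two-sample DataSet with four features.
dataset = {
    "sample_a": {"g1": 10.0, "g2": 20.0, "g3": 30.0, "g4": 40.0},
    "sample_b": {"g1": 40.0, "g2": 5.0, "g3": 6.0, "g4": 7.0},
}
profile = gene_profile(dataset, "g1")
```

Plotting the value and the rank side by side shows whether a change in raw signal also corresponds to a change in the gene's relative position within each sample.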
| SUBMISSION PROCEDURES, FORMATS AND STANDARDS |
|---|
We endeavor to make data deposit procedures as straightforward as possible. Submitters have several options for data submission; selecting which method to use depends on the amount and type of data to be submitted, and what format the data are already in. Regardless of the deposit method chosen, the final GEO records will look similar and contain equivalent information. Each format captures all components of the MIAME checklist, as well as any additional information that the submitter wants to provide.
Upload options and formats
Web deposit
The web submission process is designed for quick and easy deposit of individual records by occasional submitters, or for smaller experiments. This route consists of a set of interactive web forms that provide a simple step-by-step procedure for deposit of data tables and accompanying descriptive information.
SOFT format
Simple Omnibus Format in Text (SOFT) is a simple, line-based, tab-delimited format designed for rapid batch deposit. A single SOFT file can hold both data tables and accompanying descriptive information for multiple platforms, samples and series records. The simplicity of SOFT allows it to be readily generated from commonly-used database and spreadsheet applications. Conveniently, two versions of SOFT are available:
SOFTtext
SOFT-formatted data are organized as concatenated records.
SOFTmatrix
SOFT-formatted data are organized side-by-side as a matrix table, usually in an Excel spreadsheet.
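To give a sense of how concatenated SOFTtext records might be read, here is a minimal Python sketch. The line-prefix conventions it relies on (`^` opens a record, `!` marks an attribute, `#` describes a data column, and unprefixed lines are tab-delimited table rows) follow the GEO website documentation as best understood and are not described in this paper, so they should be treated as assumptions. The record names and values are invented.

```python
def parse_soft_text(text):
    """Split SOFT-style text into a list of record dictionaries."""
    records = []
    current = None
    for raw in text.splitlines():
        line = raw.rstrip()
        if not line:
            continue
        if line.startswith("^"):
            # A new record begins, e.g. "^SAMPLE = name".
            entity, _, name = line[1:].partition("=")
            current = {
                "entity": entity.strip(),
                "name": name.strip(),
                "attributes": {},
                "columns": {},
                "table": [],
            }
            records.append(current)
        elif current is None:
            continue  # ignore anything before the first record
        elif line.startswith("!"):
            key, sep, value = line[1:].partition("=")
            if sep:  # lines without "=" (table begin/end markers) are skipped
                current["attributes"].setdefault(key.strip(), []).append(value.strip())
        elif line.startswith("#"):
            key, _, value = line[1:].partition("=")
            current["columns"][key.strip()] = value.strip()
        else:
            current["table"].append(line.split("\t"))
    return records


example = "\n".join([
    "^SAMPLE = example_sample_1",
    "!Sample_title = Control replicate 1",
    "#ID_REF = feature identifier",
    "#VALUE = normalized signal",
    "!sample_table_begin",
    "ID_REF\tVALUE",
    "f1\t1.5",
    "!sample_table_end",
    "^SAMPLE = example_sample_2",
    "!Sample_title = Treated replicate 1",
])
soft_records = parse_soft_text(example)
```

Attributes are collected into lists because the same attribute label may reasonably appear more than once within a record.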
MINiML format
MIAME Notation in Markup Language (MINiML, pronounced "minimal") is a recent addition to GEO's upload/download options. MINiML is effectively an XML rendering of SOFT format, and is similarly designed for rapid batch submission and upload of data. The MINiML XML schema definition and a detailed description are available at the GEO website.
MAGE-ML format
MicroArray Gene Expression Markup Language (MAGE-ML) is an XML format devised by the MGED consortium (5) and a direct derivation from the corresponding MAGE object model. GEO is not based on the MAGE object model and cannot receive these files directly. Nonetheless, parsers have been written to extract data from some of the various flavors of MAGE-ML and reformat according to GEO schema. It is worth noting here that having data formatted as MAGE-ML does not in any way imply MIAME-compliance. MIAME is a data content standard, not a format standard. MIAME-compliant data may be submitted in many formats.
Detailed documentation and examples of submission options and formats are available on the GEO website. However, if submitters have questions or require assistance with submission procedures they are encouraged to contact GEO curation staff at geo{at}ncbi.nlm.nih.gov for prompt support.
Submitters may keep their records private until a manuscript describing the data is published. Submitters may generate read-only passwords that give reviewers and collaborators confidential access to their private data.
Most researchers submit to GEO to support data discussed in a journal manuscript, so it is important to present the data as they were processed in the manuscript. However, over the past two years, greater emphasis has been placed on provision of raw, unmanipulated native data files to accompany the processed data within GEO records. Such files include, for example, Affymetrix CEL or GenePix GPR scan files. Recent modifications to submission procedures now make it more convenient for submitters to supply these raw files: the web deposit route specifically requests supplementary files, and the batch deposit routes allow raw data files to be zipped or tarred together with bulk submissions. Provision of raw data not only enables other researchers to faithfully reproduce the data selection, transformation and analysis steps that are the basis of a publication, but also maximizes the long-term value of submissions, enabling recycling of the data into repeated rounds of analysis.
All submitted data undergo syntactic validation and are inspected by curators for content integrity. When content or format problems are identified, curators work with the submitter until the issue is resolved. However, given the huge diversity of biological themes, technology types, processing techniques and statistical transformations applied to microarray data, it is impractical for curators to decisively determine the accuracy or validity of submitted data, or to score their degree of MIAME compliance. Thus, researchers are ultimately responsible for the completeness, quality and accuracy of their submissions. This validation process can benefit from feedback by journal editorial reviewers or enforcement by funding agencies. Through their GEO accounts, researchers retain full editorial control of their records and can update or edit their records at any time.
In addition to satisfying possible journal requirements for publication, there are other significant benefits to depositing data with GEO. Data receive long-term archiving at a centralized repository and integration with other NCBI resources, which affords greatly increased usability and visibility, as well as possible links back to submitters' own project websites.
| TOOLS TO RETRIEVE, EXPLORE AND VISUALIZE DATA |
|---|
To maximize the utility and value of the massive volumes of data in GEO, a selection of intuitive tools and features has been developed to assist researchers to quickly locate, analyze and visualize data relevant to their interests. These features incorporate traditional data reduction techniques and concise displays designed for human scanning, helping the user identify and categorize gene and sample relationships. Figure 2 depicts a schematic overview of the query workflow and how the various features and tools are interlinked. A summary of where the main features are located and their purpose is provided in Table 1. Query approaches include standard text-based searches, sequence-based searches, mining based on expression behavior characteristics or combinations of these factors.
These tools do not require specialized knowledge of microarray analysis methods, nor do they require time-consuming download or processing of large data sets. However, it should be stated that the analysis features are not primarily intended for robust systematic data mining. The diverse nature of the data in GEO restricts to some extent the statistical tools that can be developed. All data are treated similarly; criteria such as scaling factors, filter parameters, and number of repeats are not considered. Despite these issues, these tools are extremely useful for quick and easy identification of relevant and noteworthy data.
NCBI's Entrez search system serves as the basis for most queries. Entrez GEO DataSets contains experiment-centered data and Entrez GEO Profiles contains gene-centered data. Most biologists are familiar with Entrez, using it routinely to search other NCBI databases like PubMed and GenBank (6,7). It has a straightforward interface where users can locate relevant material by simply typing in keywords or Boolean phrases restricted to supported attribute fields. Examples of typical queries and query fields are provided at http://www.ncbi.nlm.nih.gov/projects/geo/info/qqtutorial.html.
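A field-restricted Boolean query of this kind can also be expressed as a URL for scripted use. The Python sketch below only builds the URL and does not contact any server. It assumes the public NCBI E-utilities `esearch` endpoint, the Entrez database name `gds` for GEO DataSets and the `[Organism]` field tag; none of these specifics appear in this paper, and the helper function names are hypothetical.

```python
from urllib.parse import urlencode

# Assumed E-utilities search endpoint (not stated in this paper).
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(terms):
    """Join (text, field) pairs with AND; a field of None leaves the term unrestricted."""
    parts = []
    for text, field in terms:
        parts.append(f"{text}[{field}]" if field else text)
    return " AND ".join(parts)


def esearch_url(db, term, retmax=20):
    """Return a URL-encoded search URL for the given Entrez database and term."""
    params = {"db": db, "term": term, "retmax": retmax}
    return EUTILS_ESEARCH + "?" + urlencode(params)


# Made-up example query: a keyword restricted by organism.
query = build_query([("smoking", None), ("Homo sapiens", "Organism")])
url = esearch_url("gds", query)
```

Keeping query construction separate from URL encoding makes it easy to reuse the same Boolean phrase in the web interface and in scripts.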
Full use is made of Entrez's powerful linking capabilities. Intra-database links connect genes related by expression pattern or sequence. Where possible, reciprocal inter-database links connect GEO data with related data in other NCBI resources such as PubMed, GenBank, Gene, UniGene, MapViewer, OMIM and others. Advanced Entrez features allow generation of complex multipart queries or combination of multiple queries that find common intersections in retrievals. GEO's Entrez query facilities were recently further enhanced by implementation of a spell-check function, as well as automatic term mapping using MeSH translation tables.
Graphics are an important tool to aid visualization and interpretation of high-dimensional expression data. The expression pattern of each gene within a DataSet is represented as a profile chart (Figure 1D). A breakdown of the experimental design is provided along the bottom of the chart, helping the user to quickly assess whether expression levels are shifting with experimental variables. Thumbnail chart images provided on batch profile retrievals are useful for rapid batch profile scanning and comparison. Value distribution charts are provided on DataSets records, providing at-a-glance indication of how well normalized the data are within a DataSet. Precomputed interactive hierarchical cluster heat map images are available on each DataSet record, providing suggestions for groups of coordinately regulated genes within entire DataSets.
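The idea behind a value distribution chart can be approximated numerically with a per-sample five-number summary. The Python sketch below is an illustration under assumptions: the choice of quartiles as the summary, the function names and the example values are not taken from this paper, and the sketch is not the method GEO uses to draw its charts.

```python
import statistics


def five_number_summary(values):
    """Return (min, first quartile, median, third quartile, max) for one sample."""
    ordered = sorted(values)
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]


def summarize_dataset(dataset):
    """Map each sample name to its five-number summary."""
    return {sample: five_number_summary(vals) for sample, vals in dataset.items()}


def median_spread(summaries):
    """Difference between the largest and smallest sample medians."""
    medians = [summary[2] for summary in summaries.values()]
    return max(medians) - min(medians)


# Made-up values for two samples.
dataset_values = {"sample_a": [1, 2, 3, 4, 5], "sample_b": [2, 3, 4, 5, 6]}
summaries = summarize_dataset(dataset_values)
spread = median_spread(summaries)
```

If the samples in a DataSet are well normalized, their summaries should be broadly similar, so a large spread between medians is a quick hint that the distributions are not aligned.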
Within the last year, the back-end structure of the Profiles, DataSets and annotation databases was completely redesigned. These changes allow more flexibility in the front-end user interfaces and will permit development of more advanced query, analysis and download tools, including enhanced user-scripting options through the Entrez utilities. These changes also help to streamline internal indexing procedures, enabling more frequent release of new DataSets and profiles.
For users who prefer to use their own analysis software or want to perform more robust analyses, all GEO data are available for bulk download via anonymous FTP at ftp://ftp.ncbi.nih.gov/pub/geo/DATA/. Files include SOFT- and MINiML-formatted Platform and Series families, SOFT-formatted DataSets and original supplementary data types. Various software packages have been developed by the community to handle GEO data formats, including the GEOquery R/BioConductor package, http://bioconductor.org/packages/release/bioc/html/GEOquery.html.
| CONCLUSIONS |
|---|
GEO currently represents the largest single resource for public gene expression data. Beyond archiving data and making them freely available for peer review and download, the GEO repository also provides an extensive complement of utilities and strategies that enable effective data mining on either a small or large scale.
The data in GEO gain value as they accumulate. Pooling masses of expression data into common formats at a single location affords researchers the opportunity to distill disparate data sets and identify common gene expression trends, dissect regulatory networks and predict functions of uncharacterized genes. Increasingly, GEO data are used and cited by third parties as evidence to support and complement their own studies; selected examples include (8-15).
Having GEO data cross-annotated with extensive sequence, mapping and bibliographic resources via the NCBI Entrez system of interlinked databases imparts further value and context to the data. This diverse integrated data environment leverages multiple types of information and enables traditional disciplinary boundaries to be crossed, ultimately accelerating systems-level hypothesis formation and scientific discovery.
Future plans for GEO include continued development of data retrieval and mining features, and enhancing novice user experience. We also plan to improve rendering and representation of the non-gene-expression data types that GEO accepts, which include chromatin-immunoprecipitation on arrays (ChIP-chip) studies, array comparative genomic hybridization (aCGH), SNP arrays and some proteomic data.
| ACKNOWLEDGEMENTS |
|---|
This research, and funding to pay the Open Access publication charges for this article, were supported by the Intramural Research Program of the National Institutes of Health, National Library of Medicine.
Conflict of interest statement. None declared.
| REFERENCES |
|---|
1. Barrett, T., Suzek, T.O., Troup, D.B., Wilhite, S.E., Ngau, W.C., Ledoux, P., Rudnev, D., Lash, A.E., Fujibuchi, W., Edgar, R. (2005) NCBI GEO: mining millions of expression profiles - database and tools. Nucleic Acids Res., 33, D562-D566.
2. Edgar, R., Domrachev, M., Lash, A.E. (2002) Gene expression omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res., 30, 207-210.
3. Brazma, A., Hingamp, P., Quackenbush, J., Sherlock, G., Spellman, P., Stoeckert, C., Aach, J., Ansorge, W., Ball, C.A., Causton, H.C., et al. (2001) Minimum information about a microarray experiment (MIAME)-toward standards for microarray data. Nature Genet., 29, 365-371.
4. Ball, C., Brazma, A., Causton, H., Chervitz, S., Edgar, R., Hingamp, P., Matese, J.C., Parkinson, H., Quackenbush, J., Ringwald, M., et al. (2004) Standards for microarray data: an open letter. Environ. Health Perspect., 112, A666-A667.
5. Spellman, P.T., Miller, M., Stewart, J., Troup, C., Sarkans, U., Chervitz, S., Bernhart, D., Sherlock, G., Ball, C., Lepage, M., et al. (2002) Design and implementation of microarray gene expression markup language (MAGE-ML). Genome Biol., 3, RESEARCH0046.
6. Wheeler, D.L., Barrett, T., Benson, D.A., Bryant, S.H., Canese, K., Chetvernin, V., Church, D.M., DiCuccio, M., Edgar, R., Federhen, S., et al. (2006) Database resources of the National Center for Biotechnology Information. Nucleic Acids Res., 34, D173-D180.
7. Schuler, G.D., Epstein, J.A., Ohkawa, H., Kans, J.A. (1996) Entrez: molecular biology database and retrieval system. Methods Enzymol., 266, 141-162.
8. Yuan, Z., Tie, A., Tarnopolsky, M., Bakovic, M. (2006) Genomic organization, promoter activity, and expression of the human choline transporter-like protein 1. Physiol. Genomics, 26, 76-90.
9. Byrnes, J.K., Morris, G.P., Li, W.H. (2006) Reorganization of adjacent gene relationships in yeast genomes by whole-genome duplication and gene deletion. Mol. Biol. Evol., 23, 1136-1143.
10. Siddiqui, A.S., Delaney, A.D., Schnerch, A., Griffith, O.L., Jones, S.J., Marra, M.A. (2006) Sequence biases in large scale gene expression profiling data. Nucleic Acids Res., 34, e83.
11. Norris, A.W. and Kahn, C.R. (2006) Analysis of gene expression in pathophysiological states: balancing false discovery and false negative rates. Proc. Natl Acad. Sci. USA, 103, 649-653.
12. Graham, R.R., Kozyrev, S.V., Baechler, E.C., Reddy, M.V., Plenge, R.M., Bauer, J.W., Ortmann, W.A., Koeuth, T., Gonzalez Escribano, M.F., Pons-Estel, B., et al. (2006) A common haplotype of interferon regulatory factor 5 (IRF5) regulates splicing and expression and is associated with increased risk of systemic lupus erythematosus. Nature Genet., 38, 550-555.
13. Calvo, S., Jain, M., Xie, X., Sheth, S.A., Chang, B., Goldberger, O.A., Spinazzola, A., Zeviani, M., Carr, S.A., Mootha, V.K. (2006) Systematic identification of human mitochondrial disease genes through integrative genomics. Nature Genet., 38, 576-582.
14. Zhou, X.J., Kao, M.C., Huang, H., Wong, A., Nunez-Iglesias, J., Primig, M., Aparicio, O.M., Finch, C.E., Morgan, T.E., Wong, W.H. (2005) Functional annotation and network reconstruction through cross-platform integration of microarray data. Nat. Biotechnol., 23, 238-243.
15. Griffith, O.L., Pleasance, E.D., Fulton, D.L., Oveisi, M., Ester, M., Siddiqui, A.S., Jones, S.J. (2005) Assessment and integration of publicly available SAGE, cDNA microarray, and oligonucleotide microarray expression data for global coexpression analyses. Genomics, 86, 476-488.
16. Gonzalez, R., Yang, Y.H., Griffin, C., Allen, L., Tigue, Z., Dobbs, L. (2005) Freshly isolated rat alveolar type I cells, type II cells, and cultured type II cells have distinct molecular phenotypes. Am. J. Physiol. Lung Cell Mol. Physiol., 288, L179-L189.
||||
![]() |
L. Bulow, S. Engelmann, M. Schindler, and R. Hehl AthaMap, integrating transcriptional and post-transcriptional data Nucleic Acids Res., January 1, 2009; 37(suppl_1): D983 - D986. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Barrett, D. B. Troup, S. E. Wilhite, P. Ledoux, D. Rudnev, C. Evangelista, I. F. Kim, A. Soboleva, M. Tomashevsky, K. A. Marshall, et al. NCBI GEO: archive for high-throughput functional genomic data Nucleic Acids Res., January 1, 2009; 37(suppl_1): D885 - D890. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Obayashi, S. Hayashi, M. Saeki, H. Ohta, and K. Kinoshita ATTED-II provides coexpressed gene networks for Arabidopsis Nucleic Acids Res., January 1, 2009; 37(suppl_1): D987 - D991. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Parkinson, M. Kapushesky, N. Kolesnikov, G. Rustici, M. Shojatalab, N. Abeygunawardena, H. Berube, M. Dylag, I. Emam, A. Farne, et al. ArrayExpress update--from an archive of functional genomics experiments to the atlas of gene expression Nucleic Acids Res., January 1, 2009; 37(suppl_1): D868 - D872. [Abstract] [Full Text] [PDF] |
||||
![]() |
H.-Y. Huang, H.-Y. Chang, C.-H. Chou, C.-P. Tseng, S.-Y. Ho, C.-D. Yang, Y.-W. Ju, and H.-D. Huang sRNAMap: genomic maps for small non-coding RNAs, their regulators and their targets in microbial genomes Nucleic Acids Res., January 1, 2009; 37(suppl_1): D150 - D154. [Abstract] [Full Text] [PDF] |
||||
![]() |
C. T. Fulp, G. Cho, E. D. Marsh, I. M. Nasrallah, P. A. Labosky, and J. A. Golden Identification of Arx transcriptional targets in the developing basal forebrain Hum. Mol. Genet., December 1, 2008; 17(23): 3740 - 3760. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Zhu, S. Davis, R. Stephens, P. S. Meltzer, and Y. Chen GEOmetadb: powerful alternative search engine for the Gene Expression Omnibus Bioinformatics, December 1, 2008; 24(23): 2798 - 2800. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. Lemuth, T. Hardiman, S. Winter, D. Pfeiffer, M. A. Keller, S. Lange, M. Reuss, R. D. Schmid, and M. Siemann-Herzberg Global Transcription and Metabolic Flux Analysis of Escherichia coli in Glucose-Limited Fed-Batch Cultivations Appl. Envir. Microbiol., November 15, 2008; 74(22): 7002 - 7015. [Abstract] [Full Text] [PDF] |
||||
![]() |
U. D. Akavia and D. Benayahu Meta-analysis and profiling of cardiac expression modules Physiol Genomics, November 12, 2008; 35(3): 305 - 315. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. J. Butte Translational Bioinformatics: Coming of Age J. Am. Med. Inform. Assoc., November 1, 2008; 15(6): 709 - 714. [Abstract] [Full Text] [PDF] |
||||
![]() |
E. J. Cosgrove, Y. Zhou, T. S. Gardner, and E. D. Kolaczyk Predicting gene targets of perturbations via network-based filtering of mRNA expression compendia Bioinformatics, November 1, 2008; 24(21): 2482 - 2490. [Abstract] [Full Text] [PDF] |
||||
![]() |
E. W. Deutsch, H. Lam, and R. Aebersold Data analysis and bioinformatics tools for tandem mass spectrometry in proteomics Physiol Genomics, October 8, 2008; 33(1): 18 - 25. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. W. W. Brouwer, O. P. Kuipers, and S. A. F. T. van Hijum The relative value of operon predictions Brief Bioinform, September 1, 2008; 9(5): 367 - 375. [Abstract] [Full Text] [PDF] |
||||
![]() |
S. Tanabe, Y. Sato, T. Suzuki, K. Suzuki, T. Nagao, and T. Yamaguchi Gene Expression Profiling of Human Mesenchymal Stem Cells for Identification of Novel Markers in Early- and Late-Stage Cell Culture J. Biochem., September 1, 2008; 144(3): 399 - 408. [Abstract] [Full Text] [PDF] |
||||
![]() |
H-L Wong, W-P Koh, N M Probst-Hensch, D Van den Berg, M C Yu, and S A Ingles Insulin-like growth factor-1 promoter polymorphisms and colorectal cancer: a functional genomics approach Gut, August 1, 2008; 57(8): 1090 - 1096. [Abstract] [Full Text] [PDF] |
||||
![]() |
K. Pawlowski Uncharacterized/hypothetical proteins in biomedical 'omics' experiments: is novelty being swept under the carpet? Brief Funct Genomic Proteomic, July 19, 2008; (2008) eln033v1. [Abstract] [Full Text] [PDF] |
||||
![]() |
W. Chen, V. Kalscheuer, A. Tzschach, C. Menzel, R. Ullmann, M. H. Schulz, F. Erdogan, N. Li, Z. Kijas, G. Arkesteijn, et al. Mapping translocation breakpoints by next-generation sequencing Genome Res., July 1, 2008; 18(7): 1143 - 1149. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Reimand, L. Tooming, H. Peterson, P. Adler, and J. Vilo GraphWeb: mining heterogeneous biological networks for gene modules with functional significance Nucleic Acids Res., July 1, 2008; 36(suppl_2): W452 - W459. [Abstract] [Full Text] [PDF] |
||||
![]() |
A. E. Ivliev, P. A. C. t Hoen, M. P. Villerius, J. T. den Dunnen, and B. W. Brandt Microarray retriever: a web-based tool for searching and large scale retrieval of public microarray data Nucleic Acids Res., July 1, 2008; 36(suppl_2): W327 - W331. [Abstract] [Full Text] [PDF] |
||||
![]() |
V. Gotea and I. Ovcharenko DiRE: identifying distant regulatory elements of co-expressed genes Nucleic Acids Res., July 1, 2008; 36(suppl_2): W133 - W139. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. Xu, D. Geerts, K. Qian, H. Zhang, and G. Zhu Myeloid ecotropic viral integration site 1 (MEIS) 1 involvement in embryonic implantation Hum. Reprod., June 1, 2008; 23(6): 1394 - 1406. [Abstract] [Full Text] [PDF] |
||||
![]() |
M.-P. Gustin, C. Z. Paultre, J. Randon, G. Bricca, and C. Cerutti Functional meta-analysis of double connectivity in gene coexpression networks in mammals Physiol Genomics, June 1, 2008; 34(1): 34 - 41. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. J. S. Baerends, E. de Hulster, J.-M. A. Geertman, J.-M. Daran, A. J. A. van Maris, M. Veenhuis, I. J. van der Klei, and J. T. Pronk Engineering and Analysis of a Saccharomyces cerevisiae Strain That Uses Formaldehyde as an Auxiliary Substrate Appl. Envir. Microbiol., May 15, 2008; 74(10): 3182 - 3188. [Abstract] [Full Text] [PDF] |
||||
![]() |
R. Higdon, G. van Belle, and E. Kolker A note on the false discovery rate and inconsistent comparisons between experiments Bioinformatics, May 15, 2008; 24(10): 1225 - 1228. [Abstract] [Full Text] [PDF] |
||||
![]() |
Z. Bieniawska, C. Espinoza, A. Schlereth, R. Sulpice, D. K. Hincha, and M. A. Hannah Disruption of the Arabidopsis Circadian Clock Is Responsible for Extensive Variation in the Cold-Responsive Transcriptome Plant Physiology, May 1, 2008; 147(1): 263 - 279. [Abstract] [Full Text] [PDF] |
||||
![]() |
L. A. Hazelwood, J.-M. Daran, A. J. A. van Maris, J. T. Pronk, and J. R. Dickinson The Ehrlich Pathway for Fusel Alcohol Production: a Century of Research on Saccharomyces cerevisiae Metabolism Appl. Envir. Microbiol., April 15, 2008; 74(8): 2259 - 2266. [Full Text] [PDF] |
||||
![]() |
M. Cossegal, P. Chambrier, S. Mbelo, S. Balzergue, M.-L. Martin-Magniette, A. Moing, C. Deborde, V. Guyon, P. Perez, and P. Rogowsky Transcriptional and Metabolic Adjustments in ADP-Glucose Pyrophosphorylase-Deficient bt2 Maize Kernels Plant Physiology, April 1, 2008; 146(4): 1553 - 1570. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. Cao, C. Chiarelli, O. Richman, K. Zarrabi, P. Kozarekar, and S. Zucker Membrane Type 1 Matrix Metalloproteinase Induces Epithelial-to-Mesenchymal Transition in Prostate Cancer J. Biol. Chem., March 7, 2008; 283(10): 6232 - 6240. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Ouillette, H. Erba, L. Kujawski, M. Kaminski, K. Shedden, and S. N. Malek Integrated Genomic Profiling of Chronic Lymphocytic Leukemia Identifies Subtypes of Deletion 13q14 Cancer Res., February 15, 2008; 68(4): 1012 - 1021. [Abstract] [Full Text] [PDF] |
||||
![]() |
P. Adler, J. Reimand, J. Janes, R. Kolde, H. Peterson, and J. Vilo KEGGanim: pathway animations for high-throughput data Bioinformatics, February 15, 2008; 24(4): 588 - 590. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. J. Cuthbertson, Y. Liao, L. Birnbaumer, and P. J. Blackshear Characterization of zfs1 as an mRNA-binding and -destabilizing Protein in Schizosaccharomyces pombe J. Biol. Chem., February 1, 2008; 283(5): 2586 - 2594. [Abstract] [Full Text] [PDF] |
||||
![]() |
X. Wang and I. M. El Naqa Prediction of both conserved and nonconserved microRNA targets in animals Bioinformatics, February 1, 2008; 24(3): 325 - 332. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. M. Hancock and A.-M. Mallon Phenobabelomics mouse phenotype data resources Brief Funct Genomic Proteomic, January 11, 2008; (2008) elm033v1. [Abstract] [Full Text] [PDF] |
||||
![]() |
T. Obayashi, S. Hayashi, M. Shibaoka, M. Saeki, H. Ohta, and K. Kinoshita COXPRESdb: a database of coexpressed gene networks in mammals Nucleic Acids Res., January 11, 2008; 36(suppl_1): D77 - D82. [Abstract] [Full Text] [PDF] |
||||
![]() |
I. Scheinin, S. Myllykangas, I. Borze, T. Bohling, S. Knuutila, and J. Saharinen CanGEM: mining gene copy number changes in cancer Nucleic Acids Res., January 11, 2008; 36(suppl_1): D830 - D835. [Abstract] [Full Text] [PDF] |
||||
![]() |
J. J. Faith, M. E. Driscoll, V. A. Fusaro, E. J. Cosgrove, B. Hayete, F. S. Juhn, S. J. Schneider, and T. S. Gardner Many Microbe Microarrays Database: uniformly normalized Affymetrix compendia with structured experimental metadata Nucleic Acids Res., January 11, 2008; 36(suppl_1): D866 - D870. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Waters, S. Stasiewicz, B. Alex Merrick, K. Tomer, P. Bushel, R. Paules, N. Stegman, G. Nehls, K. J. Yost, C. H. Johnson, et al. CEBS Chemical Effects in Biological Systems: a public data repository integrating study design and toxicity data with microarray and proteomics data Nucleic Acids Res., January 11, 2008; 36(suppl_1): D892 - D900. [Abstract] [Full Text] [PDF] |
||||
![]() |
B. B. Stone, E. L. Stowe-Evans, R. M. Harper, R. B. Celaya, K. Ljung, G. Sandberg, and E. Liscum Disruptions in AUX1-Dependent Auxin Influx Alter Hypocotyl Phototropism in Arabidopsis Mol Plant, January 1, 2008; 1(1): 129 - 144. [Abstract] [Full Text] [PDF] |
||||
![]() |
W. Fujibuchi, L. Kiseleva, T. Taniguchi, H. Harada, and P. Horton CellMontage: similar expression profile search server Bioinformatics, November 15, 2007; 23(22): 3103 - 3104. [Abstract] [Full Text] [PDF] |
||||
![]() |
D. Geerts, C. J. Wallick, D.-L. T. Koomoa, J. Koster, R. Versteeg, R. C. V. Go, and A. S. Bachmann Expression of Prenylated Rab Acceptor 1 Domain Family, Member 2 (PRAF2) in Neuroblastoma: Correlation with Clinical Features, Cellular Localization, and Cerulenin-Mediated Apoptosis Regulation Clin. Cancer Res., November 1, 2007; 13(21): 6312 - 6319. [Abstract] [Full Text] [PDF] |
||||