Nucleic Acids Research Advance Access originally published online on March 27, 2007
Nucleic Acids Research 2007 35(7):2343-2355; doi:10.1093/nar/gkm119
Nucleic Acids Research, 2007, Vol. 35, No. 7 2343-2355
© 2007 The Author(s)
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.
Computational Biology
Graph-based identification of cancer signaling pathways from published gene expression signatures using PubLiME
¹The FIRC Institute of Molecular Oncology Foundation, Via Adamello 16, 20139 Milan, Italy and ²Department of Experimental Oncology, European Institute of Oncology, Via Ripamonti 435, 20141 Milan, Italy
*To whom correspondence should be addressed. Tel: +39 02 574303263; Fax: +39 02 574303244; Email: heiko.muller{at}ifom-ieo-campus.it
Received December 6, 2006. Revised January 22, 2007. Accepted February 12, 2007.
Gene expression technology has become routine in many laboratories and has produced a large number of gene expression signatures across a variety of cancer types. Interpreting these signatures would benefit from a procedure that assigns differentially regulated genes, or entire gene signatures, to defined cancer signaling pathways. Here we describe a graph-based approach that identifies cancer signaling pathways from published gene expression signatures. Published gene expression signatures are collected in a database (PubLiME: Published Lists of Microarray Experiments) enabled for cross-platform gene annotation. Significant co-occurrence modules composed of up to 10 genes in different gene expression signatures are identified, and significantly co-occurring genes are linked by an edge in an undirected graph. Edge-betweenness and k-clique clustering, combined with graph modularity as a quality measure, are used to identify communities in the resulting graph. The identified communities consist of cell cycle, apoptosis, phosphorylation cascade, extracellular matrix, interferon and immune response regulators, as well as communities of unknown function. The genes constituting different communities are characterized by common genomic features and by strongly enriched cis-regulatory modules in their upstream regulatory regions that are consistent with the pathway assignment of those genes.
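The pipeline outlined in the abstract (co-occurrence test, undirected graph, edge-betweenness clustering scored by modularity) can be illustrated with a minimal Python sketch. It is not the PubLiME implementation and rests on several assumptions: only gene pairs are tested rather than modules of up to 10 genes, significance is taken from a hypergeometric upper tail with a hypothetical cutoff `alpha`, k-clique clustering is omitted, and the signatures and gene names are made-up placeholders. All function names are illustrative.

```python
from collections import deque
from itertools import combinations
from math import comb

def cooccurrence_pvalue(n_a, n_b, n_both, n_total):
    """Hypergeometric upper tail P(overlap >= n_both) for two genes (assumed test)."""
    hits = sum(comb(n_a, k) * comb(n_total - n_a, n_b - k)
               for k in range(n_both, min(n_a, n_b) + 1))
    return hits / comb(n_total, n_b)

def build_graph(signatures, alpha=0.05):
    """Link two genes when they share signatures more often than expected by chance."""
    genes = sorted(set().union(*signatures))
    member = {g: {i for i, s in enumerate(signatures) if g in s} for g in genes}
    graph = {g: set() for g in genes}
    for a, b in combinations(genes, 2):
        both = len(member[a] & member[b])
        p = cooccurrence_pvalue(len(member[a]), len(member[b]), both, len(signatures))
        if both >= 2 and p < alpha:
            graph[a].add(b)
            graph[b].add(a)
    return graph

def components(graph):
    """Connected components found by breadth-first search."""
    seen, comps = set(), []
    for start in graph:
        if start in seen:
            continue
        comp, queue = {start}, deque([start])
        while queue:
            u = queue.popleft()
            for v in graph[u]:
                if v not in comp:
                    comp.add(v)
                    queue.append(v)
        seen |= comp
        comps.append(comp)
    return comps

def modularity(graph, communities):
    """Newman-Girvan modularity Q of a partition of an unweighted graph."""
    m = sum(len(nb) for nb in graph.values()) / 2
    if m == 0:
        return 0.0
    q = 0.0
    for c in communities:
        inside = sum(1 for u in c for v in graph[u] if v in c) / 2
        degree = sum(len(graph[u]) for u in c)
        q += inside / m - (degree / (2 * m)) ** 2
    return q

def edge_betweenness(graph):
    """Shortest-path edge betweenness (Brandes-style accumulation, unnormalized)."""
    score = {}
    for s in graph:
        dist, sigma, preds, order = {s: 0}, {s: 1.0}, {s: []}, []
        queue = deque([s])
        while queue:
            u = queue.popleft()
            order.append(u)
            for v in graph[u]:
                if v not in dist:
                    dist[v], sigma[v], preds[v] = dist[u] + 1, 0.0, []
                    queue.append(v)
                if dist[v] == dist[u] + 1:
                    sigma[v] += sigma[u]
                    preds[v].append(u)
        delta = {v: 0.0 for v in order}
        for w in reversed(order):
            for u in preds[w]:
                c = sigma[u] / sigma[w] * (1 + delta[w])
                edge = tuple(sorted((u, w)))
                score[edge] = score.get(edge, 0.0) + c
                delta[u] += c
    return score

def girvan_newman(graph):
    """Remove the highest-betweenness edge repeatedly; keep the best-Q partition."""
    work = {u: set(nb) for u, nb in graph.items()}
    best = components(work)
    best_q = modularity(graph, best)
    while any(work.values()):
        score = edge_betweenness(work)
        u, v = max(sorted(score), key=score.get)  # sorted() makes ties deterministic
        work[u].discard(v)
        work[v].discard(u)
        parts = components(work)
        q = modularity(graph, parts)  # always scored against the original graph
        if q > best_q:
            best, best_q = parts, q
    return best, best_q

# Toy input (made up): 8 signatures over placeholder gene names.
signatures = [{"A1", "A2", "A3"}] * 4 + [{"B1", "B2", "B3"}] * 4
graph = build_graph(signatures)
communities, q = girvan_newman(graph)
print(sorted(sorted(c) for c in communities), round(q, 3))
```

In this toy input the two placeholder gene groups never share a signature, so they end up in separate communities. Scoring every intermediate partition against the original graph is what lets modularity act as the stopping criterion for edge removal.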
The authors wish it to be known that, in their opinion, the first two authors should be regarded as joint First Authors.