Nucleic Acids Research Advance Access published online on October 23, 2009
Nucleic Acids Research, doi:10.1093/nar/gkp853
Methods Online
MMBGX: a method for estimating expression at the isoform level and detecting differential splicing using whole-transcript Affymetrix arrays
(1) Department of Epidemiology and Public Health, Imperial College London, Norfolk Place, London W2 1PG and (2) Centre for Integrative Systems Biology, Imperial College London, South Kensington, London SW7 2AZ, UK
*To whom correspondence should be addressed. Tel: +44 (0) 2075941942; Email: ernest.turro{at}ic.ac.uk
Received September 7, 2009. Revised September 23, 2009. Accepted September 23, 2009.
Affymetrix has recently developed whole-transcript GeneChips—Gene and Exon arrays—which interrogate exons along the length of each gene. Although each probe on these arrays is intended to hybridize perfectly to only one transcriptional target, many probes match multiple transcripts located in different parts of the genome or alternative isoforms of the same gene. Existing statistical methods for estimating expression do not take this into account and are thus prone to producing inflated estimates. We propose a method, Multi-Mapping Bayesian Gene eXpression (MMBGX), which disaggregates the signal at multi-match probes. When applied to Gene arrays, MMBGX removes the upward bias of gene-level expression estimates. When applied to Exon arrays, it can further disaggregate the signal between alternative transcripts of the same gene, providing expression estimates of individual splice variants. We demonstrate the performance of MMBGX on simulated data and a tissue mixture data set. We then show that MMBGX can estimate the expression of alternative isoforms within one experimental condition, confirming our results by RT-PCR. Finally, we show that our method for detecting differential splicing has a lower error rate than standard exon-level approaches on a previously validated colon cancer data set.
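To illustrate why ignoring multi-match probes can inflate estimates, the following minimal sketch uses a toy, noise-free model in which each probe's signal is assumed to be the sum of the expression levels of all transcripts it matches. This additive model, the probe-to-transcript matrix, the expression values and the function names are all hypothetical choices made for illustration. The sketch uses ordinary least squares and does not reproduce the Bayesian model behind MMBGX; it only contrasts per-transcript averaging with an estimate that accounts for shared probes.

```python
import numpy as np


def naive_estimates(match, signal):
    """Average the signal of every probe matching each transcript,
    ignoring that some probes also match other transcripts."""
    probes_per_transcript = match.sum(axis=0)
    return (match * signal[:, None]).sum(axis=0) / probes_per_transcript


def disaggregated_estimates(match, signal):
    """Solve signal = match @ expression by least squares, so that the
    signal at a shared probe is split between its target transcripts."""
    estimate, *_ = np.linalg.lstsq(match, signal, rcond=None)
    return estimate


# Hypothetical probe-to-transcript matches (rows: probes, columns: transcripts).
# Probes 1, 3 and 5 each match two transcripts.
match = np.array([
    [1, 0, 0],
    [1, 1, 0],
    [0, 1, 0],
    [0, 1, 1],
    [0, 0, 1],
    [1, 0, 1],
], dtype=float)

# Hypothetical true expression levels and the resulting noise-free signal.
true_expr = np.array([100.0, 50.0, 10.0])
signal = match @ true_expr

naive = naive_estimates(match, signal)
disagg = disaggregated_estimates(match, signal)

print("true:         ", true_expr)
print("naive:        ", naive)
print("disaggregated:", disagg)
```

In this toy setting the naive averages exceed the true values for every transcript, because each shared probe contributes signal from a second transcript, whereas the least-squares solution recovers the true values exactly. With real, noisy array data the recovery would not be exact.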