[BioC] Using limma to analyze GEO datasets/series from two-channel experiments
Ana Rodrigues
arodrigues at salk.edu
Sat Oct 3 00:00:08 CEST 2009
Dear Gurus,
I am attempting to analyze a bunch of microarray experiments from the
GEO database.
I have experience with Affymetrix chips, so it was reasonably simple
to download the datasets/series of interest, retrieve the relevant
columns from the GSM files (figure out whether they were normalized,
logged, etc), and perform the comparisons I need using limma.
Now I am struggling to do the same for other platforms, in particular,
two-color platforms.
The first few such experiments I have looked at seem reasonably
simple. However, I haven't been able to figure out how to obtain a
data structure that lmFit can use from the GSM files.
I decided to try the GEOquery package to interface with GEO.
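library(GEOquery)
gse <- getGEO("GSE2998")    # list of ExpressionSets, one per platform
exprs <- exprs(gse[[1]])    # matrix of VALUE columns, probes x GSMs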
The exprs matrix now contains the VALUE column from each GSM file,
which in this particular case is "The log2-transformed ratio of the
Lowess-normalized fluorescence values (Ch2/Ch1) exported from
GeneTraffic".
For one of the comparisons that I am interested in, there are two
chips of relevance.
GSM65523, with treated Cy3 and untreated Cy5
GSM65567, with treated Cy3 and untreated Cy5
I thought that the best way to compare treated to untreated would be
something like:
library(limma)
targets <- matrix(c("GSM65523", "noHS", "HS",
                    "GSM65567", "noHS", "HS"), ncol=3, byrow=TRUE,
                  dimnames=list(NULL, c("SlideNumber", "Cy3", "Cy5")))
design <- modelMatrix(targets, ref="noHS")
fit <- lmFit(exprs, design)
But, of course, exprs doesn't contain any channel info, just the log
ratio between the channels.
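To make the setup concrete, here is a minimal sketch with simulated
log-ratios standing in for the real GSE2998 values. The probe names and
numbers are made up, not GEO data. It assumes the dye layout in the
targets table (HS in Cy5, noHS in Cy3 on both arrays), so that each
column is already log2(HS/noHS) and the design reduces to a single
column of ones. Under that assumption the per-probe coefficient is just
the mean log-ratio, and lmFit can be handed the plain matrix; treat this
as an illustration of the idea rather than a confirmed recipe for this
series.

```r
# Simulated log2(Cy5/Cy3) ratios: 5 hypothetical probes x 2 arrays.
# These values are invented for illustration; they are not GEO data.
set.seed(1)
M <- matrix(rnorm(10, mean = 1, sd = 0.3), nrow = 5,
            dimnames = list(paste0("probe", 1:5),
                            c("GSM65523", "GSM65567")))

# Assumption: both arrays have HS in Cy5 and noHS in Cy3, so every
# column is log2(HS/noHS) and the design is a single column of ones.
# (If HS were in Cy3 instead, the entries would be -1.)
design <- matrix(1, nrow = ncol(M), ncol = 1,
                 dimnames = list(colnames(M), "HS"))

# Least-squares estimate per probe; with this design it equals the
# mean log-ratio across the two arrays.
coef_by_hand <- apply(M, 1, function(y) lm.fit(design, y)$coefficients)
print(coef_by_hand)

# If limma is installed, lmFit() takes the log-ratio matrix directly.
if (requireNamespace("limma", quietly = TRUE)) {
  fit <- limma::lmFit(M, design)
  print(fit$coefficients)
}
```

With only two arrays there is a single residual degree of freedom per
probe, so any p-values from such a fit would rest on very little
information.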
Should I be retrieving different columns from the GSM files? How can I
build a data structure from that data that lmFit can use? Is there a
better way to do simple comparisons of two-channel GEO datasets?
Thank you so much for any help you can provide!
Best,
Ana