[BioC] paired microarray samples
Chris Fjell
cfjell at interchange.ubc.ca
Fri Oct 16 21:05:36 CEST 2009
I've a question on how to properly deal with paired samples in limma.
For example, I've got an experiment with 3 blood donors (D1, D2, D3) and
4 treatments (T1, T2, T3, T4) of the blood cells. One single-channel
array is done for each donor-treatment (e.g. D1 cells treated with T2).
In the past I've treated these as simple biological replicates, so the
code flow is like:
treatments=pData(BSData.quantile)$Sample_Group # T1, T2, T3, T4, from
Illumina beadarray data
treatments=as.factor(treatments)
design=model.matrix(~0+treatments)
colnames(design)=levels(treatments)
fit = lmFit(exprs(BSData.quantile), design)
cont.matrix = makeContrasts(T1vT2 = T1 - T2, T3vT1 = T3 - T1, levels =
design)
fit = contrasts.fit(fit, cont.matrix)
ebFit = eBayes(fit)
for( comp in colnames( cont.matrix ) ) {
tt = topTable(ebFit, coef = comp)
...
[write results to file...]
}
But donors are responding quite differently so I don't want to simple
pool all the treatments, I want to consider response of each donor, i.e.
"Paired Samples".
I found this snippet on the BioC newsgroup at
https://stat.ethz.ch/pipermail/bioconductor/2006-May/012969.html
Calculate the logFC ratios manually per donor, then pass it to lmFit() -
seems like the standard paired t-test.
For my example, for the comparison of T1 to T2
exM = exprs(BSData.quantile)
diffD1 = exM[, which( donors == "D1" & samples == "T1" )] - exM[, which(
donors == "D1" & samples == "T2" )]
diffD2 = exM[, which( donors == "D2" & samples == "T1" )] - exM[, which(
donors == "D2" & samples == "T2" )]
diffD3 = exM[, which( donors == "D3" & samples == "T1" )] - exM[, which(
donors == "D3" & samples == "T2" )]
paired_ex = cbind( diffD1, diffD2, diffD3)
fit = lmFit(paired_ex)
The example cited above also uses "block = origin, correlation =
dupcor$cons" parameters to lmFit() but this is for technical replicates (?)
This works but I get dramatically worse p-values (due I assume to using
half the number of "samples": 3 ratios tested against 0 versus testing
two groups of 3).
But the limma user's guide (Oct 22, 2008 version, Chapter 8, p.40) gives
another example, using model.matrix() not makeContrasts():
> SibShip <- factor(targets$SibShip)
> Treat <- factor(targets$Treatment, levels=c("C","T"))
> design <- model.matrix(~SibShip+Treat)
> fit <- lmFit(eset, design)
> fit <- eBayes(fit)
> topTable(fit, coef="TreatT")
I do the similar thing for my data ...
design_paired_treatments = model.matrix( ~donors+treatments )
fit_ps = lmFit( exprs(BSData.quantile),design_paired_treatments)
fit_ps = eBayes( fit_ps )
and look at the columns of my design matrix and it's like
colnames( design_paired_treatments )
[1] "(Intercept)" "donorsD1" "donorsD2" "treatmentsT1"
[5] "treatmentsT2" "treatmentsT3"
Donor D3 and treatment T4 don't appear in the design.
and I get some results...
tt = topTable( fit_ps, coef="donorsD1")
tt
ID logFC AveExpr t P.Value adj.P.Val B
6370315 -6.58 8.74 -119.4 4.26e-17 2.08e-12 17.6
7160474 3.59 8.77 56.5 7.43e-14 1.37e-09 16.6
5260484 -5.69 8.71 -55.8 8.39e-14 1.37e-09 16.6
This is a regression model, I think... I can see why the two terms
(donor D3 and treatment T4) disappeared - they have to be restated in
terms of the rest, if I remember right for doing ANOVA with indicator
variables.
Is there a recommended way to do this with limma? Or another package?
Thanks for any advice.
-Chris
More information about the Bioconductor
mailing list