[BioC] pm summarization method
James W. MacDonald
jmacdon at uw.edu
Mon Apr 16 17:54:07 CEST 2012
Hi Assa,
On 4/16/2012 9:05 AM, Assa Yeroslaviz wrote:
> Hi everybody,
>
> I have a question about the behavior of the expresso command when
> extracting the raw data from an affyBatch.
>
> I wanted to evaluate the raw intensity values of a specific gene from my
> data set and tried to extract them like this:
> rawdata <- expresso(totalExpressionData, bg.correct=FALSE,
>                     normalize=FALSE,
>                     pmcorrect.method="pmonly", summary.method="avgdiff",
>                     verbose=TRUE)
>
> I've got the result I wanted:
>           wt1    wt2   wt3    treat1  treat2  treat3
> gene_at   125.5  101   123.5  52.5    63.5    58
>
> The problem was that I expected them to be the other way around.
Why did you expect it to be the other way around? What exactly did you
expect this call to expresso() to do?
Since affy is primarily based on S4 methods, it can be a bit difficult
to figure out what a given function is going to do, so I can understand
you not knowing what this call to expresso() is going to end up doing.
However, what you are doing is pretty weird, no? The avgdiff method
implies pm-mm, so what do you expect to happen if you then specify pmonly?
Given that the pmcorrect.method controls how we correct the PM probes,
and there is a subtractmm option, one would normally assume that the
'difference' part of avgdiff might happen in that step. But you said not
to compute that, so all you are left with is the 'avg' part of avgdiff.
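To make that reasoning concrete, here is a rough sketch in plain R of the two stages as one might picture them. The helper names pm_step and summary_step are made up for illustration and are not affy functions; "subtractmm" is assumed here to mean a plain PM minus MM, and mean() is only the naive reading of 'avg'. The numbers are the first three probes of wt1 and wt2 from the tables quoted at the end of this message.

```r
# Hypothetical stand-ins for the two stages; these are NOT the affy functions.
pm_step <- function(pm, mm, method = c("pmonly", "subtractmm")) {
  method <- match.arg(method)
  # "pmonly" leaves the PM values alone; "subtractmm" is assumed to be PM - MM
  if (method == "pmonly") pm else pm - mm
}
summary_step <- function(probes, fun) apply(probes, 2, fun)

# First three probes for wt1 and wt2, copied from the quoted PM and MM tables
pm <- matrix(c(403, 117, 49,
               379,  84, 49),
             ncol = 2,
             dimnames = list(paste0("probe", 1:3), c("wt1", "wt2")))
mm <- matrix(c(533, 1789, 56,
               519, 1271, 66),
             ncol = 2, dimnames = dimnames(pm))

# With "pmonly" no difference is ever taken, so only the averaging remains
avg_pm_only  <- summary_step(pm_step(pm, mm, "pmonly"), mean)
# With "subtractmm" the difference happens in the first stage
avg_pm_minus <- summary_step(pm_step(pm, mm, "subtractmm"), mean)

print(rbind(pmonly = avg_pm_only, subtractmm = avg_pm_minus))
```

The only point of the sketch is that the choice made in the first stage decides whether any 'difference' is computed at all; the summary stage just collapses whatever it is handed.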
But let's set the logical assumptions aside and look at the actual code.
Going through the code for expresso() is a bit like following Alice down
the rabbit hole, so I will cut to the chase. In the end, you will be
calling two pieces of code that will handle the pm adjustment and the
summary statistic calculation. In your call to expresso() these will be
(respectively):
> pmcorrect.pmonly
function (object)
{
    return(pm(object))
}
and
> generateExprVal.method.avgdiff
function (probes, ...)
{
    list(exprs = apply(probes, 2, median), se.exprs = apply(probes,
        2, sd)/sqrt(nrow(probes)))
}
So pmcorrect.pmonly just gives you the PM probes, and avgdiff then gives
you the column medians of that PM data (despite the 'avg' in its name).
In other words, you have told expresso() that you want, for each array,
the median of the non-background-corrected, unnormalized PM probes in the
probeset.
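You can check this by hand. The sketch below assumes the first of the two tables quoted further down is the PM matrix for this probeset (you wrote "PM and MM respectively"), and applies the same two expressions that generateExprVal.method.avgdiff uses. The object names are just for illustration.

```r
# PM intensities copied from the first quoted table (rows = probe1..probe14)
pm <- matrix(c(
  403, 379, 220, 420, 530, 316,
  117,  84, 104,  52,  57,  54,
   49,  49,  73,  38,  58,  52,
   87,  67, 110,  55,  43,  49,
   66,  61,  51,  46,  72,  62,
  118, 100, 104,  69,  87,  74,
  180, 142, 170,  45,  46,  45,
  133, 102, 137,  95, 132,  81,
   80,  71,  65,  52,  54,  46,
   63,  45,  56,  53,  53,  54,
  293, 321, 260, 444, 618, 408,
  171, 167, 169,  49,  75,  72,
  198, 197, 307,  40,  67,  50,
  247, 265, 348,  53,  60,  62),
  ncol = 6, byrow = TRUE,
  dimnames = list(paste0("probe", 1:14),
                  c("wt1", "wt2", "wt3", "treat1", "treat2", "treat3")))

# Same computations as in generateExprVal.method.avgdiff
pm_median <- apply(pm, 2, median)
pm_se     <- apply(pm, 2, sd) / sqrt(nrow(pm))

# The plain per-array average, for comparison with the median
pm_mean <- colMeans(pm)

print(rbind(median = pm_median, mean = pm_mean))
```

The median row comes out as 125.5, 101, 123.5, 52.5, 63.5 and 58, which is the gene_at row you got from expresso(). Probes 1 and 11 are not dropped; they simply have little influence on a median, whereas they pull the mean up considerably.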
Was that your intention?
Best,
Jim
> So I decided to look into the specific probe values of the probes for this
> probe-set.
> These are the values I got for the PM and MM probes, respectively:
> wt1 wt2 wt3 treat1 treat2 treat3
> probe1 403 379 220 420 530 316
> probe2 117 84 104 52 57 54
> probe3 49 49 73 38 58 52
> probe4 87 67 110 55 43 49
> probe5 66 61 51 46 72 62
> probe6 118 100 104 69 87 74
> probe7 180 142 170 45 46 45
> probe8 133 102 137 95 132 81
> probe9 80 71 65 52 54 46
> probe10 63 45 56 53 53 54
> probe11 293 321 260 444 618 408
> probe12 171 167 169 49 75 72
> probe13 198 197 307 40 67 50
> probe14 247 265 348 53 60 62
>
> probe1 533 519 294 507 739 404
> probe2 1789 1271 1468 1430 1666 1552
> probe3 56 66 59 51 45 48
> probe4 49 52 64 47 45 47
> probe5 54 47 33 49 65 55
> probe6 84 72 90 53 92 71
> probe7 73 72 65 40 53 54
> probe8 83 108 115 81 111 94
> probe9 49 56 43 52 41 53
> probe10 56 46 62 68 77 57
> probe11 54 83 55 47 64 46
> probe12 106 98 76 52 66 53
> probe13 43 48 37 36 52 39
> probe14 94 92 99 43 43 49
>
> When I calculate the average of these two tables for each array I don't get
> the same values as presented in the top table.
> I would like to understand how the values in the last two tables lead to
> the summarized values I get. Even if I ignore the MM values
> completely, which I think expresso() does, I still don't see how it
> arrives at these values. Two probes (nos. 1 and 11 of the PM values)
> differ strongly from the rest of the probes in this probe-set. Are they
> being ignored in the summarization?
>
> Thanks in advance
>
> Assa
>
--
James W. MacDonald, M.S.
Biostatistician
University of Washington
Environmental and Occupational Health Sciences
4225 Roosevelt Way NE, # 100
Seattle WA 98105-6099