[BioC] limma and Across Array Normalisation

Saket Choudhary saketkc at gmail.com
Tue Feb 11 23:55:15 CET 2014


On 11-Feb-2014, at 10:52 PM, Gordon K Smyth <smyth at wehi.edu.au> wrote:

> On Tue, 11 Feb 2014, Saket Choudhary wrote:
>
>> On 11-Feb-2014, at 10:31 PM, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>>
>>> Yes, obviously there'll be a baseline shift when you subtract background, then add an offset and log transform.
>>>
>>> Your plots do not appear to be valid MA plots.
>>
>> Could you please point out the error?
>> I understand a baseline shift is expected, but I can't figure out what
>> is going wrong otherwise.
>
> Well, you manually create an MAList object from your single channel data, even though an MAList is strictly for two colour data.
>
> If you deceive limma as to the true nature of your data, it's not surprising that the resulting plot might not be correct.
>
> I am not clear why you need to make so many variations on the standard limma single channel analysis pipeline.
>

Is there any other way to visualise MA plots for single channel data?
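
For context, one common reading of an MA plot for single-channel data is a between-array plot: one array is compared against the probe-wise average of all the other arrays. The following is a minimal base-R sketch under that assumption. The helper `ma_values` and the simulated matrix `E` are made up for illustration; they are not taken from the gist, and limma's own plotting functions may differ in detail.

```r
# Hypothetical helper: between-array M and A values for one column of a
# matrix of log2 intensities (rows = probes, columns = arrays).
ma_values <- function(logE, array = 1) {
  stopifnot(is.matrix(logE), ncol(logE) >= 2)
  # Artificial reference array: probe-wise mean of all the other arrays
  ref <- rowMeans(logE[, -array, drop = FALSE], na.rm = TRUE)
  data.frame(
    A = (logE[, array] + ref) / 2,  # average log-intensity
    M = logE[, array] - ref         # log-ratio against the reference
  )
}

# Simulated log2 intensities, for illustration only
set.seed(1)
E <- matrix(rnorm(400, mean = 8, sd = 1), nrow = 100, ncol = 4)
ma <- ma_values(E, array = 1)

# Scatter of M against A with a horizontal line at M = 0
plot(ma$A, ma$M, pch = 16, cex = 0.5,
     xlab = "A (average log2 intensity)", ylab = "M (log2 ratio)")
abline(h = 0, col = "red")
```

As far as I know, calling limma's plotMA() or plotMD() directly on the EList, rather than on a hand-built MAList, produces this kind of between-array plot, so no manual conversion should be needed.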

> Gordon
>
>
>>
>> Thanks,
>> Saket
>>
>>
>>> Gordon
>>>
>>> On Tue, 11 Feb 2014, Saket Choudhary wrote:
>>>
>>>> Hello Gordon,
>>>>
>>>> Is there a reason to believe the MA plots should inherently be
>>>> baseline shifted after normalisation?
>>>>
>>>> Raw MA: https://db.tt/kDBod1EJ
>>>> background correction with 'nec': https://db.tt/0vVWeD21
>>>> background correction with nec followed by normalisation: https://db.tt/f0M0rWeg
>>>> background correction with 'normexp': https://db.tt/OJO0zea5
>>>> background correction with normexp followed by normalisation:
>>>> https://db.tt/rbLJmFBE
>>>>
>>>>
>>>> The files are a bit heavy so might take some time to load into any pdf reader.
>>>>
>>>> Code: https://gist.github.com/saketkc/8931951
>>>>
>>>> Saket
>>>>
>>>> On 9 February 2014 20:45, Saket Choudhary <saketkc at gmail.com> wrote:
>>>>> Related question: Similar to your case, my final topTable() output
>>>>> indicates some genes having a negative logFC, though the literature
>>>>> expects them to have a positive logFC.
>>>>>
>>>>> I looked up the calculations and the transition from positive to
>>>>> negative logFC for these genes seems to happen after the
>>>>> normalizeBetweenArrays step (irrespective of the kind of normalisation
>>>>> I choose).
>>>>>
>>>>> This is a naive question again, but I am trying to understand what would be
>>>>> a good metric to decide which method tends to give the fewest false
>>>>> positives like this, given that I have limited knowledge of which
>>>>> genes should be up- or down-regulated (unlike in your case, where you
>>>>> knew the kind of regulation [up/down] expected).
>>>>>
>>>>> Thanks,
>>>>> Saket
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On 9 February 2014 04:00, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>>>>>>
>>>>>> On Sat, 8 Feb 2014, Saket Choudhary wrote:
>>>>>>
>>>>>>> Hello Gordon,
>>>>>>>
>>>>>>> I had a chance to go through the paper. I have a set of negative and
>>>>>>> positive controls, arising from a single-channel GenePix platform.
>>>>>>> From what I could gather, the 'nec' method in limma performs
>>>>>>> background correction using these negative control spots.
>>>>>>
>>>>>>
>>>>>> Yes, but the negative controls are assumed to behave exactly like probes for
>>>>>> unexpressed genes.  This is true for Illumina Beadchips, but is often not
>>>>>> the case for other platforms.  If not, then you would be better to stick
>>>>>> with normexp as you are already using.
>>>>>>
>>>>>>
>>>>>>> However one of the inputs to 'nec' is also "detection.p", which the
>>>>>>> .gprs don't have.
>>>>>>
>>>>>>
>>>>>> detection.p is not a required argument.  It is used only when negative
>>>>>> controls are not available.
>>>>>>
>>>>>>
>>>>>>> I could simply take the mean of all the negative controls' E and Eb, and
>>>>>>> subtract it from each probe's E and Eb, doing it for all the arrays. Would
>>>>>>> this mimic what I want to achieve with the 'nec' function?
>>>>>>
>>>>>>
>>>>>> No, that naive approach is not equivalent and typically performs poorly.
>>>>>>
>>>>>> Gordon
>>>>>>
>>>>>>
>>>>>>> Saket
>>>>>>>
>>>>>>> On 6 February 2014 13:04, Saket Choudhary <saketkc at gmail.com> wrote:
>>>>>>>>
>>>>>>>> Hello Gordon,
>>>>>>>>
>>>>>>>> Unfortunately I do not have access to this as of now. I will however
>>>>>>>> get hold of it soon.
>>>>>>>>
>>>>>>>> After implementing this, I would expect the 'CONTROL' probes to have
>>>>>>>> similar, if not the same, values, right?
>>>>>>>>
>>>>>>>> However, some of the values for these control probes after the
>>>>>>>> normalizeBetweenArrays step have high variance. Is this behaviour
>>>>>>>> normal or am I missing something?
>>>>>>>>
>>>>>>>> Saket
>>>>>>>>
>>>>>>>> On 6 February 2014 06:32, Gordon K Smyth <smyth at wehi.edu.au> wrote:
>>>>>>>>>
>>>>>>>>> If 'x' is your background-corrected EList, then
>>>>>>>>>
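>>>>>>>>> library(limma)
>>>>>>>>> w <- rep(1, nrow(x))   # unit weight for every probe
>>>>>>>>> w[controls] <- 100     # 'controls' is assumed to index the control probes
>>>>>>>>> y <- normalizeBetweenArrays(x, method="cyclicloess", weights=w)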
>>>>>>>>>
>>>>>>>>> does what you want.
>>>>>>>>>
>>>>>>>>> For an example of this approach:
>>>>>>>>>
>>>>>>>>> http://rnajournal.cshlp.org/content/19/7/876
>>>>>>>>>
>>>>>>>>> Best wishes
>>>>>>>>> Gordon
>>>>>>>>>
>>>>>>>>> --------- original message ----------
>>>>>>>>> Saket Choudhary saketkc at gmail.com
>>>>>>>>> Thu Feb 6 06:59:42 CET 2014
>>>>>>>>>
>>>>>>>>> I am analysing a proteomics microarray data set for a two-group
>>>>>>>>> sample (Normal and Disease) using a single colour channel. The arrays
>>>>>>>>> have a set of pre-defined CONTROL points whose expression levels are
>>>>>>>>> supposed to be similar/same across all the arrays.
>>>>>>>>>
>>>>>>>>> I would like to 'normalise' the levels of all probes such that
>>>>>>>>> normalisation ends up with all CONTROL points having similar expression
>>>>>>>>> levels. If I understand it right, normalizeBetweenArrays does not allow
>>>>>>>>> this kind of normalisation.
>>>>>>>>>
>>>>>>>>> Is there a pre-implemented function to do this? If not, what would be a
>>>>>>>>> way to achieve this kind of normalisation?
>>>>>>>>>
>>>>>>>>> Code: https://gist.github.com/saketkc/8669586
>>>>>>>>>
>>>>>>>>> ______________________________________________________________________
>>>>>>>>> The information in this email is confidential and intended solely for
>>>>>>>>> the
>>>>>>>>> addressee.
>>>>>>>>> You must not disclose, forward, print or use it without the permission
>>>>>>>>> of
>>>>>>>>> the sender.
>>>>>>>>> ______________________________________________________________________
>>>>>>


