[BioC] variance and coefficient of variation with edgeR
Miguel Gallach
miguel.gallach at univie.ac.at
Tue Mar 27 18:12:01 CEST 2012
It seems I could not paste the plot... I hope you can see it now.
Sorry,
Miguel
=========
On Tue, Mar 27, 2012 at 4:22 PM, Miguel Gallach <miguel.gallach at univie.ac.at> wrote:
> Dear list,
>
> I am analyzing RNA-Seq data with edgeR for a typical two-factor design:
>
> $samples
>                    group lib.size norm.factors
> R4.Hot     HotAdaptedHot 17409289    0.9881635
> R5.Hot     HotAdaptedHot 17642552    1.0818144
> R9.Hot    ColdAdaptedHot 20010974    0.8621807
> R10.Hot   ColdAdaptedHot 14064143    0.8932791
> R4.Cold   HotAdaptedCold 11968317    1.0061084
> R5.Cold   HotAdaptedCold 11072832    1.0523857
> R9.Cold  ColdAdaptedCold 22386103    1.0520949
> R10.Cold ColdAdaptedCold 17408532    1.0903311
>
>
> I found something quite interesting: non-native populations have a
> systematically higher coefficient of variation than native populations.
> That is, CV(R4.Hot-R5.Hot) < CV(R9.Hot-R10.Hot) and
> CV(R4.Cold-R5.Cold) > CV(R9.Cold-R10.Cold).
>
> Here are the variables and calculations:
>
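> # C.V. per group, taken as the square root of the dispersion
> C.V.R4.R5HC = sqrt(data$R4.R5.HC.disp)
> C.V.R9.R10HC = sqrt(data$R9.R10.HC.disp)
>
> # Variance per group: V = mu * (1 + dispersion * mu), with Conc used as mu
> var_R4.R5_HC = Conc.R4.R5.HC * (1 + R4.R5.HC.disp * Conc.R4.R5.HC)
> var_R9.R10_HC = Conc.R9.R10.HC * (1 + R9.R10.HC.disp * Conc.R9.R10.HC)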
>
>
> The attached plot compares variances (V = mu * (1 + dispersion * mu),
> according to
> http://seqanswers.com/forums/showthread.php?t=5591&highlight=edgeR+variance)
> and C.V. (C.V. = sqrt(dispersion)) between biological groups at hot
> temperature (i.e., comparing R4.Hot-R5.Hot vs. R9.Hot-R10.Hot).
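For reference, the two quantities compared in the plot can be computed as in the minimal sketch below. It is plain Python rather than edgeR code, and the function names and the example numbers are illustrative assumptions, not values taken from the data above.

```python
import math

def nb_variance(mu, dispersion):
    """Variance formula quoted above: V = mu * (1 + dispersion * mu)."""
    return mu * (1.0 + dispersion * mu)

def cv_from_dispersion(dispersion):
    """C.V. as defined in the post: square root of the dispersion."""
    return math.sqrt(dispersion)

# Hypothetical gene: mean 100, dispersion 0.04 (made-up numbers)
mu, disp = 100.0, 0.04
variance = nb_variance(mu, disp)   # 100 * (1 + 0.04 * 100)
cv = cv_from_dispersion(disp)      # sqrt(0.04)
print(variance, cv)
```

Note that the variance depends on both the mean and the dispersion, while this C.V. depends on the dispersion only.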
>
> According to the left plot, the variance is equal for most genes, so the
> assumption of equal variances holds. Hence we can perform the DE test.
> Am I right?
>
> However, what I cannot understand is that sqrt(dispersion) is larger for
> R9.R10 than for R4.R5, i.e., the coefficient of variation of gene expression
> is systematically higher for all genes in R9.R10 than in R4.R5. For this to
> be true, since the variances are equal and C.V. = sqrt(var)/mean, the mean
> of R9.R10 (i.e., Conc.R9.R10) would have to be lower than that of R4.R5,
> which is obviously false. The reciprocal analysis for these samples at cold
> temperatures produces the equivalent, but inverted, result.
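The two definitions of C.V. used in this message can be put side by side with a small numerical sketch. Under the quoted formula V = mu * (1 + dispersion * mu), sqrt(V)/mu simplifies to sqrt(1/mu + dispersion), so it matches sqrt(dispersion) only when 1/mu is negligible. The sketch is plain Python, not edgeR code; the function names and the mean and dispersion values are hypothetical, and whether either regime applies to the data above depends on the scale of the Conc values, which is not stated here.

```python
import math

def nb_variance(mu, dispersion):
    # V = mu * (1 + dispersion * mu), the formula quoted in the post
    return mu * (1.0 + dispersion * mu)

def total_cv(mu, dispersion):
    # sqrt(V) / mu, which simplifies algebraically to sqrt(1/mu + dispersion)
    return math.sqrt(nb_variance(mu, dispersion)) / mu

# Hypothetical values: one very small and one large mean, two dispersions each
for mu in (0.001, 10000.0):
    for disp in (0.01, 0.04):
        print(mu, disp, nb_variance(mu, disp),
              total_cv(mu, disp), math.sqrt(disp))
```

In this sketch, when dispersion * mu is far below 1 the variance is close to mu whatever the dispersion, so two groups with different dispersions can show nearly identical variances; when mu is large, sqrt(V)/mu approaches sqrt(dispersion).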
>
> What am I missing? How can this happen?
>
>
> Any help would be appreciated.
>
> Many thanks,
> Miguel Gallach
>
--
Miguel Gallach
Center for Integrative Bioinformatics Vienna (CIBIV)
Max F. Perutz Laboratories (MFPL)
Telf: +43 1 4277 24029
Postal Address:
Ebene 1
Campus Vienna Biocenter 5
CIBIV, MFPL
1030 Vienna
Austria
e-mail:
miguel.gallach at univie.ac.at
migaca2001 at gmail.com