[BioC] edgeR: tagwise dispersion in 2-factorial vs. 1-factorial design
Gordon K Smyth
smyth at wehi.EDU.AU
Wed Apr 25 04:12:35 CEST 2012
Dear Henning,
Deciding whether to analyse a data set as a whole or in
pieces depends on the specifics of your problem and your data, and there
is no universal answer. I can tell you, however, that I almost always
analyse all the data from one study together, i.e., I would most often use
the 2-factorial approach. Generally it pays to pool information about the
dispersion from multiple groups. Of course you should do some exploratory
analysis using an MDS plot or similar to see if there are any problem
libraries for any of the three genotypes.
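
As a rough illustration only, the two routes could be sketched as
follows. This assumes a current edgeR release (function signatures have
changed since older versions), three replicates per genotype-by-treatment
cell, and simulated counts standing in for real data. The object names,
replicate number and simulation settings are all hypothetical and not
taken from the original analysis.

```r
library(edgeR)

# Hypothetical layout: 3 genotypes x 2 treatments x 3 replicates (assumed).
genotype  <- factor(rep(c("A", "B", "C"), each = 6))
treatment <- factor(rep(rep(c("control", "stress"), each = 3), times = 3),
                    levels = c("control", "stress"))
group <- factor(paste(genotype, treatment, sep = "."))

# Simulated counts (placeholder for real data): tag-specific means,
# no true differential expression.
set.seed(1)
ntags  <- 1000
nlibs  <- length(group)
tag.mu <- runif(ntags, min = 10, max = 200)
counts <- matrix(rnbinom(ntags * nlibs, mu = rep(tag.mu, times = nlibs), size = 5),
                 nrow = ntags,
                 dimnames = list(paste0("tag", seq_len(ntags)),
                                 paste0("lib", seq_len(nlibs))))

# --- Two-factor route: all libraries analysed together ---
y <- DGEList(counts = counts, group = group)
y <- calcNormFactors(y)

# Exploratory check for problem libraries.
plotMDS(y, labels = as.character(group))

# One coefficient per genotype-by-treatment cell.
design <- model.matrix(~ 0 + group)
colnames(design) <- levels(group)

# Dispersions are estimated from all 18 libraries, so information is pooled
# across genotypes: one tagwise dispersion per tag.
y <- estimateGLMCommonDisp(y, design)
y <- estimateGLMTagwiseDisp(y, design)
fit <- glmFit(y, design)

# Treatment effect within each genotype.
con <- makeContrasts(A = A.stress - A.control,
                     B = B.stress - B.control,
                     C = C.stress - C.control,
                     levels = design)
lrt.A <- glmLRT(fit, contrast = con[, "A"])
topTags(lrt.A)

# --- One-factor route, genotype A only, for comparison ---
keep <- genotype == "A"
yA <- DGEList(counts = counts[, keep], group = treatment[keep])
yA <- calcNormFactors(yA)
yA <- estimateCommonDisp(yA)
yA <- estimateTagwiseDisp(yA)
et.A <- exactTest(yA, pair = c("control", "stress"))

# Compare the pooled tagwise dispersions with the genotype-A-only ones.
summary(y$tagwise.dispersion)
summary(yA$tagwise.dispersion)
```

Comparing the two dispersion summaries for real data should show whether
one genotype is noticeably less variable than the others, which is the
situation in which the two routes would be expected to disagree most.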
Best wishes
Gordon
> Date: Mon, 23 Apr 2012 14:07:55 +0200
> From: "Henning Wildhagen" <HWildhagen at gmx.de>
> To: bioconductor at r-project.org
> Subject: [BioC] edgeR: tagwise dispersion in 2-factorial vs.
> 1-factorial design
>
> Hi,
>
> I am analysing a two-factorial RNA-seq experiment with edgeR. The design
> of my study has two factors, genotype and treatment. Genotype has three
> levels (A,B,C), "treatment" has two levels ("control", "stress"). The
> first and most important question that I want to answer is which
> transcripts are affected by treatment in each of the three genotypes. I
> did this analysis by specifying a two-factorial model and subsequently
> selecting coefficients/contrasts to test for the treatment effect
> genotype-wise. Of course, this type of analysis can also be done in a
> 1-factorial way, i.e. by defining three separate DGEList-objects for
> each genotype and then performing an exactTest for the treatment effect
> for each of the three DGEList-objects/genotypes. For one of the
> genotypes, say "A", the latter analysis gives approximately 60% more DE
> genes compared to the DE-analysis based on the 2-factorial model. For
> the other two genotypes, the number of DE genes is almost the same in
> the two analyses. My first guess was that this finding is related to
> the differences in the estimation of the tagwise dispersion. In the
> two-factorial analysis, one and the same dispersion estimate per
> transcript is used to test for DE. In the 1-factorial analysis, three
> dispersion estimates are calculated per transcript, one for each
> genotype. When comparing the distributions of genotype-wise dispersion
> estimates of the 1-factorial analysis with the "common" tagwise
> dispersion of the 2-factorial model, I see that the median is higher and
> the range of the 95%tiles is wider for genotypes B, C and the "common"
> dispersion of the 2-factorial model, compared to genotype "A".
> Now my question is which analysis is more reliable, the 2-factorial or the 1-factorial?
>
> Thanks for any help or comments on this problem,
>
> Henning
>
> ------------------------------------------------------
> Dr. Henning Wildhagen
> Forest Research Institute Baden-Württemberg
> Freiburg, Germany
> --
>