[BioC] Microarray experiment design issues
Wolfgang Huber
whuber at embl.de
Fri Jun 15 14:48:44 CEST 2012
Hi Ali
I think it is fine to discuss this type of question on this list (and it
seems to be consistent with the statement on
http://www.bioconductor.org/help/mailing-list ).
If your aim is to make a general statement about the compound effect,
rather than about what happened on a particular day, and all other costs
being equal, then B is preferable. I would not term this "loss of
power": effects that you see in A but not in B are not easily
reproducible, and thus plausibly of lower interest.
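As a rough illustration of why a paired analysis suits design B, here is a
minimal simulation sketch in plain Python (not limma or any Bioconductor
code). It assumes a single gene, Gaussian noise, and an additive day shift
shared by the treated and untreated sample of each day. The function names
and all numeric settings (effect size, day and noise standard deviations)
are made up for illustration and do not come from the experiment discussed
here.

```python
import math
import random
import statistics


def simulate_design_b(effect, day_sd, noise_sd, n_days, rng):
    """Return (treated, control) values for one gene, one pair per day."""
    treated, control = [], []
    for _ in range(n_days):
        day_shift = rng.gauss(0.0, day_sd)  # shared by both samples of a day
        control.append(day_shift + rng.gauss(0.0, noise_sd))
        treated.append(day_shift + effect + rng.gauss(0.0, noise_sd))
    return treated, control


def paired_t(treated, control):
    """Paired t-statistic: the shared day shift cancels in the differences."""
    diffs = [t - c for t, c in zip(treated, control)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


def unpaired_t(treated, control):
    """Welch t-statistic: the day shift stays in both group variances."""
    se = math.sqrt(statistics.variance(treated) / len(treated)
                   + statistics.variance(control) / len(control))
    return (statistics.mean(treated) - statistics.mean(control)) / se


# Hypothetical settings: day-to-day variation larger than within-day noise.
rng = random.Random(1)
n_sim = 2000
paired_wins = 0
for _ in range(n_sim):
    t, c = simulate_design_b(effect=1.0, day_sd=1.0, noise_sd=0.2,
                             n_days=3, rng=rng)
    if abs(paired_t(t, c)) > abs(unpaired_t(t, c)):
        paired_wins += 1
print(f"paired |t| larger in {paired_wins / n_sim:.0%} of simulations")
```

Both statistics share the same numerator (the mean difference), so the
paired one is larger exactly when the treated and control values are
positively correlated across days. This says nothing about how large the
day effect is in a real experiment; that still has to be judged from data.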
However, I am not sure how important this issue is compared to many
of your other choices, and the biases and errors that they might
introduce, such as: choice of cell line, choice of compound dose and
incubation time, and the sensitivity and specificity of the particular
array platform. If you are worried about robust inference, then perhaps
you should also consider which of these factors need to be varied
systematically.
Most importantly, what will the resulting gene list be used for next?
Nobody expects these lists to end up as "standalone truths"; they
usually have a purpose (e.g. hit picking for subsequent single gene
research; search for biological themes and stories that give you a warm
fuzzy feeling; elucidation of the molecular target(s) of the drug, and
perhaps again their downstream targets; clustering of drugs by
similarity of the response; etc.). I often find that once you have sorted
out these questions, the data analytic strategy also becomes more apparent.
Best wishes
Wolfgang
Ali Tofigh wrote on 06/15/2012 11:35 AM:
> Our goal is to measure the effects of a treatment on a specific cell line
> using gene expression microarrays (Agilent 2-color). There are two possible
> experimental designs:
>
> A) perform the entire experiment in one day: split cells into 6 groups,
> treat 3 with compound and leave 3 untreated. This setup minimizes technical
> variation, but the list of differentially expressed genes will include some
> that are differentially expressed mainly due to the specific conditions on
> the day of the experiment (humidity levels, temperature, oxygen levels,
> etc).
>
> B) perform the experiment on three separate occasions: each day, split
> cells into two groups, treat only one with compound. A paired analysis
> would be appropriate here. This setup introduces noise (technical noise
> because of separate handling of the three pairs and noise from daily
> variation of the environmental conditions) and so we lose some statistical
> power. However, since the experiment is performed under slightly different
> environmental conditions, some of the condition-specific genes will no
> longer show up as differentially expressed and the list of genes would in
> this sense be more robust/reproducible.
>
> Does anyone have experience with both setups? I would like to know if the
> amount of variance that is introduced in setup B can be expected to be low
> enough to not lose too much power while producing a more robust set of
> differentially expressed genes.
>
> Cheers
> /Ali
--
Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber