[BioC] Microarray experiment design issues

Fri Jun 15 14:48:44 CEST 2012

Hi Ali

I think it is fine to discuss this type of question on this list (and it 
seems to be consistent with the statement on 
http://www.bioconductor.org/help/mailing-list ).

If your aim is to make a general statement about the compound effect, 
rather than on what happened on a particular day, and all other costs 
being equal, then B is preferable. I would not term this "loss of 
power": effects that you see in A but not in B are not easily 
reproducible, and thus plausibly of lower interest.
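
To illustrate why the day-to-day variation in B need not hurt much when 
the analysis is paired, here is a toy sketch in Python for a single gene. 
All numbers are made up for illustration (three hypothetical days with a 
large day effect and a small, consistent treatment effect), and the 
helper names paired_t and welch_t are mine, not from any package. A real 
analysis would run over all genes with a dedicated tool.

```python
import math
import statistics

# Hypothetical log-expression values for ONE gene in design B.
# Index i is the day; the day effect shifts both samples of a pair.
untreated = [5.0, 6.1, 6.9]
treated = [5.6, 6.6, 7.5]


def paired_t(treated, untreated):
    """t-statistic on the within-day differences (day effect cancels)."""
    diffs = [t - u for t, u in zip(treated, untreated)]
    se = statistics.stdev(diffs) / math.sqrt(len(diffs))
    return statistics.mean(diffs) / se


def welch_t(treated, untreated):
    """Unpaired (Welch) t-statistic; day effect inflates the variance."""
    se = math.sqrt(
        statistics.variance(treated) / len(treated)
        + statistics.variance(untreated) / len(untreated)
    )
    return (statistics.mean(treated) - statistics.mean(untreated)) / se


print("paired t:  ", paired_t(treated, untreated))
print("unpaired t:", welch_t(treated, untreated))
```

With these made-up values the paired statistic is much larger than the 
unpaired one, because the day effect is shared within each pair and drops 
out of the differences. How large the gain is in real data depends on how 
big the day effect is relative to the within-day noise.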

However, I am not sure how important this issue is compared to many of 
your other choices, and the biases and errors that they might 
introduce, such as: choice of cell line, choice of compound dose and 
incubation time, and the sensitivity and specificity of the particular 
array platform. If you are worried about robust inference, then perhaps 
you should also consider which of these factors need to be varied.

Most importantly, what will the resulting gene list be used for next? 
Nobody expects these lists to end up as "standalone truths"; they 
usually have a purpose (e.g. hit picking for subsequent single gene 
research; search for biological themes and stories that give you a warm 
fuzzy feeling; elucidation of the molecular target(s) of the drug, and 
perhaps again their downstream targets; clustering of drugs by 
similarity of the response; etc.). I often find that once you have sorted 
out these questions, the data analytic strategy also becomes more apparent.

	Best wishes
	Wolfgang

Ali Tofigh wrote on 06/15/2012 11:35 AM:
> Our goal is to measure the effects of a treatment on a specific cell line
> using gene expression microarrays (agilent 2-color). There are two possible
> experimental designs:
>
> A) perform the entire experiment in one day: split cells into 6 groups,
> treat 3 with compound and leave 3 untreated. This setup minimizes technical
> variation, but the list of differentially expressed genes will include some
> that are differentially expressed mainly due to the specific conditions on
> the day of the experiment (humidity levels, temperature, oxygen levels,
> etc).
>
> B) perform the experiment on three separate occasions: each day, split
> cells into two groups, treat only one with compound. A paired analysis
> would be appropriate here. This setup introduces noise (technical noise
> because of separate handling of the three pairs and noise from daily
> variation of the environmental conditions) and so we lose some statistical
> power. However, since the experiment is performed under slightly different
> environmental conditions, some of the condition-specific genes will no
> longer show up as differentially expressed and the list of genes would in
> this sense be more robust/reproducible.
>
> Does anyone have experience with both setups? I would like to know if the
> amount of variance that is introduced in setup B can be expected to be low
> enough to not lose too much power while producing a more robust set of
> differentially expressed genes.
>
> Cheers
> /Ali

-- 

Wolfgang Huber
EMBL
http://www.embl.de/research/units/genome_biology/huber