[BioC] Summarizing Single-channel Agilent data

Gordon K Smyth smyth at wehi.EDU.AU
Sun Mar 4 02:40:53 CET 2012


Dear David,

Here's a complete limma analysis of the Agilent data on ArrayExpress.  It 
would appear that only the 10 ml/kg corn oil treatment is different from the 
saline control.  There are about 1200 differentially expressed genes for the 
10 ml/kg treatment.

Best wishes
Gordon

> SDRF <- read.delim("E-GEOD-33005.sdrf.txt",check.names=FALSE,stringsAsFactors=FALSE)
> x <- read.maimages(SDRF[,"Array Data File"],source="agilent.median",green.only=TRUE)
Read GSM819076_US10283824_252828210181_S01_GE1_107_Sep09_1_4.txt
Read GSM819075_US10283824_252828210181_S01_GE1_107_Sep09_1_3.txt
Read GSM819074_US10283824_252828210181_S01_GE1_107_Sep09_1_2.txt
Read GSM819073_US10283824_252828210180_S01_GE1_107_Sep09_1_4.txt
Read GSM819072_US10283824_252828210180_S01_GE1_107_Sep09_1_3.txt
Read GSM819071_US10283824_252828210180_S01_GE1_107_Sep09_1_2.txt
Read GSM819070_US10283824_252828210180_S01_GE1_107_Sep09_1_1.txt
Read GSM819069_US10283824_252828210179_S01_GE1_107_Sep09_1_4.txt
Read GSM819068_US10283824_252828210179_S01_GE1_107_Sep09_1_3.txt
Read GSM819067_US10283824_252828210179_S01_GE1_107_Sep09_1_2.txt
Read GSM819066_US10283824_252828210179_S01_GE1_107_Sep09_1_1.txt
Read GSM819065_US10283824_252828210178_S01_GE1_107_Sep09_1_4.txt
Read GSM819064_US10283824_252828210178_S01_GE1_107_Sep09_1_3.txt
Read GSM819063_US10283824_252828210178_S01_GE1_107_Sep09_1_2.txt
Read GSM819062_US10283824_252828210178_S01_GE1_107_Sep09_1_1.txt
Read GSM819061_US10283824_252828210177_S01_GE1_107_Sep09_1_4.txt
Read GSM819060_US10283824_252828210177_S01_GE1_107_Sep09_1_3.txt
Read GSM819059_US10283824_252828210177_S01_GE1_107_Sep09_1_2.txt
Read GSM819058_US10283824_252828210177_S01_GE1_107_Sep09_1_1.txt
> y <- backgroundCorrect(x,method="normexp")
> y <- normalizeBetweenArrays(y,method="quantile")
> neg99 <- apply(y$E[y$genes$ControlType==-1,],2,function(x) quantile(x,probs=0.99))
> cutoff <- matrix(neg99,nrow(y),ncol(y),byrow=TRUE)
> isexpr <- rowSums(y$E > cutoff) >= 4
> table(isexpr)
isexpr
FALSE  TRUE
  4321 39933
> y0 <- y[y$genes$ControlType==0 & isexpr,]
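The filtering step above can be illustrated on a toy matrix. This is only a sketch in base R: the matrix `E`, the probe names and the `ControlType` vector are all made up, and the requirement of 4 arrays is simply copied from the session above.

```r
# Toy log2 intensities: rows are probes, columns are arrays (all values made up).
E <- rbind(
  neg1   = c(5.0, 5.2, 4.9, 5.1, 5.0),
  neg2   = c(5.5, 5.4, 5.6, 5.3, 5.5),
  probeA = c(9.0, 9.2, 8.8, 9.1, 9.3),  # above the cutoff on every array
  probeB = c(5.1, 5.0, 5.2, 9.0, 5.1),  # above the cutoff on one array only
  probeC = c(6.0, 6.1, 5.9, 6.2, 5.0),  # above the cutoff on four arrays
  probeD = c(4.8, 4.9, 5.0, 4.7, 4.9)   # never above the cutoff
)
ControlType <- c(-1, -1, 0, 0, 0, 0)    # -1 marks negative control probes

# Per-array 99th percentile of the negative controls.
neg99 <- apply(E[ControlType == -1, ], 2, function(v) quantile(v, probs = 0.99))

# Expand to a matrix so each column of E is compared with its own array's cutoff.
cutoff <- matrix(neg99, nrow(E), ncol(E), byrow = TRUE)

# Keep regular probes that exceed the cutoff on at least 4 arrays.
isexpr <- rowSums(E > cutoff) >= 4
keep <- ControlType == 0 & isexpr
rownames(E)[keep]
```

In this toy example only `probeA` and `probeC` survive; a probe that is bright on a single array, such as `probeB`, is dropped.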

> Treatment <- SDRF[,"Characteristics[treatment]"]
> levels <- c("10 ml/kg saline","2 ml/kg corn oil","5 ml/kg corn oil",
+             "10 ml/kg corn oil")
> Treatment <- factor(Treatment,levels=levels)
> design <- model.matrix(~Treatment)
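To see what `model.matrix(~Treatment)` encodes, here is a small base R sketch. The eight-sample treatment vector is invented for illustration; only the level names come from the session above. Because saline is the first level, it becomes the intercept, and each remaining column compares one corn oil dose with saline. That is why the intercept column is dropped with `fit[,-1]` before calling `decideTests`.

```r
# Level names as in the session; the sample assignment below is made up.
levels <- c("10 ml/kg saline", "2 ml/kg corn oil",
            "5 ml/kg corn oil", "10 ml/kg corn oil")
Treatment <- factor(rep(levels, each = 2), levels = levels)

design <- model.matrix(~Treatment)

# Column 1 is the saline baseline; columns 2-4 are dose-versus-saline effects.
colnames(design)

# Every sample contributes to the intercept; each dose column flags its own samples.
colSums(design)
```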

> yave <- avereps(y0,ID=y0$genes[,"SystematicName"])
> fit <- lmFit(yave,design)
> fit <- eBayes(fit,trend=TRUE)
> summary(decideTests(fit[,-1]))
   Treatment2 ml/kg corn oil Treatment5 ml/kg corn oil Treatment10 ml/kg corn oil
-1                         0                         0                        384
0                      24433                     24433                      23207
1                          0                         0                        842
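
On the original question of summarization: `avereps()`, used in the session above, averages rows that share an identifier. The idea can be sketched in base R with `rowsum()`. This is an illustration on made-up values with hypothetical gene names, not limma's own implementation.

```r
# Made-up log2 intensities for five probes on two arrays.
E <- matrix(c(8.0, 8.4,
              8.2, 8.6,
              6.0, 6.1,
              7.0, 7.5,
              7.4, 7.7),
            ncol = 2, byrow = TRUE,
            dimnames = list(NULL, c("array1", "array2")))
ID <- c("geneA", "geneA", "geneB", "geneC", "geneC")  # hypothetical identifiers

# Sum rows within each ID (keeping first-appearance order), then divide by group size.
sums <- rowsum(E, group = ID, reorder = FALSE)
n <- as.vector(table(factor(ID, levels = unique(ID))))
averaged <- sums / n
averaged
```

The result has one row per identifier, so replicate probes for the same gene no longer enter the linear model as separate rows.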




On Sun, 4 Mar 2012, Gordon K Smyth wrote:

> Dear David,
>
> What do you mean by "summarizing"?  Are you perhaps looking for avereps()?
>
> BTW, the read can be done slightly more succinctly by
>
>  RG <- read.maimages(targets,source="agilent.median",green.only=TRUE)
>
> Best wishes
> Gordon
>
>> Date: Fri, 2 Mar 2012 15:51:00 +0100
>> From: David Westergaard <david at harsk.dk>
>> To: bioconductor at r-project.org
>> Subject: [BioC]  Summarizing Single-channel Agilent data
>> 
>> Hello,
>> 
>> I am working on normalizing raw data from
>> http://www.ebi.ac.uk/arrayexpress/experiments/E-GEOD-33005 using the
>> Limma package.
>> 
>> Following "standard" procedure, I do background correction, and then 
>> normalize:
>> 
>> # Read target from file
>> targets <- readTargets("targets")
>> RG <- read.maimages(targets,source="agilent", columns =list(G =
>> "gMedianSignal", Gb = "gBGMedianSignal"),green.only=TRUE)
>> 
>> 
>> # Do background correction/normalization
>> RG <- backgroundCorrect(RG, method="normexp")
>> RG <- normalizeBetweenArrays(RG, method="quantile")
>> 
>> Now, what I'm lacking is a summarization method. Googling a bit,
>> "Agi4x44PreProcess" can do the summarization, but it doesn't accept
>> single-channel data. Furthermore, it expects an RGList as input to
>> summarize.probe (normalizeBetweenArrays produces an EList).
>> 
>> So how would I go about summarizing these data? It would be nice if
>> there was an existing package doing this.
>> 
>> 
>> Best Regards,
>> David Westergaard
>> Undergraduate student
>> Technical University of Denmark
>> 
>
