[BioC] edgeR - no sequences are differentially expressed

Thu Oct 4 13:01:33 CEST 2012

Dear All,

When using edgeR for differential expression of RNA-Sequences, I do not get
any differentially expressed sequences (FDR <= 0.1).

The R code is attached. Note that we have 5 libraries and 2 groups such
that the first three libraries (group 1)

are compared against the last two libraries (group 2). The results are:

Loading required package: methods

Loading required package: limma

Calculating library sizes from column totals.

initial lib sizes:  125765 256153 462073 445484 773405

initial no. of tags:  379196

reduced lib sizes:  104768 160197 382981 373739 599384

reduced no. of tags:  25749

common dispersion analysis:  0.7386983

down regulated:  0   up regulated:  0

Using grid search to estimate tagwise dispersion.

tagwise dispersion analysis:

    Min.  1st Qu.   Median     Mean  3rd Qu.     Max.

0.001001 0.001001 0.503800 0.435100 0.745200 1.299000

down regulated:  0   up regulated:  0

Apparently, all FDR values are 1.

Can you please advise how to proceed  ?

Regards

Idit Buch, Ph.D.

Senior bioinformatics researcher
-------------- next part --------------
nsamples<-5
FDR.cutoff <- 0.1
grpInput<-c(1,1,1,2,2)
readsfile<- "inputMatrix"

# The 1st row is the sequence
reads<-read.table(readsfile,header=FALSE,row.names=1,sep="")

library(edgeR)
group<-factor((grpInput),exclude=NULL)
ngroups<-length(levels(group))
prior.n<- 25/(nsamples-ngroups)

data<-DGEList(counts=reads,group=group)
cat("initial lib sizes: ",data$samples$lib.size,"\n")
cat("initial no. of tags: ",dim(data)[1],"\n")
#
# filter out very lowly expressed tags, and those which are
# expressed in more than half the number of samples
#
cpm.data <- cpm(data)
data <- data[(rowSums(cpm.data > 1)) >= (nsamples/ngroups),]
data <- data[(rowSums(data$count > 0)) >= (nsamples/ngroups),]
data$samples$lib.size <- colSums(data$counts)
cat("reduced lib sizes: ",data$samples$lib.size,"\n")
cat("reduced no. of tags: ",dim(data)[1],"\n")

data<-calcNormFactors(data)
data<-estimateCommonDisp(data)
de.com<-exactTest(data)
sortedDE.com<-topTags(de.com,n=NROW(de.com))

# SUMMARY
cat ("common dispersion analysis: ", data$common.dispersion,"\n")
downReg <- sum(sortedDE.com$table$logFC < 0 & sortedDE.com$table$FDR <= FDR.cutoff)
upReg <- sum(sortedDE.com$table$logFC > 0 & sortedDE.com$table$FDR <= FDR.cutoff)
cat ("down regulated: ",downReg, "  up regulated: ",upReg, "\n")

#DE tagwise
data<-estimateTagwiseDisp(data, prior.n=prior.n, prop.used=0.5, grid.length=500, verbose=TRUE)
de.tgw<-exactTest(data)
sortedDE.tgw<-topTags(de.tgw,n=NROW(de.tgw))

# SUMMARY
cat ("\ntagwise dispersion analysis:\n")
summary(data$tagwise.dispersion)
tgw.dispersion.summary = summary(data$tagwise.dispersion)
downReg <- sum(sortedDE.tgw$table$logFC < 0 & sortedDE.tgw$table$FDR <= FDR.cutoff)
upReg <- sum(sortedDE.tgw$table$logFC > 0 & sortedDE.tgw$table$FDR <= FDR.cutoff)
cat ("down regulated: ",downReg, "  up regulated: ",upReg, "\n")