[BioC] edgeR - no sequences are differentially expressed
Idit Buch
iditbuch at gmail.com
Thu Oct 4 13:01:33 CEST 2012
Dear All,
When using edgeR for differential expression of RNA-Sequences, I do not get
any differentially expressed sequences (FDR <= 0.1).
The R code is attached. Note that we have 5 libraries and 2 groups such
that the first three libraries (group 1)
are compared against the last two libraries (group 2). The results are:
Loading required package: methods
Loading required package: limma
Calculating library sizes from column totals.
initial lib sizes: 125765 256153 462073 445484 773405
initial no. of tags: 379196
reduced lib sizes: 104768 160197 382981 373739 599384
reduced no. of tags: 25749
common dispersion analysis: 0.7386983
down regulated: 0 up regulated: 0
Using grid search to estimate tagwise dispersion.
tagwise dispersion analysis:
Min. 1st Qu. Median Mean 3rd Qu. Max.
0.001001 0.001001 0.503800 0.435100 0.745200 1.299000
down regulated: 0 up regulated: 0
Apparently, all FDR values are 1.
Can you please advise how to proceed ?
Regards
Idit Buch, Ph.D.
Senior bioinformatics researcher
-------------- next part --------------
nsamples<-5
FDR.cutoff <- 0.1
grpInput<-c(1,1,1,2,2)
readsfile<- "inputMatrix"
# The 1st row is the sequence
reads<-read.table(readsfile,header=FALSE,row.names=1,sep="")
library(edgeR)
group<-factor((grpInput),exclude=NULL)
ngroups<-length(levels(group))
prior.n<- 25/(nsamples-ngroups)
data<-DGEList(counts=reads,group=group)
cat("initial lib sizes: ",data$samples$lib.size,"\n")
cat("initial no. of tags: ",dim(data)[1],"\n")
#
# filter out very lowly expressed tags, and those which are
# expressed in more than half the number of samples
#
cpm.data <- cpm(data)
data <- data[(rowSums(cpm.data > 1)) >= (nsamples/ngroups),]
data <- data[(rowSums(data$count > 0)) >= (nsamples/ngroups),]
data$samples$lib.size <- colSums(data$counts)
cat("reduced lib sizes: ",data$samples$lib.size,"\n")
cat("reduced no. of tags: ",dim(data)[1],"\n")
data<-calcNormFactors(data)
data<-estimateCommonDisp(data)
de.com<-exactTest(data)
sortedDE.com<-topTags(de.com,n=NROW(de.com))
# SUMMARY
cat ("common dispersion analysis: ", data$common.dispersion,"\n")
downReg <- sum(sortedDE.com$table$logFC < 0 & sortedDE.com$table$FDR <= FDR.cutoff)
upReg <- sum(sortedDE.com$table$logFC > 0 & sortedDE.com$table$FDR <= FDR.cutoff)
cat ("down regulated: ",downReg, " up regulated: ",upReg, "\n")
#DE tagwise
data<-estimateTagwiseDisp(data, prior.n=prior.n, prop.used=0.5, grid.length=500, verbose=TRUE)
de.tgw<-exactTest(data)
sortedDE.tgw<-topTags(de.tgw,n=NROW(de.tgw))
# SUMMARY
cat ("\ntagwise dispersion analysis:\n")
summary(data$tagwise.dispersion)
tgw.dispersion.summary = summary(data$tagwise.dispersion)
downReg <- sum(sortedDE.tgw$table$logFC < 0 & sortedDE.tgw$table$FDR <= FDR.cutoff)
upReg <- sum(sortedDE.tgw$table$logFC > 0 & sortedDE.tgw$table$FDR <= FDR.cutoff)
cat ("down regulated: ",downReg, " up regulated: ",upReg, "\n")
More information about the Bioconductor
mailing list