[BioC] BCV increases with an increasing counts per million in RNAseq (edgeR)
gowtham
ragowthaman at gmail.com
Sat Jun 16 00:06:16 CEST 2012
Hi All, I am resending this email with fewer figures (in .gif format) as my
previous message was held for moderation (due to its size), and I don't seem
to be able to cancel that message (the link was broken). Sorry about this.
#---#
Hi Everyone,
I am analysing my current RNA-seq data set (two groups, each with two
replicates) using classic edgeR. I see a couple of strange results that I am
trying to make sense of. I would really appreciate any help from the list.
1) After filtering out tags with low read counts (requiring a minimum of
1 cpm in each of the 4 samples: dge[rowSums(cpm.dge > 1) >= 4, ]) and
normalizing (calcNormFactors), I create the BCV plot (attached:
norm_filt_bcv.png). I see the CV going up along with CPM. But when I don't
filter and don't normalize, I see a traditional BCV plot (attached:
nonorm_nofilt_bcv.png). Any idea why this is the case?
This is especially puzzling because the normalization factors are close to 1
(0.9747020, 0.9756064, 0.9769463, 1.0764226), and requiring a minimum of
1 CPM in all samples removed only 800 of the 8000 genes.
2) Most of the genes seem to have a dispersion lower than the common
dispersion. Aren't they supposed to be distributed on either side of it
(which is the case with the unfiltered, non-normalized data)?
3) Similarly, I see different MDS plots for the filtered (and normalized)
and unfiltered (non-normalized) datasets (attached). I am wondering what is
going on.
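For readers following the filtering step in point 1, here is a minimal base-R sketch of the same rule (keep a tag only if it exceeds 1 CPM in all 4 samples). The count matrix is invented for illustration and is not the data from this post. CPM is computed by hand from the column totals rather than with edgeR. Recomputing the library sizes after filtering is shown only as an assumption about common practice; the post does not say whether this was done or whether it affects the plots.

```r
# Toy count matrix: 5 tags x 4 samples (invented values, illustration only).
counts <- matrix(
  c(  0,   1,   0,   2,
    150, 180, 160, 210,
     30,  25,  40,  35,
      2,   0,   1,   1,
    900, 850, 950, 990),
  nrow = 5, byrow = TRUE,
  dimnames = list(paste0("tag", 1:5), paste0("s", 1:4))
)

# Library size = total count per sample (column sums).
lib.size <- colSums(counts)

# Counts per million: divide each column by its library size, scale by 1e6.
cpm.mat <- t(t(counts) / lib.size) * 1e6

# Keep tags with more than 1 CPM in all 4 samples, as in the post.
keep <- rowSums(cpm.mat > 1) >= 4
filtered <- counts[keep, , drop = FALSE]

# Assumption: library sizes recomputed from the retained tags only.
lib.size.filtered <- colSums(filtered)

print(keep)
print(rbind(before = lib.size, after = lib.size.filtered))
```

In this toy example any tag with a zero count in some sample is dropped, because a zero count can never exceed 1 CPM.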
Any suggestions/comments will be very helpful.
Thanks a lot in advance,
Gowthaman
PS: The calculated common dispersion is rather high: Disp = 0.14757,
BCV = 0.3841
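The two numbers in the PS are related: in edgeR the biological coefficient of variation is the square root of the negative binomial dispersion. A quick check of that relation, using only the values quoted above (the variable names are made up for this sketch):

```r
# Common dispersion as reported in the post.
common.dispersion <- 0.14757

# BCV is the square root of the dispersion.
bcv <- sqrt(common.dispersion)

print(round(bcv, 4))
```

A BCV of this size corresponds to roughly 38% variation in true expression between replicates.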
--
Gowthaman
Bioinformatics Systems Programmer.
SBRI, 307 West lake Ave N Suite 500
Seattle, WA. 98109-5219
Phone : LAB 206-256-7188 (direct).