[Bioc-sig-seq] extract non-zero rows
Dario Strbenac
D.Strbenac at garvan.org.au
Fri Aug 26 02:00:15 CEST 2011
Hi Estefania,
If you want every column to be non-zero, you could do
row.positive.counts <- apply(dup.data$counts, 1, function(a.row) sum(a.row > 0))
filtered <- dup.data[row.positive.counts == ncol(dup.data$counts), ]
The first line makes a logical vector for each row and sums it. Because TRUE counts as 1, the sum is the number of columns in that row with a count greater than zero. The second line then keeps only the rows where that number equals the number of columns in the count matrix, i.e. the rows with no zero in any sample.
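As a small illustration of the same idea, here is a sketch on a made-up count matrix (the values and the names tagA, tagB, tagC, s1 to s4 are invented; it stands in for dup.data$counts). It uses rowSums() on the logical matrix, which should give the same per-row counts as the apply() call above. It also shows why a filter on rowSums(counts) >= 1 changes nothing when every tag has at least one read in total.

```r
# Toy count matrix standing in for dup.data$counts (values are made up)
counts <- matrix(c(5, 0, 3, 2,
                   1, 4, 0, 7,
                   2, 6, 1, 9),
                 nrow = 3, byrow = TRUE,
                 dimnames = list(c("tagA", "tagB", "tagC"),
                                 c("s1", "s2", "s3", "s4")))

# Filtering on the row total keeps every tag here, because each row
# has at least one read somewhere
total.filter <- counts[rowSums(counts) >= 1, , drop = FALSE]

# Number of samples with a positive count, per tag
row.positive.counts <- rowSums(counts > 0)

# Keep only tags that are positive in every sample
keep <- row.positive.counts == ncol(counts)
filtered.counts <- counts[keep, , drop = FALSE]
```

With a DGEList, the same logical vector can be used as dup.data[keep, ]. Changing counts > 0 to counts > 1 would also drop tags that have only a single read in some sample.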
To find unchanged genes, you might do
unchanged <- dup.de.com$table[dup.de.com$table[, "logFC"] > -0.2 & dup.de.com$table[, "logFC"] < 0.2, ]
replacing 0.2 with the largest absolute log fold change that you would still consider unchanged.
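A minimal sketch of that filter, using the five rows of dup.de.com$table shown in your message as a stand-in data frame. The names tab and max.abs.logFC are only for illustration, and abs() is used as a shorter way of writing the two-sided comparison.

```r
# The five rows printed in the original message, as a stand-in for
# dup.de.com$table
tab <- data.frame(
  logConc = c(-2.588833, -2.875548, -3.501041, -3.224648, -3.743064),
  logFC   = c(0.26176050, 0.03020441, -0.12108619, 0.03036675, 0.14416487),
  p.value = c(0.7348221, 0.9688072, 0.8754371, 0.9691009, 0.8521188),
  row.names = c("Glyma13g11940.8", "Glyma13g11900.1", "Glyma09g24780.1",
                "Glyma13g12050.1", "Glyma13g12070.1")
)

# Hypothetical cutoff; choose your own value
max.abs.logFC <- 0.2

# Same rows as logFC > -0.2 & logFC < 0.2
unchanged <- tab[abs(tab$logFC) < max.abs.logFC, ]
```

On these five rows, only Glyma13g11940.8 (logFC about 0.26) falls outside the cutoff and is dropped.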
---- Original message ----
>Date: Thu, 25 Aug 2011 11:39:03 -0300 (ART)
>From: bioc-sig-sequencing-bounces at r-project.org (on behalf of Estefania Mancini <estefania.mancini at indear.com>)
>Subject: [Bioc-sig-seq] extract non-zero rows
>To: bioc-sig-sequencing at r-project.org
>
>Dear all
>I have loaded and analyzed properly 4 454 dataset, corresponding to control and stress samples with their biological replicates.
>I would like to know if is possible to filter, in my DGEList object
>
>-which tags dont have zero in any column,
>-which of these tags could be consider "housekeeping" (at least with logFC near 0)
>
>The object DGEList looks like this:
>
>>dup.data
>An object of class "DGEList"
>$samples
> group lib.size norm.factors
>A8_control control 77953 1
>A8_stress stress 176860 1
>mq_control control 98109 1
>mq_stress stress 145839 1
>pi_control control 132479 1
>pi_stress stress 142484 1
>tj_control control 65827 1
>tj_stress stress 144278 1
>
>I have tried to filter using the suggested function:
>>dup.de.filter <- dup.data[rowSums(dup.data$counts) >= 0, ]
>or with
>>dup.de.filter <- dup.data[rowSums(dup.data$counts) >= 1, ]
>but have no changes at all. I have many rows which 0 and 1 read in some column which should be excluded.
>
>Also:
>dup.de.com
>An object of class "DGEExact"
>$table
> logConc logFC p.value
>Glyma13g11940.8 -2.588833 0.26176050 0.7348221
>Glyma13g11900.1 -2.875548 0.03020441 0.9688072
>Glyma09g24780.1 -3.501041 -0.12108619 0.8754371
>Glyma13g12050.1 -3.224648 0.03036675 0.9691009
>Glyma13g12070.1 -3.743064 0.14416487 0.8521188
>19860 more rows ...
>
>$comparison
>[1] "control" "stress"
>$genes
>NULL
>
>Thanks in advance,
>Estefania
>
>_______________________________________________
>Bioc-sig-sequencing mailing list
>Bioc-sig-sequencing at r-project.org
>https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia