[Bioc-sig-seq] extract non-zero rows
Dario Strbenac
D.Strbenac at garvan.org.au
Sun Aug 28 09:00:22 CEST 2011
Ah, yes, that method is better. I forgot to use it in my example.
- Dario.
---- Original message ----
>Date: Sat, 27 Aug 2011 13:53:16 +1000
>From: davismcc at googlemail.com (on behalf of Davis McCarthy <davis.mccarthy at balliol.ox.ac.uk>)
>Subject: Re: [Bioc-sig-seq] extract non-zero rows
>To: D.Strbenac at garvan.org.au
>Cc: Estefania Mancini <estefania.mancini at indear.com>, bioc-sig-sequencing at r-project.org
>
>Estefania and Dario
>
>A more efficient way to do this:
>> row.positive.counts <- apply(dup.data$counts, 1, function(a.row) sum(a.row > 0))
>
>would be this:
>row.positive.counts <- rowSums( dup.data$counts > 0 )
>
>You might prefer to use the functions rowSums(), rowMeans(),
>colSums(), colMeans() instead of apply(), where you can. They are much
>faster.
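As a quick check of the point above, here is a minimal sketch showing that the two approaches give the same answer. The `counts` matrix and its values are invented for illustration and only stand in for dup.data$counts.

```r
# Toy count matrix standing in for dup.data$counts (values are invented).
counts <- matrix(c(0, 3, 5,
                   2, 0, 0,
                   1, 4, 7,
                   0, 0, 0),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("tag", 1:4), c("s1", "s2", "s3")))

# Row-by-row version using apply().
via.apply <- apply(counts, 1, function(a.row) sum(a.row > 0))

# Vectorised version: rowSums() on the logical matrix counts > 0.
via.rowSums <- rowSums(counts > 0)

# Both give the number of non-zero columns for each row.
stopifnot(all(via.apply == via.rowSums))
print(via.rowSums)
```

The comparison `counts > 0` is evaluated once for the whole matrix, so no R function has to be called for each row.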
>
>Best wishes
>Davis
>
>
>
>On 26 August 2011 10:00, Dario Strbenac <D.Strbenac at garvan.org.au> wrote:
>> Hi Estefania,
>>
>> If you want all columns to be non-zero, you should do
>>
>> filtered <- dup.data[row.positive.counts == ncol(dup.data$counts), ]
>>
>> It makes a logical vector for each row and sums it; because TRUE counts as 1, the sum is the number of columns greater than zero. Rows with as many positive counts as there are columns in the count matrix are then kept.
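The filtering step described above can be sketched on a plain matrix, so that it runs without edgeR installed. The `counts` values are invented, and `min.count` is a hypothetical threshold added for the case where rows with only 0 or 1 reads in some column should also be dropped.

```r
# Toy count matrix standing in for dup.data$counts (values are invented).
counts <- matrix(c(5, 3, 8,
                   2, 0, 4,
                   1, 6, 7,
                   9, 2, 3),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("tag", 1:4), c("s1", "s2", "s3")))

# TRUE counts as 1, so this is the number of non-zero columns per row.
row.positive.counts <- rowSums(counts > 0)

# Keep only rows in which every column is non-zero.
keep.nonzero <- row.positive.counts == ncol(counts)
filtered <- counts[keep.nonzero, , drop = FALSE]

# Hypothetical stricter variant: more than min.count reads in every column.
min.count <- 1
keep.strict <- rowSums(counts > min.count) == ncol(counts)
filtered.strict <- counts[keep.strict, , drop = FALSE]

print(rownames(filtered))
print(rownames(filtered.strict))
```

With a DGEList, the same logical vector would be used to index the object itself, as in the dup.data[..., ] line above.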
>>
>> To find unchanged genes, you might do
>>
>> unchanged <- dup.de.com$table[dup.de.com$table[, "logFC"] > -0.2 & dup.de.com$table[, "logFC"] < 0.2, ]
>>
>> replacing 0.2 with the largest absolute log fold change you think an unchanged gene might have.
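The same selection can be written a little more compactly with abs(). This is only a sketch: `de.table` is an invented data frame standing in for dup.de.com$table, and `max.abs.logFC` is a hypothetical name for the cutoff.

```r
# Toy results table standing in for dup.de.com$table (values are invented).
de.table <- data.frame(logConc = c(-2.6, -2.9, -3.5, -3.2),
                       logFC   = c(0.26, 0.03, -0.12, -0.45),
                       p.value = c(0.73, 0.97, 0.88, 0.10),
                       row.names = paste0("tag", 1:4))

# Hypothetical cutoff: largest |logFC| still regarded as unchanged.
max.abs.logFC <- 0.2

# abs() folds the two comparisons (> -0.2 and < 0.2) into one.
unchanged <- de.table[abs(de.table$logFC) < max.abs.logFC, ]

print(rownames(unchanged))
```

Keeping the cutoff in a single variable means it only has to be changed in one place when trying different values.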
>>
>> ---- Original message ----
>>>Date: Thu, 25 Aug 2011 11:39:03 -0300 (ART)
>>>From: bioc-sig-sequencing-bounces at r-project.org (on behalf of Estefania Mancini <estefania.mancini at indear.com>)
>>>Subject: [Bioc-sig-seq] extract non-zero rows
>>>To: bioc-sig-sequencing at r-project.org
>>>
>>>Dear all
>>>I have loaded and properly analyzed four 454 datasets, corresponding to control and stress samples with their biological replicates.
>>>I would like to know if it is possible to filter, in my DGEList object,
>>>
>>>-which tags don't have a zero in any column,
>>>-which of these tags could be considered "housekeeping" (at least with logFC near 0)
>>>
>>>The object DGEList looks like this:
>>>
>>>>dup.data
>>>An object of class "DGEList"
>>>$samples
>>>             group lib.size norm.factors
>>>A8_control control    77953            1
>>>A8_stress   stress   176860            1
>>>mq_control control    98109            1
>>>mq_stress   stress   145839            1
>>>pi_control control   132479            1
>>>pi_stress   stress   142484            1
>>>tj_control control    65827            1
>>>tj_stress   stress   144278            1
>>>
>>>I have tried to filter using the suggested function:
>>>>dup.de.filter <- dup.data[rowSums(dup.data$counts) >= 0, ]
>>>or with
>>>>dup.de.filter <- dup.data[rowSums(dup.data$counts) >= 1, ]
>>>but nothing changes at all. I have many rows with 0 or 1 reads in some column, which should be excluded.
>>>
>>>Also:
>>>dup.de.com
>>>An object of class "DGEExact"
>>>$table
>>>                  logConc       logFC   p.value
>>>Glyma13g11940.8 -2.588833  0.26176050 0.7348221
>>>Glyma13g11900.1 -2.875548  0.03020441 0.9688072
>>>Glyma09g24780.1 -3.501041 -0.12108619 0.8754371
>>>Glyma13g12050.1 -3.224648  0.03036675 0.9691009
>>>Glyma13g12070.1 -3.743064  0.14416487 0.8521188
>>>19860 more rows ...
>>>
>>>$comparison
>>>[1] "control" "stress"
>>>$genes
>>>NULL
>>>
>>>Thanks in advance,
>>>Estefania
>>>
>>>_______________________________________________
>>>Bioc-sig-sequencing mailing list
>>>Bioc-sig-sequencing at r-project.org
>>>https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>>
>> --------------------------------------
>> Dario Strbenac
>> Research Assistant
>> Cancer Epigenetics
>> Garvan Institute of Medical Research
>> Darlinghurst NSW 2010
>> Australia
>>
--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia