[Bioc-sig-seq] extract non-zero rows

Mon Aug 29 15:42:12 CEST 2011

Dear all,

Thank you for your prompt reply.

I wonder whether it is better to filter the data before running the exact test and analyse only those few rows, or to run the exact test on all the data and filter afterwards.

In my case the difference is large: 19000 rows without filtering, and only 300 after keeping the rows with no zeros.

The aim of the filter is to see which genes are mapped in all samples. 

Thanks in advance

Estefania

----- Original message -----
From: bioc-sig-sequencing-request at r-project.org
To: bioc-sig-sequencing at r-project.org
Sent: Sunday, 28 August 2011 7:00:04
Subject: Bioc-sig-sequencing Digest, Vol 42, Issue 11

Send Bioc-sig-sequencing mailing list submissions to
	bioc-sig-sequencing at r-project.org

To subscribe or unsubscribe via the World Wide Web, visit
	https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
or, via email, send a message with subject or body 'help' to
	bioc-sig-sequencing-request at r-project.org

You can reach the person managing the list at
	bioc-sig-sequencing-owner at r-project.org

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Bioc-sig-sequencing digest..."

Today's Topics:

   1. Re: extract non-zero rows (Dario Strbenac)

----------------------------------------------------------------------

Message: 1
Date: Sun, 28 Aug 2011 17:00:22 +1000 (EST)
From: Dario Strbenac <D.Strbenac at garvan.org.au>
To: "Davis McCarthy" <davis.mccarthy at balliol.ox.ac.uk>
Cc: Estefania Mancini <estefania.mancini at indear.com>,
	bioc-sig-sequencing at r-project.org
Subject: Re: [Bioc-sig-seq] extract non-zero rows
Message-ID: <20110828170022.BMR49151 at gimr.garvan.unsw.edu.au>
Content-Type: text/plain; charset=iso-8859-1

Ah, yes, that method is better. I forgot to use it in my example.

- Dario.

---- Original message ----
>Date: Sat, 27 Aug 2011 13:53:16 +1000
>From: davismcc at googlemail.com (on behalf of Davis McCarthy <davis.mccarthy at balliol.ox.ac.uk>)
>Subject: Re: [Bioc-sig-seq] extract non-zero rows  
>To: D.Strbenac at garvan.org.au
>Cc: Estefania Mancini <estefania.mancini at indear.com>, bioc-sig-sequencing at r-project.org
>
>Estefania and Dario
>
>A more efficient way to do this:
>> row.positive.counts <- apply(dup.data$counts, 1, function(a.row) sum(a.row > 0))
>
>would be this:
>row.positive.counts <- rowSums( dup.data$counts > 0 )
>
>You might prefer to use the functions rowSums(), rowMeans(),
>colSums(), colMeans() instead of apply(), where you can. They are much
>faster.
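
A minimal sketch of the equivalence described above. The toy matrix `counts` is an assumption standing in for `dup.data$counts` (its values are made up), and the names `via.apply` and `via.rowSums` are illustrative only.

```r
# Toy count matrix standing in for dup.data$counts (values are made up).
counts <- matrix(c(5, 0, 3,
                   2, 1, 4,
                   0, 0, 7,
                   9, 6, 1),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("tag", 1:4), c("s1", "s2", "s3")))

# Row-by-row version: count the positive entries in each row.
via.apply <- apply(counts, 1, function(a.row) sum(a.row > 0))

# Vectorised version: compare the whole matrix once, then sum each row.
via.rowSums <- rowSums(counts > 0)

# The two results hold the same values (apply() gives integers here and
# rowSums() gives doubles, so compare with == rather than identical()).
all(via.apply == via.rowSums)
```
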
>
>Best wishes
>Davis
>
>
>
>On 26 August 2011 10:00, Dario Strbenac <D.Strbenac at garvan.org.au> wrote:
>> Hi Estefania,
>>
>> If you want both columns to be non-zero, you should do
>>
>
>> filtered <- dup.data[row.positive.counts == ncol(dup.data$counts), ]
>>
>> It makes a boolean vector for each row, then sums it, because TRUE is the same as 1, so the sum gives you how many columns are greater than zero. Then, the rows that have as many positive numbers as there are columns in the data frame are kept.
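
The filtering step can be sketched on a plain matrix as follows. The matrix `counts` is a made-up stand-in for `dup.data$counts`, and `keep` and `keep.total` are illustrative names. The sketch also shows why a filter on the total count, such as the `rowSums(dup.data$counts) >= 1` attempt in the original question, might leave the data unchanged: a row can have a positive total while still containing zeros.

```r
# Toy count matrix standing in for dup.data$counts (values are made up).
counts <- matrix(c(5, 0, 3,
                   2, 1, 4,
                   0, 0, 7,
                   9, 6, 1),
                 nrow = 4, byrow = TRUE,
                 dimnames = list(paste0("tag", 1:4), c("s1", "s2", "s3")))

# counts > 0 is a logical matrix; TRUE counts as 1 when summed,
# so this is the number of samples with a positive count per row.
row.positive.counts <- rowSums(counts > 0)

# Keep only the rows that are positive in every column.
keep <- row.positive.counts == ncol(counts)
filtered <- counts[keep, , drop = FALSE]
rownames(filtered)  # "tag2" "tag4"

# A filter on the row total keeps every row here, zeros included.
keep.total <- rowSums(counts) >= 1
all(keep.total)     # TRUE
```

With a real DGEList, the same logical vector `keep` would be used as the row index, as in the `dup.data[..., ]` line above.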
>>
>> To find unchanged genes, you might do
>>
>> unchanged <- dup.de.com$table[dup.de.com$table[, "logFC"] > -0.2 & dup.de.com$table[, "logFC"] < 0.2, ]
>>
>> replacing 0.2 with what you think is the biggest fold change that unchanged genes might have.
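
As a small worked example, the same selection can be written with `abs()`. The data frame `de.table` below is an assumption: it holds only the five rows of `dup.de.com$table` that are printed in the original question, and `max.abs.logFC` is an illustrative name for the 0.2 cutoff.

```r
# The five rows of dup.de.com$table shown in the original question.
de.table <- data.frame(
  logConc = c(-2.588833, -2.875548, -3.501041, -3.224648, -3.743064),
  logFC   = c(0.26176050, 0.03020441, -0.12108619, 0.03036675, 0.14416487),
  p.value = c(0.7348221, 0.9688072, 0.8754371, 0.9691009, 0.8521188),
  row.names = c("Glyma13g11940.8", "Glyma13g11900.1", "Glyma09g24780.1",
                "Glyma13g12050.1", "Glyma13g12070.1")
)

# Largest absolute logFC still treated as "unchanged" (adjust to taste).
max.abs.logFC <- 0.2

# abs(x) < t is equivalent to x > -t & x < t.
unchanged <- de.table[abs(de.table$logFC) < max.abs.logFC, ]
rownames(unchanged)
```

On these five rows only Glyma13g11940.8 (logFC 0.26) falls outside the cutoff.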
>>
>> ---- Original message ----
>>>Date: Thu, 25 Aug 2011 11:39:03 -0300 (ART)
>>>From: bioc-sig-sequencing-bounces at r-project.org (on behalf of Estefania Mancini <estefania.mancini at indear.com>)
>>>Subject: [Bioc-sig-seq] extract non-zero rows
>>>To: bioc-sig-sequencing at r-project.org
>>>
>>>Dear all,
>>>I have loaded and analyzed four 454 datasets, corresponding to control and stress samples with their biological replicates.
>>>I would like to know whether it is possible to filter, in my DGEList object:
>>>
>>>- which tags do not have a zero in any column,
>>>- which of these tags could be considered "housekeeping" (at least with logFC near 0)
>>>
>>>The DGEList object looks like this:
>>>
>>>>dup.data
>>>An object of class "DGEList"
>>>$samples
>>>             group lib.size norm.factors
>>>A8_control control    77953            1
>>>A8_stress   stress   176860            1
>>>mq_control control    98109            1
>>>mq_stress   stress   145839            1
>>>pi_control control   132479            1
>>>pi_stress   stress   142484            1
>>>tj_control control    65827            1
>>>tj_stress   stress   144278            1
>>>
>>>I have tried to filter using the suggested function:
>>>>dup.de.filter <- dup.data[rowSums(dup.data$counts) >= 0, ]
>>>or with
>>>>dup.de.filter <- dup.data[rowSums(dup.data$counts) >= 1, ]
>>>but nothing changes at all. I have many rows with 0 or 1 reads in some column, which should be excluded.
>>>
>>>Also:
>>>dup.de.com
>>>An object of class "DGEExact"
>>>$table
>>>                  logConc       logFC   p.value
>>>Glyma13g11940.8 -2.588833  0.26176050 0.7348221
>>>Glyma13g11900.1 -2.875548  0.03020441 0.9688072
>>>Glyma09g24780.1 -3.501041 -0.12108619 0.8754371
>>>Glyma13g12050.1 -3.224648  0.03036675 0.9691009
>>>Glyma13g12070.1 -3.743064  0.14416487 0.8521188
>>>19860 more rows ...
>>>
>>>$comparison
>>>[1] "control" "stress"
>>>$genes
>>>NULL
>>>
>>>Thanks in advance,
>>>Estefania
>>>
>>>_______________________________________________
>>>Bioc-sig-sequencing mailing list
>>>Bioc-sig-sequencing at r-project.org
>>>https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>>
>> --------------------------------------
>> Dario Strbenac
>> Research Assistant
>> Cancer Epigenetics
>> Garvan Institute of Medical Research
>> Darlinghurst NSW 2010
>> Australia
>>

--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia

------------------------------

End of Bioc-sig-sequencing Digest, Vol 42, Issue 11