[Bioc-sig-seq] Reads in 3'utr
Valerie Obenchain
vobencha at fhcrc.org
Thu Sep 29 04:56:34 CEST 2011
DESeq and edgeR vignettes.
Valerie
On 09/28/11 19:46, rohan bareja wrote:
> Hi,
>
> I have now summed the counts to the gene level for the 3'UTRs.
>
> I want to assess the relative usage of each 3'UTR end, i.e. what
> percentage of reads comes from each 3'UTR isoform.
> I also want to identify the different 3'UTR ends for each gene, to
> compare alternative 3'UTR usage (disease vs. control).
>
> Do you have any idea about how to proceed?
>
> Thanks,
> Rohan
>
>
>
> --- On Sat, 24/9/11, Valerie Obenchain <vobencha at fhcrc.org> wrote:
>
>
> From: Valerie Obenchain <vobencha at fhcrc.org>
> Subject: Re: [Bioc-sig-seq] Reads in 3'utr
> To: "rohan bareja" <rohan_1925 at yahoo.co.in>
> Cc: bioc-sig-sequencing at r-project.org
> Date: Saturday, 24 September, 2011, 4:24 AM
>
> On 09/23/2011 02:57 PM, rohan bareja wrote:
>> Hi,
>>
>> utr=threeUTRsByTranscript(txdb,use.names=FALSE)
>> So utr is a GRangesList of length 33381.
>> Then, as you said, I did the following:
>>
>> txBygene <- transcriptsBy(txdb, "gene")
>> geneID <- rep(names(txBygene), elementLengths(txBygene))
>> df <- data.frame(geneID=geneID,
>> txID=values(unlist(txBygene))[["tx_id"]])
>>
>> This gives me a dataframe with 40,780 rows with gene ID and txID
>> from txBygene object.
>>       geneID  txID
>> 40775   9994 11731
>> 40776   9994 11730
>> 40777   9997 38491
>> 40778   9997 38489
>> 40779   9997 38496
>> 40780   9997 38497
>>
>> Since my utr object has length 33,381, my counts vector has the
>> same length, i.e. 33,381.
>> So I am not able to map the counts to the above data frame, which
>> has the transcript and gene IDs.
>>
>
> Yes, these lengths are different.
>
> In this example we have utr regions from 58 transcripts.
>
> > length(utr)
> [1] 58
>
>
> Those 58 transcripts can be matched to their gene IDs by looking
> at the txBygene object. All of the transcripts fall into one (or
> more) of 51 genes,
>
> > length(txBygene)
> [1] 51
>
> There are multiple transcripts per gene, so we expand the gene IDs
> to map to the transcripts.
>
> > dim(df)
> [1] 79 2
>
> This data.frame has all transcripts from the txdb mapped to the
> gene IDs. Your utr data may contain only a subset of these
> transcripts. That is something you need to check. Match the
> transcript IDs of your utr object against the txID column of df
> and pull out the corresponding gene IDs. You then have the gene
> IDs for your utr regions and can split or group your counts by gene.
>
> Valerie
>>
>>
>>
>> --- On Fri, 23/9/11, Valerie Obenchain <vobencha at fhcrc.org> wrote:
>>
>>
>> From: Valerie Obenchain <vobencha at fhcrc.org>
>> Subject: Re: [Bioc-sig-seq] Reads in 3'utr
>> To: "rohan bareja" <rohan_1925 at yahoo.co.in>
>> Cc: bioc-sig-sequencing at r-project.org
>> Date: Friday, 23 September, 2011, 10:50 PM
>>
>> Hi Rohan,
>>
>> You can relate the counts for 3'UTR regions to gene IDs
>> through the transcript IDs.
>>
>> txdb_file <- system.file("extdata",
>> "UCSC_knownGene_sample.sqlite", package="GenomicFeatures")
>> txdb <- loadFeatures(txdb_file)
>> utr=threeUTRsByTranscript(txdb,use.names=FALSE)
>>
>>
>> The transcript names can be matched to the gene IDs through:
>>
>> txBygene <- transcriptsBy(txdb, "gene")
>> geneID <- rep(names(txBygene), elementLengths(txBygene))
>> df <- data.frame(geneID=geneID,
>> txID=values(unlist(txBygene))[["tx_id"]])
>>
>> Now you know what gene ID each tx count belongs to. You can
>> split your counts by gene ID ...
>>
>>
>> Valerie
>>
>>
>>
>> On 09/20/2011 12:13 PM, rohan bareja wrote:
>>> Hi everyone,
>>> I am doing NGS analysis using BAM files. I have counted reads in the 3'UTR regions using
>>> utr=threeUTRsByTranscript(txdb,use.names=FALSE)
>>> countsUTR<- countOverlaps(utr,reads)
>>> This gives me transcript-level counts. How can I get gene-level counts? It might sound silly, but does anybody have an idea of what types of analyses we can do with countsUTR?
>>> Thanks,
>>> Rohan
>>
>
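The mapping step described in the quoted replies (match transcript IDs against df, pull out the gene IDs, then group the counts by gene) can be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the toy values are made up, and it assumes countsUTR carries the transcript IDs as names (for example copied over with names(countsUTR) <- names(utr)), which the thread does not confirm. The per-transcript share at the end is one simple way to express "what percentage of reads comes from each 3'UTR isoform"; it is not a replacement for the DESeq or edgeR workflows mentioned above.

```r
# Toy stand-in for df from the thread (hypothetical values):
# one row per transcript in the txdb, with its gene ID.
df <- data.frame(geneID = c("g1", "g1", "g2", "g2", "g3"),
                 txID   = c(101L, 102L, 201L, 202L, 301L),
                 stringsAsFactors = FALSE)

# Toy stand-in for countsUTR: one count per transcript that has a 3'UTR.
# Assumption: the names are transcript IDs. Transcript 202 is absent on
# purpose, since the utr object may cover only a subset of transcripts.
countsUTR <- c("101" = 30, "102" = 10, "201" = 5, "301" = 8)

# Look up the gene ID for each counted transcript.
idx <- match(names(countsUTR), as.character(df$txID))
stopifnot(!anyNA(idx))          # every counted transcript must be in df
geneOfCount <- df$geneID[idx]

# Gene-level counts: sum the transcript counts within each gene.
geneCounts <- vapply(split(countsUTR, geneOfCount), sum, numeric(1))
# g1 = 40, g2 = 5, g3 = 8

# Share of each gene's 3'UTR reads contributed by each transcript.
utrShare <- countsUTR / geneCounts[geneOfCount]
# 101 = 0.75, 102 = 0.25, 201 = 1, 301 = 1
```

A gene whose transcripts all have zero counts would give NaN shares here, so such genes would need to be filtered or handled separately before comparing conditions.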