[Bioc-sig-seq] Rsamtools countBam labeling
Matthew Young
myoung at wehi.EDU.AU
Tue Apr 20 02:30:48 CEST 2010
Hi,
I'm trying to use Rsamtools to bin reads into genes using the countBam
function. I have a GRanges object which defines the range of the
annotation and has additional information in the "elementMetadata"
field, for example like this:
> aa
GRanges with 3 ranges and 2 elementMetadata values
seqnames ranges strand | tx_id tx_name
<Rle> <IRanges> <Rle> | <integer> <character>
[1] chr16 [18780546, 18811724] - | 43866 ENSMUST00000115586
[2] chr16 [18780546, 18812065] - | 43865 ENSMUST00000000028
[3] chr16 [18807449, 18812080] - | 43868 ENSMUST00000115585
I want to use the countBam function to count the number of reads that
occur within each annotation range, which I do with the following:
> countBam("bowtie_aligned.prefix.bam",param=ScanBamParam(which=aa))
space start end width file records
nucleotides
1 chr16 18780546 18811724 31179 bowtie_aligned.prefix.bam
16 800
2 chr16 18780546 18812065 31520 bowtie_aligned.prefix.bam
17 850
3 chr16 18807449 18812080 4632 bowtie_aligned.prefix.bam
2 100
My question is, is there a way to get the "elementMetadata" to come
along for the ride in the countBam output? So there'd be another two
columns in the countBam output, "tx_id" and "tx_name". The problem is
that the ranges do not always appear in the same order in the GRanges
input and the countBam output, so to recover this information it is
necessary to use match and pull information out of the GRanges object,
which while doable, is cumbersome. Is there some built in way to do
this that I'm overlooking?
Thanks,
Matt
______________________________________________________________________
The information in this email is confidential and intend...{{dropped:4}}
More information about the Bioc-sig-sequencing
mailing list