[Bioc-sig-seq] gdapply, coverage, and widths for all chromosomes
Martin Morgan
mtmorgan at fhcrc.org
Mon Oct 19 22:01:12 CEST 2009
Michael Lawrence wrote:
> On Mon, Oct 19, 2009 at 10:25 AM, Chris Seidel <seidel at phaget4.org> wrote:
>
>> In using gdapply to get coverage over all chromosomes of a
>> GenomeDataList object:
>>
>> cov <- gdapply(extendedReads, coverage)
A different route to this point (<...> represents arguments you'd
provide) is
library(ShortRead) # provides readAligned() and AlignedRead()
aln <- readAligned(<...>) # or aln <- AlignedRead(<...>)
library(BSgenome.Hsapiens.UCSC.hg18)
wd <- seqlengths(Hsapiens)
## subset aln and wd so that they have matching names; it might
## also be necessary to adjust names, e.g., dropping '.fa'
aln <- aln[chromosome(aln) %in% names(wd)]
wd <- wd[names(wd) %in% chromosome(aln)]
cvg <- coverage(aln, width=wd, extend=200L)
This also results in a list of Rles (a SimpleRleList). See
?"AlignedRead-class" for details; this is a non-trivial operation, e.g.,
it may be useful to manage strands separately.
Martin
>>
>> how do you supply the "width" argument to coverage? When I try handing
>> it a vector of chromosome lengths, it complains that it needs a single
>> value.
>>
>>
> This is the normal behavior of apply functions in R. The additional
> arguments are passed in whole to the function every time it is invoked. We
> don't have an "mapply" variant of gdapply, so you'll need to do some extra
> work, like keeping track of an index.
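To illustrate the point about extra arguments being passed whole, here is a
minimal base-R sketch of pairing each chromosome with its own width. It does
not use gdapply or the real coverage(); toy_coverage(), starts, and widths
are hypothetical stand-ins invented for the example, and the data are made up.

```r
## Hypothetical stand-ins (not real data): read start positions per
## chromosome, and a named vector of chromosome lengths.
starts <- list(chr1 = c(1L, 5L, 9L), chr2 = c(2L, 3L))
widths <- c(chr1 = 12L, chr2 = 6L)

## Toy function playing the role of coverage(): like the real one, it
## expects a single 'width' value per call.
toy_coverage <- function(start, width, readLength = 3L) {
    cvg <- integer(width)
    for (s in start) {
        idx <- seq(s, min(s + readLength - 1L, width))
        cvg[idx] <- cvg[idx] + 1L
    }
    cvg
}

## lapply(starts, toy_coverage, width = widths) would hand the whole
## 'widths' vector to every call. Map() (a wrapper around mapply) instead
## pairs the i-th chromosome with the i-th width; indexing both by the
## same names guards against mismatched ordering.
nms <- names(starts)
cvg <- Map(toy_coverage, starts[nms], widths[nms])

## The "keep track of an index" variant: iterate over names and look up
## both pieces inside the function.
cvg2 <- lapply(setNames(nms, nms), function(nm) {
    toy_coverage(starts[[nm]], widths[[nm]])
})
```

Both forms give one vector per chromosome, each as long as that chromosome's
width. Whether the same pattern carries over directly to a GenomeDataList is
an assumption I have not checked.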
>
> Perhaps we could make this easier using metadata? We know the genome of
> 'extendedReads' so 'coverage' could be smart and refer to the appropriate
> BSgenome package.
>
> Michael
>
>
>
>> -Chris
>>
>> _______________________________________________
>> Bioc-sig-sequencing mailing list
>> Bioc-sig-sequencing at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing
>>
>
--
Martin Morgan
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793