[Bioc-sig-seq] srapply / as.list / multicore weird interaction

Martin Morgan mtmorgan at fhcrc.org
Wed Apr 20 05:22:56 CEST 2011


On 04/19/2011 05:59 PM, Janet Young wrote:
> Hi,
>
> I'm learning how to use srapply - it looks like it'll be really
> useful for me.
>
> I think I might have found a bug, or at least behavior that's a bit
> odd.  I've written an srapply function on a DNAStringSet.  My goal is
> to put a wrapper on pairwiseAlignment to align a bunch of short reads
> to several subject sequences of interest and return a list of all the
> scores. After a struggle I've got it working, but I needed to work
> around what might be a bug(?)
>
> I've got a toy example too (see below). This example (and my real
> function) work fine when multicore is not loaded (I know, the whole
> point of srapply is to use multicore, right, but I was trying to
> track down the problem). However, it doesn't work after I load
> multicore, because as.list now seems to struggle on the DNAStringSet.
> It's a problem whether I load ShortRead or multicore first.   If I
> force as.list on my DNAStringSet object, it works - but should I have
> to do that?
>
> I think I've included all you need to know below (that was on my Mac,
> but I get the same thing on 64-bit linux). Does this make any sense
> to you?
>
> thanks,
>
> Janet
>
>
>
>
> R version 2.13.0 (2011-04-13)
> Copyright (C) 2011 The R Foundation for Statistical Computing
> ISBN 3-900051-07-0
> Platform: i386-apple-darwin9.8.0/i386 (32-bit)
>
> R is free software and comes with ABSOLUTELY NO WARRANTY.
> You are welcome to redistribute it under certain conditions.
> Type 'license()' or 'licence()' for distribution details.
>
>   Natural language support but running in an English locale
>
> R is a collaborative project with many contributors.
> Type 'contributors()' for more information and
> 'citation()' on how to cite R or R packages in publications.
>
> Type 'demo()' for some demos, 'help()' for on-line help, or
> 'help.start()' for an HTML browser interface to help.
> Type 'q()' to quit R.
>
> [R.app GUI 1.40 (5751) i386-apple-darwin9.8.0]
>
>> library(ShortRead)
> Loading required package: IRanges
>
> Attaching package: 'IRanges'
>
> The following object(s) are masked from 'package:base':
>
> cbind, eval, intersect, Map, mapply, order, paste, pmax, pmax.int,
> pmin, pmin.int, rbind, rep.int, setdiff, table, union
>
> Loading required package: GenomicRanges
> Loading required package: Biostrings
> Loading required package: lattice
> Loading required package: Rsamtools
>>
>> testseqs<- DNAStringSet(c("CTCGACCAGTAT", "TTGAGGCTGT"))
>> names(testseqs)<- c("seq1","seq2")
>>
>>
>> testfunction2<- function (myseqs) {
> +     srapply(myseqs, function(x, ... ) { class(x) } )
> + }
>> testfunction2(testseqs)
> $seq1
> [1] "DNAString"
> attr(,"package")
> [1] "Biostrings"
>
> $seq2
> [1] "DNAString"
> attr(,"package")
> [1] "Biostrings"
>
>>
>>
>>
>> library(multicore)
>
> Attaching package: 'multicore'
>
> The following object(s) are masked from 'package:lattice':
>
> parallel
>
>>
>> testfunction2<- function (myseqs) {
> +     srapply(myseqs, function(x, ... ) { class(x) } )
> + }
>> testfunction2(testseqs)
> $seq1
> [1] "Error in as.list.default(X) : \n  no method for coercing this S4 class to a vector\n"
>
> $seq2
> [1] "Error in as.list.default(X) : \n  no method for coercing this S4 class to a vector\n"


Hi Janet --

actually,

   mclapply(testseqs, identity)

and

   base::lapply(testseqs, identity)

cause problems too (identity is a convenient do-nothing function; 
wrappers such as your testfunction2 aren't strictly necessary to 
reproduce the problem).

The definition of lapply is

 > base::lapply
function (X, FUN, ...)
{
     FUN <- match.fun(FUN)
     if (!is.vector(X) || is.object(X))
         X <- as.list(X)
     .Internal(lapply(X, FUN))
}

where we see as.list and guess, correctly, that

 > base::as.list(testseqs)
Error in as.list.default(testseqs) :
   no method for coercing this S4 class to a vector

I think the solution is, in IRanges, to define an S3 method as.list.List 
(DNAStringSet inherits from List) and to register it in the NAMESPACE 
with S3method(as.list, List). We'll see what the IRanges maintainers say 
about this...
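
A minimal sketch of the idea, using toy S4 classes rather than the real 
IRanges / Biostrings ones: ToyList and ToySet are hypothetical stand-ins 
for List and DNAStringSet, and the elts slot is invented for 
illustration. It is not the actual IRanges change, only a demonstration 
that an S3 as.list method on the parent class is enough for base::lapply.

```r
library(methods)

# Hypothetical stand-ins: ToyList plays the role of List,
# ToySet the role of DNAStringSet (which inherits from List).
setClass("ToyList", representation(elts = "list"))
setClass("ToySet", contains = "ToyList")

toy <- new("ToySet",
           elts = list(seq1 = "CTCGACCAGTAT", seq2 = "TTGAGGCTGT"))

# No as.list method yet: base::lapply falls through to
# as.list.default, which cannot coerce the S4 object.
before <- try(base::lapply(toy, identity), silent = TRUE)

# S3 method on the parent class. S3 dispatch on an S4 instance also
# considers its S4 superclasses, so ToySet objects pick it up.
as.list.ToyList <- function(x, ...) x@elts

# At the prompt, registerS3method() plays the role that
# S3method(as.list, ToyList) in a package NAMESPACE would play.
registerS3method("as.list", "ToyList", as.list.ToyList)

# base::lapply now works on the S4 object.
after <- base::lapply(toy, nchar)
```

Here before is a "try-error" carrying the same "no method for coercing 
this S4 class to a vector" message as above, while after is an ordinary 
named list.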

Martin


>>
>>
>> testfunction3<- function (myseqs) {
> +     srapply(as.list(myseqs), function(x, ... ) { class(x) } )
> + }
>> testfunction3(testseqs)
> $seq1
> [1] "DNAString"
> attr(,"package")
> [1] "Biostrings"
>
> $seq2
> [1] "DNAString"
> attr(,"package")
> [1] "Biostrings"
>
>>
>> sessionInfo()
> R version 2.13.0 (2011-04-13)
> Platform: i386-apple-darwin9.8.0/i386 (32-bit)
>
> locale:
> [1] en_US.UTF-8/en_US.UTF-8/C/C/en_US.UTF-8/en_US.UTF-8
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
> other attached packages:
> [1] multicore_0.1-5     ShortRead_1.10.0    Rsamtools_1.4.0     lattice_0.19-23
> [5] Biostrings_2.20.0   GenomicRanges_1.4.0 IRanges_1.10.0
>
> loaded via a namespace (and not attached):
> [1] Biobase_2.12.0 grid_2.13.0    hwriter_1.3
>


-- 
Computational Biology
Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N. PO Box 19024 Seattle, WA 98109

Location: M1-B861
Telephone: 206 667-2793


