[Bioc-devel] S4 initialize methods (was Re: "patches" for Gviz: utr plotting support and direct BamFile plotting)
Martin Morgan
mtmorgan at fhcrc.org
Wed Aug 22 21:37:44 CEST 2012
Steve and I exchanged a few emails about S4 initialize methods that it
might help to share. Steve created this initialize method:
setMethod("initialize", "BamTrack",
    function(.Object, bam, cache=new.env(), range.strict=FALSE, ...) {
        if (missing(bam) || !is(bam, "BamFile") || !file.exists(path(bam))) {
            stop("bam required during initialize,BamTrack")
        }
        cache$bam <- bam
        .Object@cache <- cache
        callNextMethod(.Object=.Object, ...)
    })
Steve tells me that this is similar to Gviz coding style. I think there
are several issues here.
The first is that creating a sub-class actually tries to create an
instance of the parent class, and to do that new("BamTrack") has to
succeed. It doesn't, because 'bam' does not have a default value:
> setClass("BamSubtrack", contains="BamTrack")
Error in .local(.Object, ...) : bam required during initialize,BamTrack
A second issue that comes up involves validation, which is what the
check for missing(bam) etc. is. It makes more sense to place this in
the object validity method so that the code can be re-used, and perhaps
to provide a prototype that initializes the bam field properly.
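A minimal sketch of that arrangement, using a hypothetical class
"FileTrack" with a single 'path' slot as a stand-in for BamTrack (the
class name, slot and file names are assumptions for illustration, not
Gviz code). The check lives in the validity method, the prototype is
itself valid, and no initialize method is needed:

```r
library(methods)

## Hypothetical stand-in for BamTrack: one slot holding a file path.
setClass("FileTrack",
    representation(path = "character"),
    prototype = prototype(path = NA_character_),   # a valid default
    validity = function(object) {
        if (length(object@path) != 1L)
            "'path' must be a character vector of length 1"
        else TRUE
    })

## No initialize method is defined, so new("FileTrack") succeeds with
## no arguments and a sub-class can be created.
setClass("FileSubtrack", contains = "FileTrack")

trk <- new("FileTrack", path = "reads.bam")

## Supplying a slot triggers the validity method; capture the message.
bad <- tryCatch(new("FileTrack", path = c("a.bam", "b.bam")),
                error = conditionMessage)
```

Checks that depend on the outside world, such as file.exists(), are
probably better placed in a constructor function than in the validity
method, since the prototype cannot point at a real file.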
While on validity and prototypes, a weird thing is that the class
definition can specify an invalid prototype and, since validity is only
checked if the user provides additional arguments to 'new' /
'initialize', it's possible to create invalid objects:
setClass("A", representation(x="numeric"))
setValidity("A", function(object) {
    if (length(object@x) != 1L) "'x' must be length 1" else NULL
})
and then
> a = new("A")
seems to work but
> validObject(a)
Error in validObject(a) : invalid class "A" object: 'x' must be length 1
the solution is to provide a prototype that creates a valid object
setClass("A", representation(x="numeric"), prototype=prototype(x=1))
and the acid test is validObject(new("A")) == TRUE
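The same check can be sketched without stopping on error by passing
test=TRUE to validObject(), which returns TRUE or a character vector
describing the problems. The class names "NoProto" and "WithProto" are
hypothetical and only serve to put the two cases side by side:

```r
library(methods)

## Shared validity rule: slot 'x' must have length 1.
lengthOne <- function(object) {
    if (length(object@x) != 1L) "'x' must be length 1" else TRUE
}

## Default prototype is numeric(0), which violates the rule.
setClass("NoProto", representation(x = "numeric"),
         validity = lengthOne)

## Explicit prototype that satisfies the rule.
setClass("WithProto", representation(x = "numeric"),
         prototype = prototype(x = 1), validity = lengthOne)

## new() with no slot arguments does not run the validity method,
## so both calls succeed; only the explicit check tells them apart.
resNo   <- validObject(new("NoProto"),   test = TRUE)
resWith <- validObject(new("WithProto"), test = TRUE)
```

Here resNo holds the validity message and resWith is TRUE.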
A third and even more obscure issue is that 'initialize' is advertised
to take unnamed arguments as instances of parent classes that are used
to initialize derived classes, so it makes sense to avoid accidentally
capturing unnamed arguments by placing 'bam' and friends _after_ ...
Let's see...
setClass("A", representation(x="numeric"))
setClass("B", representation(y="numeric"), contains="A")
and then
> new("B", new("A", x=2))
An object of class "B"
Slot "y":
numeric(0)
Slot "x":
[1] 2
but...
setMethod("initialize", "B", function(.Object, y, ...)
    callNextMethod(.Object, y=y, ...))
and now the copy constructor is broken.
> new("B", new("A", x=2))
Error in validObject(.Object) :
invalid class "B" object: invalid object for slot "y" in class "B":
got class "A", should be or extend class "numeric"
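One way to keep the copy constructor working, sketched here under the
assumption that 'y' should simply keep its current value when it is not
supplied, is to put the named argument after ... so that unnamed
arguments are passed through untouched:

```r
library(methods)

setClass("A", representation(x = "numeric"))
setClass("B", representation(y = "numeric"), contains = "A")

## 'y' comes after '...', so it can only be matched by name. Its
## default is the value already stored in .Object (an assumption made
## for this sketch), so calls that omit 'y' leave the slot alone.
setMethod("initialize", "B", function(.Object, ..., y = .Object@y)
    callNextMethod(.Object, ..., y = y))

b  <- new("B", new("A", x = 2))   # unnamed parent instance still works
b2 <- new("B", y = 3)             # named slot still works
```

With this definition new("B") also succeeds with no arguments, so
sub-classes of "B" can still be defined.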
Another point about initialize as a copy constructor is that it updates
multiple slots in a (relatively) efficient way -- only 1 copy of the
object, rather than one copy for each slot assignment
removeMethod("initialize", "B")
and then
> b = new("B")
> tracemem(b)
[1] "<0x53d74e0>"
> b@x = 1
tracemem[0x53d74e0 -> 0x54096c8]:
> b@y = 2
tracemem[0x54096c8 -> 0x540b0c0]:
so a copy on each slot assignment, vs.
> b1 = new("B"); tracemem(b1)
[1] "<0x540e968>"
> initialize(b1, x=1, y=2)
tracemem[0x540e968 -> 0x541c628]: initialize initialize
An object of class "B"
Slot "y":
[1] 2
Slot "x":
[1] 1
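A related difference, sketched below with a validity method added to
"A" for illustration: updating through initialize() runs the validity
method after the slots are set, whereas direct slot assignment only
checks the class of the new value.

```r
library(methods)

setClass("A", representation(x = "numeric"),
         prototype = prototype(x = 1),
         validity = function(object) {
             if (length(object@x) != 1L) "'x' must be length 1" else TRUE
         })
setClass("B", representation(y = "numeric"), contains = "A")

b <- new("B")

## Both slots are set in one call and the result is validated.
b2 <- initialize(b, x = 1, y = 2)

## Direct assignment accepts any numeric value; validity is not run.
b@x <- c(1, 2)

## The same update through initialize() is rejected.
res <- tryCatch(initialize(b2, x = c(1, 2)), error = conditionMessage)
```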
Combined, these are enough to make one want to think very carefully
about writing initialize methods; often a 'Constructor' is the right
place to do argument coercion, etc., (although sometimes I think the
constructor is avoiding some of its responsibility, e.g., BamFile()
fails, but validObject(new("BamFile")) == TRUE) and validity methods the
correct place to check validity.
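As a rough sketch of that division of labour, assuming a hypothetical
class "Interval" (not part of any package discussed here): the
constructor coerces its arguments, the validity method checks the
result, and there is no initialize method at all.

```r
library(methods)

setClass("Interval",
    representation(start = "integer", end = "integer"),
    prototype = prototype(start = 1L, end = 1L),    # valid by default
    validity = function(object) {
        if (length(object@start) != 1L || length(object@end) != 1L)
            "'start' and 'end' must be length 1"
        else if (object@end < object@start)
            "'end' must be >= 'start'"
        else TRUE
    })

## Constructor: argument coercion happens here, not in initialize.
Interval <- function(start = 1L, end = start) {
    new("Interval", start = as.integer(start), end = as.integer(end))
}

iv <- Interval(2, 5)    # doubles are coerced to integer
```

Because new("Interval") is valid with no arguments, sub-classes can be
defined and the default copy-constructor behaviour of initialize is
left intact.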
Martin
On 08/21/2012 11:12 PM, Steve Lianoglou wrote:
> Hi Florian (and other interested Gviz'ers),
>
> I thought I'd use Gviz to whip up pretty plots for my thesis (yay!)
> where I need to plot lots of NGS data over 3'UTRs.
>
> I wanted to tackle the "more standard" drawing of UTRs (thin exons
> (vs. thick coding)) in gene regions as well as making repeated
> plotting of the same data over different regions easier -- there is
> also an unplanned for increase in plotting speed of ~ 5-8x
> (unscientific benchmark) when plotting gene regions using my TxDbTrack
> vs. GeneRegionTrack.
>
> I have a more thorough summary of what I did here:
> http://cbio.mskcc.org/~lianos/files/bioc/Gviz/Gviz-enhancement-1.html
>
> With the relevant pics at the bottom. It's still a work in progress
> but I thought I'd put it out there now to see if you think it'd be
> useful for patching back into Gviz -- I'd be happy to groom things
> further to make it easier to add back into Gviz, or change things to
> make the approach more "inline" with the coding style/philosophy of
> the package (which I tried to stick to).
>
> Thanks (again) for this package -- it's really great.
>
> -steve
>
--
Computational Biology / Fred Hutchinson Cancer Research Center
1100 Fairview Ave. N.
PO Box 19024 Seattle, WA 98109
Location: Arnold Building M1 B861
Phone: (206) 667-2793