[R-sig-hpc] mclapply: rm intermediate objects and returning memory
Simon Urbanek
simon.urbanek at r-project.org
Mon Oct 15 18:17:20 CEST 2012
Ramon,
On Oct 15, 2012, at 10:47 AM, Ramon Diaz-Uriarte wrote:
> Dear All,
>
>
> It seems that, in some cases, mclapply can result in out of memory
> conditions that can be avoided by having the "X" argument split into smaller
> pieces, and running mclapply several times. This is an example (on a
> machine with 12 cores and 64 Gb of RAM [similar examples can be created
> for different numbers of cores and RAM]):
>
>
> #########################
> library(parallel)
> data.500 <- matrix(1.1, nrow = 3*10^6, ncol = 500)
> print(object.size(data.500), units = "Mb")
>
> f1 <- function(index, data) {
>     x <- data[, index]
>     u <- 2 * x
>     return(u)
> }
>
> ## This will not run
> tmp1 <- mclapply(1:500, f1, data.500,
>                  mc.cores = detectCores())
>
>
> ## These will run
>
>
> tmp1a <- mclapply(1:100, f1, data.500,
>                   mc.cores = detectCores())
> tmp1b <- mclapply(101:200, f1, data.500,
>                   mc.cores = detectCores())
> tmp1c <- mclapply(201:300, f1, data.500,
>                   mc.cores = detectCores())
> tmp1d <- mclapply(301:400, f1, data.500,
>                   mc.cores = detectCores())
> tmp1e <- mclapply(401:500, f1, data.500,
>                   mc.cores = detectCores())
>
> tmp <- c(tmp1a, tmp1b, tmp1c, tmp1d, tmp1e)
>
> ########################
>
> Notice that the problem is not simply memory usage by the master, since
> the last concatenation of the five lists works.
>
>
> However, if we delete intermediate objects as in:
>
> ##############
>
> f1 <- function(index, data) {
>     x <- data[, index]
>     u <- 2 * x
>     rm(x); gc() ### This is the only change
>     return(u)
> }
> #############
>
> then it does run.
>
>
>
>
> It also runs if we use the original function (i.e., we do not delete
> intermediate results) but change the scheduling:
>
>
> #################
>
> f1 <- function(index, data) {
>     x <- data[, index]
>     u <- 2 * x
>     return(u)
> }
>
> tmp1 <- mclapply(1:500, f1, data.500,
>                  mc.cores = detectCores(),
>                  mc.preschedule = FALSE)
>
> ###################
>
>
>
>
> So it seems that, with preschedule = TRUE, each slave does not return the
> memory to the OS until all the jobs are collected by the master. Before
> that, in terms of memory, it is as if a list as long as X is kept, but not
> just with the return objects, but also with all the intermediate (and non
> deleted) objects (i.e., it is as if each function invocation has not
> really fully returned?).
>
That should not be the case: x goes out of scope and is not stored by mclapply (NB: each job is a simple lapply()). However, you may be running into something else. The jobs are all independent, so they are not aware of each other's memory usage, and one job cannot trigger garbage collection in another. So what probably happens is that the faster jobs feel just fine: they are not running out of memory themselves, so they don't trigger garbage collection. Meanwhile, another job may be strapped for memory; it will run its own garbage collection, but that won't free enough memory. It cannot trigger gc in the other jobs, so it is stuck. By forcing gc() you make sure that all jobs run the garbage collector, so it is less likely that they will push each other out of memory.
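To make the point about independent jobs concrete, here is a minimal sketch (not the code from the original report). It assumes a small stand-in matrix, data.small, and a hypothetical variant of f1 called f1_gc that calls gc() and also reports the PID of the process that did the work. Each distinct PID is a separate R process, and a gc() call only affects the process it runs in.

```r
library(parallel)

# Small stand-in for data.500 (assumption: sizes chosen only so this runs quickly).
data.small <- matrix(1.1, nrow = 1000, ncol = 16)

# Hypothetical variant of f1: collect garbage in the current worker, then
# return the result together with the PID of the process that computed it.
f1_gc <- function(index, data) {
    gc()                      # only affects the R process running this call
    u <- 2 * data[, index]
    list(pid = Sys.getpid(), value = u)
}

# mclapply relies on fork(), so fall back to one core on Windows.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
res <- mclapply(seq_len(ncol(data.small)), f1_gc, data.small,
                mc.cores = cores)

# Each distinct PID is a separate R process with its own garbage collector.
worker.pids <- unique(vapply(res, function(r) r$pid, integer(1)))
print(worker.pids)
```

Calling gc() on every invocation costs time, so whether this is worthwhile depends on how close to the memory limit the workers actually run.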
> In the help file, mc.preschedule = TRUE is the recommended setting for
> large number of values in X (and in my experiments setting it to FALSE
> makes things slower). I guess, then, that the recommended way of dealing
> with these issues is carefully deleting (and gc'ing) any non-needed
> intermediate result? Is this correct?
>
The intermediate result itself is harmless: it will be collected eventually, but not necessarily right after the function returns. However, depending on your machine's memory usage, that delay may be enough to trigger the above. I cannot replicate your problem on my machine, so try just adding gc() alone. It should make sure that at least all temporary objects from previous iterations are gone, and I would hope that it solves your problem. If you are on the edge with your memory usage, you may want to run something like {x <- local({ ... }); gc(); x}, but in your case it should not be necessary since the temporary objects are small per iteration.
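The {x <- local({ ... }); gc(); x} pattern could look roughly like the sketch below when applied to f1. The name f1_local and the small test matrix are assumptions made for illustration. The idea is that the temporary x lives only inside the local() block, so by the time gc() runs nothing refers to it any more and only the result u is kept.

```r
# Hypothetical variant of f1 using the local()/gc() pattern.
f1_local <- function(index, data) {
    u <- local({
        x <- data[, index]   # temporary, visible only inside this block
        2 * x                # value of the block, assigned to u
    })
    gc()                     # x is unreachable here, so it can be freed now
    u
}

# Small stand-in for data.500 (assumption: sizes are illustrative only).
data.small <- matrix(1.1, nrow = 1000, ncol = 4)

# Checked here with plain lapply(); mclapply(..., f1_local, ...) is called
# the same way.
res.local <- lapply(seq_len(ncol(data.small)), f1_local, data.small)
```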
Re. your suggestion, I agree that mclapply offers only two extremes: a work size of either 1 or n/cores, with nothing in between. For some applications it may be beneficial to use other sizes. If someone is willing to give it a shot, I could review it.
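Until something like that exists in mclapply itself, an intermediate work size can be approximated from user code. The following is only a sketch: mclapply_chunked and its chunk.size argument are hypothetical names, not part of the parallel package. It splits X into chunks of chunk.size elements and lets mclapply hand out whole chunks one at a time with mc.preschedule = FALSE.

```r
library(parallel)

# Hypothetical helper (not part of 'parallel'): each forked job processes
# one chunk of `chunk.size` elements with a plain lapply().
mclapply_chunked <- function(X, FUN, ..., chunk.size = 10L, mc.cores = 2L) {
    # Group consecutive elements of X into chunks of at most chunk.size.
    chunks <- split(X, ceiling(seq_along(X) / chunk.size))
    res <- mclapply(chunks, function(chunk) lapply(chunk, FUN, ...),
                    mc.preschedule = FALSE, mc.cores = mc.cores)
    # Flatten the list of per-chunk lists back into one list, in input order.
    unlist(res, recursive = FALSE, use.names = FALSE)
}

# mclapply relies on fork(), so fall back to one core on Windows.
cores <- if (.Platform$OS.type == "windows") 1L else 2L
res.chunked <- mclapply_chunked(1:25, function(i, k) i * k, k = 2,
                                chunk.size = 10L, mc.cores = cores)
```

With chunk.size = 1 this behaves like mc.preschedule = FALSE, and with chunk.size near length(X)/mc.cores it approaches the prescheduled case; names on X are dropped in this sketch.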
Cheers,
Simon
>
>
> Best,
>
> R.
>
>
> --
> Ramon Diaz-Uriarte
> Department of Biochemistry, Lab B-25
> Facultad de Medicina
> Universidad Autónoma de Madrid
> Arzobispo Morcillo, 4
> 28029 Madrid
> Spain
>
> Phone: +34-91-497-2412
>
> Email: rdiaz02 at gmail.com
> ramon.diaz at iib.uam.es
>
> http://ligarto.org/rdiaz
>
> _______________________________________________
> R-sig-hpc mailing list
> R-sig-hpc at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc