[R] alternative to rbind for data.table

R. Michael Weylandt michael.weylandt at gmail.com
Sat Jul 21 17:05:29 CEST 2012


You need to preallocate: growing an object with rbind inside a loop
makes R copy the whole thing on every iteration. [See the R Inferno]

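library(data.table)

n <- 10000
# Allocate all n rows up front, then fill them in one at a time
dt <- as.data.table(matrix(0, nrow = n, ncol = 5))
for(i in seq_len(n)){
    # Placeholder values; put whatever your algorithm produces here
    dt[i, ] <- list(1, 2, 3, 4, 5)
}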

Note that this *might* be somewhat trickier with data.tables and their
fancy indexing. It might be easier to simply assign into the matrix
and then coerce that into a data.table when you're done.
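A minimal sketch of the matrix route, assuming every round yields five
numeric values (the column names are taken from the example in the
question, and the constant values are placeholders). The second loop is
an alternative that stays with a data.table and uses data.table's set()
to assign by reference; I have not timed either against your real code,
so treat both as starting points.

```r
library(data.table)

n <- 10000
cols <- c("A", "B", "C", "E", "F")

# Variant 1: fill a preallocated matrix, coerce once at the end
m <- matrix(NA_real_, nrow = n, ncol = length(cols),
            dimnames = list(NULL, cols))
for (i in seq_len(n)) {
    # Placeholder for the values computed in round i
    m[i, ] <- c(1, 2, 3, 4, 5)
}
dt <- as.data.table(m)

# Variant 2: preallocate the data.table and assign by reference with set()
dt2 <- as.data.table(matrix(NA_real_, nrow = n, ncol = length(cols)))
setnames(dt2, cols)
for (i in seq_len(n)) {
    set(dt2, i = i, j = seq_along(cols), value = list(1, 2, 3, 4, 5))
}
```

The matrix only works if all columns share one type; if your rounds
produce mixed types, the set() variant (with columns preallocated in
the right types) is probably the one to try.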

Alternatively, if your simulation can be run in parallel, you might
try something like this:

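library(parallel)
# mclapply() forks the R process, so there is no cluster to set up;
# mc.cores > 1 is not supported on Windows.
# do_simulation() stands for your own function returning one row.

do.call(rbind, mclapply(seq_len(n), function(i) do_simulation()))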
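Even without parallelism, the same idea works: collect each round's
result in a list and bind everything once at the end. A rough sketch,
assuming your data.table version provides rbindlist(); do_simulation()
here is a made-up stand-in that returns a named list like the one in
the question.

```r
library(data.table)

# Hypothetical stand-in for one simulation round
do_simulation <- function(i) {
    list(A = 1, B = 2, C = 3, E = 4, F = i)
}

n <- 10000
# lapply() preallocates the result list for you; swapping in
# parallel::mclapply() on a Unix-alike should work the same way
rows <- lapply(seq_len(n), do_simulation)

# Bind all rows in a single step instead of n calls to rbind()
dt <- rbindlist(rows)
```

This way the table is only built once, so no per-iteration copying
happens at all.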

Best,
Michael


On Sat, Jul 21, 2012 at 6:21 AM, Christof Kluß <ckluss at email.uni-kiel.de> wrote:
> Hi
>
> I want to add a row to a "data.table" in each round of a for loop.
> "rbind" seems to be an inefficient way to implement this.
>
> How would you do this? The "slow" solution:
>
> library(data.table)
> Rprof("test.out")
> dt <- data.table()
>
> for (i in (1:10000)) {
>   # algorithm that generates a list with different values,
>   # but same key-names, each round, for example
>   l <- list(A=1,B=2,C=3,E=4,F=5)
>   dt <- rbind(dt,l) # very slow :(
> }
>
> Rprof(NULL)
> summaryRprof("test.out")
>
> thx
> Christof


