[Rd] reshape scaling with large numbers of times/rows
Mitch Skinner
mitch at gallo.ucsf.edu
Thu Aug 24 16:08:16 CEST 2006
On Thu, 2006-08-24 at 08:57 -0400, Gabor Grothendieck wrote:
> If your Z in reality is not naturally numeric try representing it as a
> factor and using
> the numeric levels as your numbers and then put the level labels back on:
>
> m <- n <- 5
> DF <- data.frame(X = gl(m*n, 1), Y = gl(m, n), Z = letters[1:25])
> Zn <- as.numeric(DF$Z)
> system.time(w1 <- reshape(DF, timevar = "X", idvar = "Y", dir = "wide"))
> system.time({Zn <- as.numeric(DF$Z)
> w2 <- xtabs(Zn ~ Y + X, DF)
> w2[w2 > 0] <- levels(DF$Z)[w2]
> w2[w2 == 0] <- NA
> })
This is pretty slick, thanks. It looks like it works for me. For the
archives, this is how I got back to a wide data frame (calling
as.data.frame(w2) directly just gives me the long format again):
> m <- 4500
> n <- 70
> DF <- data.frame(X = gl(m, n), Y = 1:n, Z = letters[1:25])
> system.time({Zn <- as.numeric(DF$Z)
+ w2 <- xtabs(Zn ~ Y + X, DF)
+ w2[w2 > 0] <- levels(DF$Z)[w2]
+ w2[w2 == 0] <- NA
+ WDF <- data.frame(Y=dimnames(w2)$Y)
+ for (col in dimnames(w2)$X) { WDF[col]=w2[,col] }
+ })
[1] 131.888 1.240 135.945 0.000 0.000
> dim(WDF)
[1] 70 4501
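A possible alternative to the column-by-column loop, offered as a sketch and not as something from this thread: build the label matrix in one step and hand the whole matrix to data.frame(), which splits it into columns itself. The small sizes, the explicit factor() call (needed in R 4.0.0 and later, where data.frame() no longer converts strings to factors by default), and the names idx and lab are assumptions made for illustration. It has not been timed against the loop above.

```r
# Small illustrative sizes; the session above used m = 4500, n = 70.
m <- 6
n <- 4

# Z is made a factor explicitly so that as.numeric() yields level codes.
DF <- data.frame(X = gl(m, n),
                 Y = rep(1:n, times = m),
                 Z = factor(letters[1:(m * n)]))

Zn <- as.numeric(DF$Z)         # integer codes of the factor levels
w2 <- xtabs(Zn ~ Y + X, DF)    # Y-by-X table of codes; 0 marks an empty cell

# Turn the codes back into labels in one vectorised step.
idx <- as.vector(w2)           # column-major vector of codes
idx[idx == 0] <- NA            # empty cells become NA instead of level 0
lab <- matrix(levels(DF$Z)[idx],
              nrow = nrow(w2),
              dimnames = dimnames(w2))

# data.frame() expands a matrix argument into one column per matrix column;
# check.names = FALSE keeps the X values ("1", "2", ...) as column names.
WDF <- data.frame(Y = rownames(lab), lab, check.names = FALSE)

dim(WDF)   # n rows, m + 1 columns
```

Like the xtabs() approach it is based on, this assumes each (Y, X) pair occurs at most once, because xtabs() sums the codes within a cell.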
I'll have to look; maybe I can just use w2 as is. Next time I guess
I'll try R-help first.
Thanks again,
Mitch