[R] reshaping data frame
Chuck Cleland
ccleland at optonline.net
Wed Feb 20 20:03:40 CET 2008
On 2/20/2008 1:14 PM, ahimsa campos-arceiz wrote:
> Dear all,
>
> I'm having a few problems trying to reshape a data frame. I tried with
> reshape{stats} and melt{reshape} but I was missing something. Any help is
> very welcome. Please find details below:
>
> #################################
> # data in its original shape:
>
> indiv <- rep(c("A","B"),c(10,10))
> level.1 <- rpois(20, lambda=3)
> covar.1 <- rlnorm(20, 3, 1)
> level.2 <- rpois(20, lambda=3)
> covar.2 <- rlnorm(20, 3, 1)
> my.dat <- data.frame(indiv,level.1,covar.1,level.2,covar.2)
>
> # the values of level.1 and level.2 represent the number of cases for the
> # particular combination of indiv*level*covar value
>
> # I would like to do two things:
> # 1. reshape to long, reducing my.dat[,2:5] into two columns: "factor"
> #    (levels = level.1 & level.2) and the covariate
> # 2. create one new row for each case in level.1 and level.2
>
> # the new reshaped data.frame should look like this:
>
> # indiv factor covar case.id
> # A level.1 4.614105 1
> # A level.1 4.614105 2
> # A level.2 31.064405 1
> # A level.2 31.064405 2
> # A level.2 31.064405 3
> # A level.2 31.064405 4
> # A level.1 19.185784 1
> # A level.2 48.455929 1
> # A level.2 48.455929 2
> # A level.2 48.455929 3
> # etc...
>
> #############################
Maybe there is a better way, but this seems to do what you want:
#################################
# data in its original shape:
indiv <- rep(c("A","B"),c(10,10))
level.1 <- rpois(20, lambda=3)
covar.1 <- rlnorm(20, 3, 1)
level.2 <- rpois(20, lambda=3)
covar.2 <- rlnorm(20, 3, 1)
my.dat <- data.frame(indiv,level.1,covar.1,level.2,covar.2)
long <- reshape(my.dat,
                varying = list(c("level.1","level.2"),
                               c("covar.1","covar.2")),
                timevar = "level", idvar = "case.id",
                v.names = c("ncases","covar"),
                direction = "long")
newdf <- with(long, data.frame(indiv   = rep(  indiv, ncases),
                               level   = rep(  level, ncases),
                               covar   = rep(  covar, ncases),
                               case.id = rep(case.id, ncases)))
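The idea is to first reshape() to long format and then rep() each
variable ncases times, so rows with ncases = 0 drop out. Note that
level comes out coded 1/2, and case.id is the row number of my.dat
rather than a running count within each row. You can then convert
newdf$level into a factor if you like.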
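
If you want case.id to restart at 1 within each indiv*level*covar
combination, as in the layout sketched in the question, something along
these lines might work. This is only a sketch: the set.seed() call, the
row.id name, and the ordering step are my assumptions, not part of the
code above. It uses sequence() to number the cases and factor() to label
the levels.

```r
set.seed(1)  # assumption: fixed seed only so the example is reproducible

indiv  <- rep(c("A", "B"), c(10, 10))
my.dat <- data.frame(indiv,
                     level.1 = rpois(20, lambda = 3),
                     covar.1 = rlnorm(20, 3, 1),
                     level.2 = rpois(20, lambda = 3),
                     covar.2 = rlnorm(20, 3, 1))

# Long format: one row per original row and level.
# "row.id" is a hypothetical name; reshape() fills it with 1:nrow(my.dat).
long <- reshape(my.dat,
                varying = list(c("level.1", "level.2"),
                               c("covar.1", "covar.2")),
                v.names = c("ncases", "covar"),
                timevar = "level", idvar = "row.id",
                direction = "long")

# Keep level.1 and level.2 of the same original row next to each other.
long <- long[order(long$row.id, long$level), ]

# Repeat each row ncases times; sequence() numbers the cases 1..ncases
# within each row, and rows with ncases == 0 contribute nothing.
newdf <- with(long, data.frame(
  indiv   = rep(indiv, ncases),
  factor  = factor(rep(level, ncases), levels = 1:2,
                   labels = c("level.1", "level.2")),
  covar   = rep(covar, ncases),
  case.id = sequence(ncases)))

head(newdf, 10)
```

As a quick sanity check, nrow(newdf) should equal
sum(my.dat$level.1) + sum(my.dat$level.2).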
> Thank you very much!!
>
> Ahimsa
--
Chuck Cleland, Ph.D.
NDRI, Inc.
71 West 23rd Street, 8th floor
New York, NY 10010
tel: (212) 845-4495 (Tu, Th)
tel: (732) 512-0171 (M, W, F)
fax: (917) 438-0894