[R] help with reshaping data into long format (correct question)

Henrique Dallazuanna wwwhsd at gmail.com
Wed Jan 16 13:44:33 CET 2008


try this:

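# Assumes x was read with factor columns (stringsAsFactors = TRUE),
# one row per variable: row 5 is treat_a and row 6 is treat_b.
x[6, which(x[5,] == "y")] <- "y"                  # copy the "y" entries of treat_a into treat_b
levels(x$id) <- c(levels(x$id)[drop=TRUE], "treat")  # allow "treat" as a value of id
x <- x[-5,]                                       # drop the treat_a row
x[5, "id"] <- "treat"                             # the merged row is now row 5
levels(x$id) <- gsub("^ques", "", levels(x$id))   # "ques1_1" -> "1_1"
x3 <- as.data.frame(t(x[,-1]))                    # transpose: one row per subject
names(x3) <- x$id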
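
The treat_a/treat_b merge on its own may be easier to follow on plain vectors. This is only a sketch, assuming the blank cells come in as empty strings (they could just as well be NA, depending on how the file was read); the vector names are made up for the example.

```r
# Hypothetical vectors holding the two treatment rows from the question;
# blank cells are assumed to be empty strings.
treat_a <- c("",  "",  "",  "y", "y", "y", "",  "",  "y", "")
treat_b <- c("n", "n", "n", "",  "",  "",  "n", "n", "",  "n")

# Take treat_a where it is filled in, otherwise fall back to treat_b.
treat <- ifelse(treat_a != "", treat_a, treat_b)
treat
# "n" "n" "n" "y" "y" "y" "n" "n" "y" "n"
```

If the blanks are NA instead, the condition would be !is.na(treat_a).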

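# Build the long-format rows for one subject (intended for a one-row data frame).
foo <- function(x, ...)
{
  qcols <- grep("_", names(x), value=TRUE)   # question columns: "1_1", "1_2", ...
  tmp <- as.numeric(as.character(unlist(x[, qcols])))
  # repeat the subject's fixed columns once per question
  y <- x[, c("disease", "age", "city", "sex", "treat")][rep(1, length(tmp)), ]
  newdf <- data.frame(y, quess=qcols, value=tmp)
  return(newdf)
}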

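# x4 is not defined above; assuming it is x3 split into one-row data frames:
x4 <- split(x3, seq_len(nrow(x3)))
do.call(rbind, lapply(x4, foo))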
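
For comparison, here is a minimal base-R sketch of the same reshape written without a per-subject helper. It assumes the merged table from the question has already been arranged as one row per subject; the names wide, long, idcols and qcols are made up for this example, and the output column names ques and ques_value follow the target output in the question.

```r
# Merged table from the question, typed in as one row per subject (assumption).
wide <- data.frame(
  id      = 1:10,
  disease = letters[1:10],
  age     = c(23, 40, 32, 34, 25, 32, 22, 35, 29, 21),
  city    = c("NY", "LD", "NY", "SG", "NY", "LD", "VG", "SA", "LD", "SG"),
  sex     = c(1, 1, 2, 2, 2, 2, 1, 1, 1, 2),
  treat   = c("n", "n", "n", "y", "y", "y", "n", "n", "y", "n"),
  ques1_1 = c(2, 4, 5, 6, 8, 3, 1, 2, 4, 5),
  ques1_2 = c(6, 4, 5, 12, 10, 9, 8, 4, 5, 7),
  ques1_3 = c(17, 23, 32, 25, 14, 24, 23, 22, 32, 29),
  ques2_1 = c(4, 7, 9, 10, 6, 8, 5, 7, 8, 9),
  ques2_2 = c(8, 9, 10, 12, 17, 19, 14, 21, 22, 19),
  ques2_3 = c(23, 18, 19, 20, 23, 24, 26, 28, 29, 22),
  ques3_1 = c(5, 7, 9, 1, 4, 7, 9, 8, 10, 5),
  ques3_2 = c(34, 35, 32, 23, 31, 29, 27, 25, 32, 33),
  ques3_3 = c(29, 33, 27, 25, 27, 23, 24, 29, 27, 24),
  stringsAsFactors = FALSE
)

idcols <- c("id", "disease", "age", "city", "sex", "treat")
qcols  <- grep("^ques", names(wide), value = TRUE)

long <- data.frame(
  # repeat each subject's fixed columns once per question
  wide[rep(seq_len(nrow(wide)), each = length(qcols)), idcols],
  # question labels "1_1", ..., "3_3", cycled for every subject
  ques       = rep(sub("^ques", "", qcols), times = nrow(wide)),
  # transpose so the values are read off subject by subject
  ques_value = as.vector(t(as.matrix(wide[qcols]))),
  stringsAsFactors = FALSE
)
rownames(long) <- NULL

head(long, 9)   # the nine rows for id 1
```

The result has 90 rows (10 subjects times 9 questions), ordered by subject and then by question as in the target output.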


On 15/01/2008, Tom Cohen <tom.cohen78 at yahoo.se> wrote:
>
>   Dear list,
>   I have the following data set
>
> id       1  2  3  4  5  6  7  8  9  10
> disease  a  b  c  d  e  f  g  h  i   j
> age     23 40 32 34 25 32 22 35 29  21
> city    NY LD NY SG NY LD VG SA LD  SG
> sex      1  1  2  2  2  2  1  1  1   2
> treat_a           y  y  y        y
> treat_b  n  n  n           n  n      n
> ques1_1  2  4  5  6  8  3  1  2  4   5
> ques1_2  6  4  5 12 10  9  8  4  5   7
> ques1_3 17 23 32 25 14 24 23 22 32  29
> ques2_1  4  7  9 10  6  8  5  7  8   9
> ques2_2  8  9 10 12 17 19 14 21 22  19
> ques2_3 23 18 19 20 23 24 26 28 29  22
> ques3_1  5  7  9  1  4  7  9  8 10   5
> ques3_2 34 35 32 23 31 29 27 25 32  33
> ques3_3 29 33 27 25 27 23 24 29 27  24
>
> where the first row is the header row in a dataframe. First I want to merge the two variables
> treat_a and treat_b to a new variable called "treat" which will be given n if it's left blank
> in the variable treat_a and y if it's left blank in treat_b. The new data set will look like
> id       1  2  3  4  5  6  7  8  9  10
> disease  a  b  c  d  e  f  g  h  i   j
> age     23 40 32 34 25 32 22 35 29  21
> city    NY LD NY SG NY LD VG SA LD  SG
> sex      1  1  2  2  2  2  1  1  1   2
> treat    n  n  n  y  y  y  n  n  y   n
> ques1_1  2  4  5  6  8  3  1  2  4   5
> ques1_2  6  4  5 12 10  9  8  4  5   7
> ques1_3 17 23 32 25 14 24 23 22 32  29
> ques2_1  4  7  9 10  6  8  5  7  8   9
> ques2_2  8  9 10 12 17 19 14 21 22  19
> ques2_3 23 18 19 20 23 24 26 28 29  22
> ques3_1  5  7  9  1  4  7  9  8 10   5
> ques3_2 34 35 32 23 31 29 27 25 32  33
> ques3_3 29 33 27 25 27 23 24 29 27  24
> Now I want to reshape the data into long format, with this target output:
>
> id disease age city sex treat ques ques_value
> 1  a       23  NY   1   n     1_1  2
> 1  a       23  NY   1   n     1_2  6
> 1  a       23  NY   1   n     1_3  17
> 1  a       23  NY   1   n     2_1  4
> 1  a       23  NY   1   n     2_2  8
> 1  a       23  NY   1   n     2_3  23
> 1  a       23  NY   1   n     3_1  5
> 1  a       23  NY   1   n     3_2  34
> 1  a       23  NY   1   n     3_3  29
> 2  b       40  LD   1   n     1_1  4
> 2  b       40  LD   1   n     1_2  4
> 2  b       40  LD   1   n     1_3  23
> 2  b       40  LD   1   n     2_1  7
> 2  b       40  LD   1   n     2_2  9
> 2  b       40  LD   1   n     2_3  18
> 2  b       40  LD   1   n     3_1  7
> 2  b       40  LD   1   n     3_2  35
> 2  b       40  LD   1   n     3_3  33
> ..
> ..
> ..
> 10 j       21  SG   2   n     3_3  24
> How can I do this in R?
> Thanks a lot for any help,
> Tom
>


-- 
Henrique Dallazuanna
Curitiba-Paraná-Brasil
25° 25' 40" S 49° 16' 22" O



