[R] Strange output daply with empty strata
Jan van der Laan
rhelp at eoos.dds.nl
Thu Sep  9 11:43:02 CEST 2010

Dear list,
I get some strange results with daply from the plyr package. In the
example below, the average age per municipality is calculated
separately for the employed and the unemployed. If I do this using
tapply (see code below), I get the following result:
         no      yes
A       NA 36.94931
B 51.22505 34.24887
C 48.05759 51.00198
If I do this using daply, I get:
municipality       no      yes
            A 36.94931 48.05759
            B 51.22505 51.00198
            C 34.24887       NA
daply generates the same numbers, but they are not in the correct
cells. For example, in municipality A everybody is employed, so the NA
should be in the unemployed ("no") cell for municipality A, as it is
in the tapply output.
Am I using daply incorrectly or is there indeed something wrong with  
the output of daply?
Regards,
Jan
I am using version 1.1 of the plyr package.
library(plyr)

# Generate some test data
data.test <- data.frame(
   municipality=rep(LETTERS[1:3], each=10),
   employed=sample(c("yes", "no"), 30, replace=TRUE),
   age=runif(30,20,70))
# Make sure everybody is employed in municipality A
data.test$employed[data.test$municipality == "A"] <- "yes"
# Compare the output of tapply:
tapply(data.test$age, list(data.test$municipality, data.test$employed),
   mean)
# to that of daply:
daply(data.test, .(municipality, employed), function(d){mean(d$age)} )
# The results of ddply are the same as those of tapply
ddply(data.test, .(municipality, employed), function(d){mean(d$age)} )
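
As an additional check, here is a minimal sketch using base R only. It
counts the observations per cell with table() and compares the empty
cells with the NA cells in the tapply output. The name data.check, the
fixed seed, and the explicit factor levels are assumptions added for
this illustration; they are not part of the example above.

```r
# Assumption: fixed seed so the illustration is reproducible
set.seed(1)

# Same layout as data.test, with explicit factor levels (assumption) so
# that both "no" and "yes" always appear as columns
data.check <- data.frame(
   municipality=rep(LETTERS[1:3], each=10),
   employed=factor(sample(c("yes", "no"), 30, replace=TRUE),
      levels=c("no", "yes")),
   age=runif(30,20,70))

# Make sure everybody is employed in municipality A
data.check$employed[data.check$municipality == "A"] <- "yes"

# Number of observations in each municipality/employed cell
counts <- table(data.check$municipality, data.check$employed)

# Mean age per cell; tapply returns NA for cells without observations
means <- tapply(data.check$age,
   list(data.check$municipality, data.check$employed), mean)

# TRUE if the NA cells are exactly the cells with zero observations
all(is.na(means) == (counts == 0))
```

If the same comparison is made against the daply output, a mismatch
would point to the cells that ended up in the wrong position.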
    
    