[R] mean of subset of rows
John Kane
jrkrideau at yahoo.ca
Mon Oct 1 18:42:33 CEST 2007
--- darteta001 at ikasle.ehu.es wrote:
> Dear list,
> this must be an easy one:
>
> I have a data.frame of two columns, "ID" with four different levels (A
> to D) and numerical "size", and each of the 4 different IDs is repeated a
> different number of times. I would like to get the mean size for each
> ID as another data.frame. I have tried the following:
>
> >ID= as.character(unique(data[,1])) # I use unique() because "data" will be larger in future
> >nIDs = length(ID)
> >for(i in 1:nIDs){
> + subdata = subset(data,V1==ID[i])
> + average = as.data.frame(cbind(1:i,ID[i],mean(subdata[,2])))
> + }
>
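dfnames <- c("id","v1")
# name the columns with '=' inside data.frame(); '<-' there would create
# stray global variables instead of column names
mydata <- data.frame(id = as.factor(c("a","a","b","c","c","b")),
                     v1 = c(2,3,3,2,2,4))
mydata
# 'by' must be a list of grouping variables
mysums <- aggregate(mydata["v1"], by = list(id = mydata$id), FUN = mean)
names(mysums) <- dfnames
mysums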
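A possible alternative, offered only as a sketch: aggregate() also has a formula interface, and tapply() returns the same means as a named one-dimensional array. The data frame is rebuilt here so the snippet runs on its own; the names mysums2 and means are made up for illustration.

```r
# Example data, same shape as mydata above (rebuilt so this runs standalone)
mydata <- data.frame(id = factor(c("a","a","b","c","c","b")),
                     v1 = c(2,3,3,2,2,4))

# Formula interface: mean of v1 within each level of id, as a data.frame
mysums2 <- aggregate(v1 ~ id, data = mydata, FUN = mean)
mysums2

# tapply() gives one mean per level of id, named by level
means <- tapply(mydata$v1, mydata$id, mean)
means
```

The formula version keeps the column names id and v1 automatically, so no renaming step is needed afterwards.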
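I am not exactly sure what is happening in that loop,
but you have no place to store the results of each
iteration: average is overwritten on every pass, so
only the last ID is left.
The loop below should work, but you are much better off
using the aggregate() function. Explicit for loops are
generally discouraged in R when a built-in function will
do the job. Good luck.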
data <- mydata
ID <- as.character(unique(data[,1]))
nIDs <- length(ID)
average <- numeric(nIDs)  # one slot per ID to hold the results
for(i in seq_len(nIDs)){
  subdata <- subset(data, id==ID[i])
  average[i] <- mean(subdata[,2])  # store this ID's mean in its own slot
}
average
newdata <- data.frame(ID,average)
names(newdata) <- dfnames
newdata
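To illustrate the storage point with something closer to the original question, here is a minimal sketch that uses vapply() to collect one mean per ID instead of reassigning a single variable. The data values and the names dat, ids, avg and result are invented for the example; only the column layout (an ID column and a numeric size column) comes from the question.

```r
# Made-up data: IDs A to D, each repeated a different number of times
dat <- data.frame(id = c("A","A","B","C","C","D"),
                  size = c(10, 20, 30, 40, 60, 5),
                  stringsAsFactors = FALSE)

ids <- unique(dat$id)

# vapply() calls the function once per ID and returns all results together,
# so nothing is overwritten between iterations
avg <- vapply(ids, function(k) mean(dat$size[dat$id == k]), numeric(1))

result <- data.frame(ID = ids, size = unname(avg))
result
```

This gives one row per ID, which is the shape the question asked for.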
> Unfortunately, my output only gets the last level of ID four times:
> >average
> V1 V2 V3
> 1 1 D 179.777777777778
> 2 2 D 179.777777777778
> 3 3 D 179.777777777778
> 4 4 D 179.777777777778
>
> How can I get what I need? There might be an easier way to do it, but
> I guess my skills aren't that good. Any suggestions are welcome.
>
> Regards,
>
> David