[R] mean of subset of rows
Jeffrey Robert Spies
jspies at nd.edu
Mon Oct 1 18:42:50 CEST 2007
You were on the right track with the for loop, but often you can do
the same thing looplessly (I know, it's not really a word) in R:
If your data is like this:
data<-data.frame(ID=rep(letters[1:4], 5), size=runif(20))
then apply either
tapply(data$size, data$ID, mean)
or
aggregate(data$size, list(data$ID), mean)
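As a rough sketch of what the two calls give back, assuming the example data above: tapply returns a named array, while aggregate returns a data.frame, which is the form asked for. The fixed seed and the names means_vec and means_df are illustrative choices, not part of the original question.

```r
set.seed(1)  # assumption: fixed seed, only so the example is reproducible
data <- data.frame(ID = rep(letters[1:4], 5), size = runif(20))

# tapply returns a named array with one mean per level of ID
means_vec <- tapply(data$size, data$ID, mean)

# aggregate returns a data.frame; the formula interface keeps the
# original column names (ID, size) in the result
means_df <- aggregate(size ~ ID, data = data, FUN = mean)

print(means_vec)
print(means_df)
```

If a data.frame is needed from the tapply result instead, something like data.frame(ID = names(means_vec), size = as.vector(means_vec)) should do the conversion.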
For further reference, section 4.2 in "An Introduction to R"
describes using tapply in this way.
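If you do want to keep the loop, here is one way it could be repaired. The loop in the quoted message assigns a new value to average on every pass, so only the last ID survives, and cbind(1:i, ...) recycles that single ID over i rows. This sketch assumes the same example data as above; the column name mean_size is a hypothetical choice.

```r
set.seed(1)  # assumption: fixed seed, only so the example is reproducible
data <- data.frame(ID = rep(letters[1:4], 5), size = runif(20))

ids <- unique(as.character(data$ID))

# Preallocate the result so each iteration fills in its own row
# instead of overwriting the whole object
average <- data.frame(ID = ids, mean_size = NA_real_)

for (i in seq_along(ids)) {
  average$mean_size[i] <- mean(data$size[data$ID == ids[i]])
}

print(average)
```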
Jeff.
On Oct 1, 2007, at 11:57 AM, <darteta001 at ikasle.ehu.es> wrote:
> Dear list,
> this must be an easy one:
>
> I have a data.frame of two columns, "ID" with four different levels (A
> to D) and numerical "size", and each of the 4 different IDs is repeated
> a different number of times. I would like to get the mean size for each
> ID as another data.frame. I have tried the following:
>
>> ID= as.character(unique(data[,1])) # I use unique() because "data" will be larger in future
>> nIDs = length(ID)
>> for(i in 1:nIDs){
> + subdata = subset(data,V1==ID[i])
> + average = as.data.frame(cbind(1:i,ID[i],mean(subdata[,2])))
> + }
>
> Unfortunately, my output only gets the last level of ID four times:
>> average
> V1 V2 V3
> 1 1 D 179.777777777778
> 2 2 D 179.777777777778
> 3 3 D 179.777777777778
> 4 4 D 179.777777777778
>
> How can I get what I need? There might be an easier way to do it, but
> I guess my skills aren't that good. Any suggestions are welcome.
>
> Regards,
>
> David