[R] aggregate.data.frame with NAs and different types
Spencer Graves
spencer.graves at structuremonitoring.com
Mon May 13 00:30:58 CEST 2013
Hi, Arun: Thanks. That's exactly what I need. Spencer
fortune(298)
Don't do as I say, do as Hadley does.
-- Barry Rowlingson (in a discussion about the workflow for writing R
packages, see also fortune(128))
R-devel (September 2011)
On 5/12/2013 2:25 PM, arun wrote:
>
> Hi,
>
> Try:
> library(plyr)
> res1 <- ddply(df2aggregate, .(id), summarize,
>               x = sum(x), y = mean(y), a = head(a, 1))
> res1
> #   id  x   y    a
> # 1  a  3  NA <NA>
> # 2  b  7 2.5    A
> # 3  c 11 4.5    C
> # 4  d NA  NA    E
> res1$x <- as.numeric(res1$x)
> identical(ag1.2, res1)
> # [1] TRUE
> A.K.
>
>
> ----- Original Message -----
> From: Spencer Graves <spencer.graves at structuremonitoring.com>
> To: R list <R-help at r-project.org>
> Cc:
> Sent: Sunday, May 12, 2013 4:54 PM
> Subject: [R] aggregate.data.frame with NAs and different types
>
> Hello:
>
>
> Do you have suggestions for how to aggregate a data.frame using
> different functions on different columns?
>
>
> Consider the following example:
>
>
> df2aggregate <- data.frame(id = rep(letters[1:4], each = 2),
>                            x  = c(1:6, NA, NA),
>                            y  = c(NA, 1:6, NA),
>                            a  = c(NA, NA, LETTERS[1:6]),
>                            stringsAsFactors = FALSE)
>
> # Desired output:
>
> ag1.2 <- data.frame(id = letters[1:4],
>                     x  = c(3, 7, 11, NA),
>                     y  = c(NA, 2.5, 4.5, NA),
>                     a  = c(NA, 'A', 'C', 'E'),
>                     stringsAsFactors = FALSE)
>
>
> I'm thinking of writing a function Aggregate(x, by, FUN, ...),
> where x = data.frame, by = vector of names of columns of x, and FUN =
> function that would accept as input a data.frame subset of x and would
> return a data.frame FUNout, which would be combined using cbind(x[, by],
> FUNout), then rbind over all such subset data.frames. However, before I
> write this, I'd like to make sure it doesn't already exist. My current
> plan is to add it to the Ecdat package.
>
>
> Suggestions? Should I study "plyr"? fortune(298) ;-)
>
>
> Thanks,
> Spencer
>
>
> p.s. library(sos); findFn('aggregate.data.frame') returned 4 matches,
> none of which seemed to solve this problem. findFn('aggregate
> data.frame') returned 133 matches in 71 packages. findFn('aggregate')
> returned 734 matches in 282 packages. I failed to find anything useful
> in the latter two and with other attempts using RSiteSearch, except for
> a reference to plyr.
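
For comparison with the plyr answer, the Aggregate(x, by, FUN, ...) helper described in the original message could be sketched in base R roughly as follows. This is only an illustration under stated assumptions: the function body, the names `groups`, `pieces`, `key`, and `res`, and the choice of split()/lapply() are hypothetical and are not taken from Ecdat or any other package. It also assumes the `by` columns contain no NAs, since split() drops rows whose grouping value is NA.

```r
# Example data and desired result, copied from the original message.
df2aggregate <- data.frame(id = rep(letters[1:4], each = 2),
                           x  = c(1:6, NA, NA),
                           y  = c(NA, 1:6, NA),
                           a  = c(NA, NA, LETTERS[1:6]),
                           stringsAsFactors = FALSE)
ag1.2 <- data.frame(id = letters[1:4],
                    x  = c(3, 7, 11, NA),
                    y  = c(NA, 2.5, 4.5, NA),
                    a  = c(NA, 'A', 'C', 'E'),
                    stringsAsFactors = FALSE)

# Hypothetical sketch of Aggregate(): x is a data.frame, by is a character
# vector of column names of x, and FUN maps a data.frame subset of x to a
# data.frame of summaries.
Aggregate <- function(x, by, FUN, ...) {
  # Group the row indices by the 'by' columns; drop = TRUE discards
  # combinations of levels that never occur.
  groups <- split(seq_len(nrow(x)), x[by], drop = TRUE)
  pieces <- lapply(groups, function(rows) {
    sub <- x[rows, , drop = FALSE]
    FUNout <- FUN(sub, ...)
    # Take the grouping values from the first row of the subset.
    key <- sub[1, by, drop = FALSE]
    rownames(key) <- NULL
    cbind(key, FUNout)
  })
  # Stack the per-group results and reset the row names.
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

# A different summary for each column, as in the ddply() call.
res <- Aggregate(df2aggregate, "id", function(d) {
  data.frame(x = as.numeric(sum(d$x)),  # as.numeric: sum() of integers is integer
             y = mean(d$y),
             a = d$a[1],
             stringsAsFactors = FALSE)
})
all.equal(res, ag1.2)
```

Because FUN receives the whole subset as a data.frame, it can mix summaries of different types (numeric sums, means, first character value) in a single pass, which is the same thing the summarize call does inside ddply().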