[R] How to run lm for each subset of the data frame, and then aggregate the result?
David Winsemius
dwinsemius at comcast.net
Sun May 19 18:19:01 CEST 2013
On May 19, 2013, at 5:31 AM, CHEN, Cheng wrote:
> Hi gurus,
>
> I have a big data frame df, with columns named as :
>
> age, income, country
>
> what I want to do is very simple actually: do
>
> fitFunc <- function(thisCountry){
>     subframe <- df[which(country == thisCountry), ];
>     fit <- lm(income ~ 0 + age, data = subframe);
>     return(coef(fit));
> }
>
> for each individual country, then aggregate the result into a new data
> frame that looks like:
>
> countryname, coeffname1 USA 1.22 GB
> 1.03 France 1.1
>
> I tried to do :
> do.call("rbind", lapply(countries, fitFunc))
>
This suggests you have used 'attach' on df. Not a safe practice.
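A safer pattern is to subset df itself inside the function, so nothing needs to be attached. This is an untested sketch, using the column names (age, income, country) from your description:

fitFunc <- function(thisCountry, data = df) {
    ## subset the data frame explicitly instead of relying on attach()
    subframe <- data[data$country == thisCountry, ]
    fit <- lm(income ~ 0 + age, data = subframe)
    coef(fit)
}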
> but this only gives something like:
>
> age
> [1,] 2.540879
> [2,] 2.428830
> [3,] 2.369560
> How should I proceed?
That is exactly the sort of result I would have expected from your procedure. We cannot tell how what you want differs from what you got. For one thing, you are posting in HTML, so the "aggregate" result above is mangled. I'm guessing it might have been:
countryname, coeffname1
USA 1.22
GB 1.03
France 1.1
So perhaps the only thing that is missing is the row names?
countries <- unique(df$country)   # one entry per country
res <- do.call("rbind", lapply(countries, fitFunc))
rownames(res) <- as.character(countries)
res
If you had wanted a data frame to be returned, you could do this with the 'by' function, or you could return a list containing the country along with the coefficient (rather than a plain numeric vector) from your 'fitFunc' calls; rbind-ing a list of lists gives something that is easily coerced to a data frame. (But I have no data to test these theories.)
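Something along these lines might work with 'by', though it is untested without your data (again assuming columns age, income, country):

fits <- by(df, df$country,
           function(subframe) coef(lm(income ~ 0 + age, data = subframe)))
res <- data.frame(countryname = names(fits),
                  coeffname1  = as.vector(unlist(fits)))
res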
>
> [[alternative HTML version deleted]]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> and provide commented, minimal, self-contained, reproducible code.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
--
David Winsemius
Alameda, CA, USA