[R] apply formula over columns by subset of rows in a dataframe (to get a new dataframe)
    Massimo Bressan 
    massimo.bressan at arpa.veneto.it
       
    Sat May 14 10:44:54 CEST 2016
    
    
  
Thank you, what a nice compact solution with ave()!
I learned something new about the subtleties of R.
Let me summarize the alternative solutions here, in case someone else might be interested...
Thanks, bye
# 
# my user function (an example): min-max normalisation to the range [0, 1]
mynorm <- function(x) {(x - min(x, na.rm=TRUE))/(max(x, na.rm=TRUE) - min(x, na.rm=TRUE))}
# my dataframe, to which the formula is applied block by block
mydf <- data.frame(blocks=rep(c("a","b","c"),each=5), v1=round(runif(15,10,25),0), v2=round(rnorm(15,30,5),0))
# grouping variable (blocks) to be used for splitting
b <- mydf$blocks
# 1 - split-lapply-unsplit with an anonymous function to return a new df
s <- split(mydf, b)
l <- lapply(s, function(x) data.frame(x, v1.mod=mynorm(x$v1)))
mydf_new <- unsplit(l, b)
# 2 - split-lapply-unsplit with the function transform() to return a new df
l <- split(mydf, b)
l <- lapply(l, transform, v1.mod = mynorm(v1))
mydf_new <- unsplit(l, b)
# 3 - ave(), which encapsulates the split-lapply-unsplit approach
mydf_new <- transform(mydf, v1.mod = ave(v1, blocks, FUN=mynorm))
# 
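As a quick check on the summary above, the three approaches can be run side by side and their new columns compared. This is only a minimal sketch: the fixed seed and the names res1, res2, res3, same12 and same23 are my own additions for illustration and were not part of the thread.

```r
# Min-max normalisation, as defined in the post
mynorm <- function(x) {
  (x - min(x, na.rm = TRUE)) / (max(x, na.rm = TRUE) - min(x, na.rm = TRUE))
}

set.seed(42)  # assumption: fixed seed, only so that the example is reproducible
mydf <- data.frame(blocks = rep(c("a", "b", "c"), each = 5),
                   v1 = round(runif(15, 10, 25), 0),
                   v2 = round(rnorm(15, 30, 5), 0))
b <- mydf$blocks

# 1 - split / lapply with an anonymous function / unsplit
res1 <- unsplit(lapply(split(mydf, b),
                       function(x) data.frame(x, v1.mod = mynorm(x$v1))), b)
# 2 - split / lapply with transform() / unsplit
res2 <- unsplit(lapply(split(mydf, b), transform, v1.mod = mynorm(v1)), b)
# 3 - ave()
res3 <- transform(mydf, v1.mod = ave(v1, blocks, FUN = mynorm))

# Compare the normalised column across the three results
same12 <- isTRUE(all.equal(res1$v1.mod, res2$v1.mod))
same23 <- isTRUE(all.equal(res2$v1.mod, res3$v1.mod))
print(c(same12, same23))
```

Note that solution 2 relies on transform() finding mynorm through the calling environment, so it is run here at top level, as in the original post.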
From: "William Dunlap" <wdunlap at tibco.com> 
To: "Massimo Bressan" <massimo.bressan at arpa.veneto.it> 
Cc: "David L Carlson" <dcarlson at tamu.edu>, "r-help" <r-help at r-project.org> 
Sent: Friday, 13 May 2016 19:22:21 
Subject: Re: [R] apply formula over columns by subset of rows in a dataframe (to get a new dataframe) 
ave() encapsulates the split/lapply/unsplit stuff so 
transform(mydf, v1.mod = ave(v1, blocks, FUN=mynorm)) 
also gives what you got above. 
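To illustrate what "encapsulates" means here, the same result can be obtained by hand with split() and its replacement form. This is a small sketch on a made-up vector (the values of v1 and blocks and the name out are illustrative assumptions), not a copy of the ave() source.

```r
# Min-max normalisation, as defined in the thread
mynorm <- function(x) {
  (x - min(x, na.rm = TRUE)) / (max(x, na.rm = TRUE) - min(x, na.rm = TRUE))
}

# Illustrative data: two interleaved groups
v1     <- c(10, 1, 20, 2, 30, 3)
blocks <- c("a", "b", "a", "b", "a", "b")

# Split by group, apply the function, then write the pieces back
# into their original positions with the replacement form of split()
out <- v1
split(out, blocks) <- lapply(split(v1, blocks), mynorm)
print(out)  # 0.0 0.0 0.5 0.5 1.0 1.0

# ave() gives the same values in the same (original) order
print(all.equal(out, ave(v1, blocks, FUN = mynorm)))
```

The point is that the result has the same length and order as the input, which is why it can be assigned directly as a new column inside transform().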
Bill Dunlap 
TIBCO Software 
wdunlap tibco.com 
On Fri, May 13, 2016 at 7:44 AM, Massimo Bressan < massimo.bressan at arpa.veneto.it > wrote: 
yes, thanks 
you pointed me in the right direction: split/unsplit was the trick 
I had completely overlooked that possibility! 
here is the final version 
############ 
mynorm <- function(x) {(x - min(x, na.rm=TRUE))/(max(x, na.rm=TRUE) - min(x, na.rm=TRUE))} 
mydf<-data.frame(blocks=rep(c("a","b","c"),each=5), v1=round(runif(15,10,25),0), v2=round(rnorm(15,30,5),0)) 
g <- mydf$blocks 
l <- split(mydf, g) 
l <- lapply(l, transform, v1.mod = mynorm(v1)) 
mydf_new <- unsplit(l, g) 
############ 
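Since the subject line asks about applying the formula over columns, here is one possible extension that normalises several columns per block in one go. It is a sketch under my own assumptions: the vector cols, the ".mod" suffix for the new names and the fixed seed are illustrative choices, and ave() is used in place of the explicit split/lapply/unsplit.

```r
# Min-max normalisation, as defined in the thread
mynorm <- function(x) {
  (x - min(x, na.rm = TRUE)) / (max(x, na.rm = TRUE) - min(x, na.rm = TRUE))
}

set.seed(1)  # assumption: fixed seed, only for reproducibility
mydf <- data.frame(blocks = rep(c("a", "b", "c"), each = 5),
                   v1 = round(runif(15, 10, 25), 0),
                   v2 = round(rnorm(15, 30, 5), 0))

cols <- c("v1", "v2")  # hypothetical choice of columns to normalise

# Apply mynorm to each chosen column, block by block
mod <- lapply(mydf[cols], function(x) ave(x, mydf$blocks, FUN = mynorm))
names(mod) <- paste0(cols, ".mod")

# Bind the normalised columns to the original data frame
mydf_new <- cbind(mydf, as.data.frame(mod))
print(head(mydf_new))
```

If every value in a block is identical, the denominator in mynorm is zero and the result for that block is NaN; the sketch does not guard against this.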
thanks again 
massimo 
-- 
------------------------------------------------------------ 
Massimo Bressan 
ARPAV 
Agenzia Regionale per la Prevenzione e 
Protezione Ambientale del Veneto 
Dipartimento Provinciale di Treviso 
Via Santa Barbara, 5/a 
31100 Treviso, Italy 
tel: +39 0422 558545 
fax: +39 0422 558516 
e-mail: massimo.bressan at arpa.veneto.it 
------------------------------------------------------------ 