[R] use Vectorized function as range of for statement
    Zhang Weiwu 
    zhangweiwu at realss.com
       
    Thu Aug  1 18:38:01 CEST 2013
    
    
  
I guess this has been discussed before, but I don't know the name of this 
problem, so I have to ask again.
Consider this scenario:
> fun <- function(x) { print(x)}
> for (i in Vectorize(fun, "x")(1:3)) print("OK")
[1] 1
[1] 2
[1] 3
[1] "OK"
[1] "OK"
[1] "OK"
The behaviour I would like to see is:
> fun <- function(x) { print(x)}
> for (i in Vectorize(fun, "x")(1:3)) print("OK")
[1] 1
[1] "OK"
[1] 2
[1] "OK"
[1] 3
[1] "OK"
That is, each iteration of the vectorized function should yield one result to 
the 'for' statement, rather than all results being collected beforehand.
The intention of such a pattern is to separate the data-generation logic 
from the data-processing logic.
The latter mechanism, I think, is more efficient because it doesn't cache 
all data before processing -- and the interpreter has the sure knowledge 
that caching is not needed, since the vectorized function is not used in 
assignment but as a range.
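For reference, the two orderings can be reproduced without relying on printed output. This is only a minimal sketch: run_eager and run_interleaved are hypothetical helper names, and the second one simply calls the function inside the loop body instead of in the range. It shows that R evaluates the whole range expression of 'for' before the first iteration, so interleaving currently requires moving the call into the loop.

```r
# Record events in order instead of printing them.
run_eager <- function() {
  events <- character(0)
  fun <- function(x) {
    events <<- c(events, paste("fun", x))
    x
  }
  # The range expression is evaluated completely before the loop starts.
  for (i in Vectorize(fun, "x")(1:3)) events <- c(events, "OK")
  events
}

run_interleaved <- function() {
  events <- character(0)
  fun <- function(x) {
    events <<- c(events, paste("fun", x))
    x
  }
  # Calling fun() inside the body produces one value per iteration.
  for (i in 1:3) {
    fun(i)
    events <- c(events, "OK")
  }
  events
}

eager <- run_eager()
interleaved <- run_interleaved()
```

Here eager holds the three "fun" events followed by three "OK" events, while interleaved alternates between them.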
The difference may seem trivial, but this pseudocode shows a case where it is not:
readSample <- function(x) {
 	....
 	sampling_time <- readBin(con, integer(), 1, size=4)
 	sample_count <- readBin(con, integer(), 1, size=2)
 	samples <- readBin(con, numeric(), sample_count, size=4)
 	....
 	matrix # return a big matrix representing a sample
}
for (sample in Vectorize(readSample, "x")(1:10000)) {
 	# process sample
}
The data file is a few gigabytes, so holding all of it in memory is costly. 
Not having to cache it would make a real difference.
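One possible way to keep generation and processing separate without caching is a closure that hands out one sample per call. The following is a rough sketch under stated assumptions, not a tested solution for the real file: read_sample is a hypothetical stand-in for readSample that returns a tiny matrix, and make_sample_iterator is a hypothetical helper name.

```r
# Hypothetical stand-in for readSample(): returns a small matrix for index i.
read_sample <- function(i) matrix(i, nrow = 2, ncol = 2)

# Build an iterator: each call produces the next sample, and NULL is
# returned once n samples have been produced.
make_sample_iterator <- function(n) {
  i <- 0
  function() {
    if (i >= n) return(NULL)
    i <<- i + 1
    read_sample(i)
  }
}

next_sample <- make_sample_iterator(3)
totals <- numeric(0)
while (!is.null(current <- next_sample())) {
  # Process one sample; only the current matrix is held at a time.
  totals <- c(totals, sum(current))
}
```

The loop body only ever sees one matrix, so memory use does not grow with the number of samples, at the cost of using 'while' rather than 'for'.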
This email asks: 1. whether this is a valid need for the language; 2. whether 
there is an alternative design pattern to work around it; 3. where the proper 
place to discuss this is.
Thanks and best...
    
    