[R] curiosity: next-gen x86 processors and FP32?
ivo welch
ivo.welch at anderson.ucla.edu
Sun May 26 07:43:51 CEST 2013
dear R experts:
although my question may be better asked on the HPC R mailing list, it
is really about something that average R users who don't plan to write
clever HPC-optimized code would care about: is there a quantum
performance leap on the horizon with CPUs?
like most average non-HPC R users, I want to stick mostly to
mainstream R, often with library parallel but that's it. I like R to
be fast and effortless. I don't want to have to rewrite my code
greatly to take advantage of my CPU. the CUDA back-and-forth on the
memory, which requires code rewrites, makes CUDA not too useful for me.
in fact, I don't even like setting up computer clusters. I run code
only on my single personal machine.
now, I am looking at the two upcoming processors---intel haswell (next
month) and amd kaveri (end of year). does either of them have the
potential to be a quantum leap for R without complex code rewrites?
I presume that any quantum leaps would have to come from R using a
different numerical vector "engine". (I tried different compiler
optimizations, such as AVX, when compiling R on the 1-year-old i7-27*,
but it did not really make a difference in basic R benchmarks, such as
simple OLS calculations. I thought AVX would provide a faster vector
engine, but something didn't really compute here. pun intended.)
I would guess that haswell will be a nice small evolutionary step
forward. 5-20%, perhaps. but nothing like a factor of 2.
[tomshardware details how intel FP32 math is 4 times as fast as double
math on the i7 architecture. for most of my applications, a 4 times
speedup at a sacrifice in precision would be worth it. R seems to use
only doubles---as.single does not even convert to single, much
less induce single-precision calculations. so I guess this is
a no-go. correct? ]
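to put a rough number on the "sacrifice in precision" above, here is a small sketch. it is in Python rather than R, because R has no native single-precision type to compute with. it emulates FP32 arithmetic by rounding every intermediate result to 32 bits. the helper names to_fp32 and dot, and the test vectors, are made up for illustration. it says nothing about speed, only about accuracy.

```python
import struct

def to_fp32(x):
    """Round a Python float (FP64) to the nearest FP32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def dot(xs, ys, single=False):
    """Dot product; with single=True every intermediate is rounded to FP32."""
    total = 0.0
    for x, y in zip(xs, ys):
        if single:
            prod = to_fp32(to_fp32(x) * to_fp32(y))
            total = to_fp32(total + prod)
        else:
            total += x * y
    return total

# FP32 carries a 24-bit significand, so 2**24 + 1 is not representable.
print(to_fp32(16777217.0))  # 16777216.0

# Illustrative data only: a long accumulation of identical small terms.
n = 100_000
xs = [0.1] * n
ys = [0.1] * n
exact = n * 0.01

err64 = abs(dot(xs, ys) - exact) / exact
err32 = abs(dot(xs, ys, single=True) - exact) / exact
print(f"relative error, FP64: {err64:.2e}")
print(f"relative error, FP32: {err32:.2e}")
```

on a long accumulation like this I would expect the FP32 error to come out several orders of magnitude above the FP64 one. whether that is tolerable depends on the application.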
kaveri's hUMA on the other hand could be a quantum leap. kaveri could
have the GPU transparently offer common standard built-in vector
operations that we use in R, i.e., improve the speed of many programs
without the need for a rewrite, by a factor of 5? hard to believe,
but it would seem that AMD would actually beat Intel for R users. a big
turnaround, given their recent de-emphasis of FP on the CPU.
(interestingly, the amd-built Xbox One and PS4 processors were also
reported to have hUMA.)
worth waiting for kaveri? anything I can do to drastically speed up
R on intel i7 by going to FP32?
regards,
/iaw
----
Ivo Welch (ivo.welch at gmail.com)