[R] Performance & capacity characteristics of R?
Thomas Vogels
tov at infiniti.ece.cmu.edu
Tue Aug 3 16:04:46 CEST 1999
"Brian" == Prof Brian D Ripley <ripley at stats.ox.ac.uk> writes:
Brian> Can you tell us what statistical procedures need 1 million to 100s of
Brian> millions of rows (observations)? Some of us have doubted that there are
Brian> even datasets of 100,000 examples that are homogeneous and for which a
Brian> small subsample would not give all the statistical information. (If they
Brian> are not homogeneous, one could/should analyse homogeneous subsets and do a
Brian> meta-analysis.)
What if your problem is to find the outliers in a dataset? It would
be nice to examine (the homogeneous part of) the dataset and then
search for the data entries "that don't quite fit in" without having
to leave R and go to Perl or other home-grown software.
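
As a rough illustration of that kind of search done inside R, here is a minimal sketch that flags values lying far from the bulk of the data, using the median and the MAD as robust estimates of centre and spread. The helper name `flag_outliers`, the cutoff `k = 5`, and the synthetic data are assumptions made for the example, not anything from this thread.

```r
# Hypothetical helper: TRUE for values more than k robust standard
# deviations (MAD) away from the median. k = 5 is an arbitrary choice.
flag_outliers <- function(x, k = 5) {
  centre <- median(x, na.rm = TRUE)
  spread <- mad(x, na.rm = TRUE)
  abs(x - centre) > k * spread
}

# Synthetic stand-in for one measured parameter: 1000 well-behaved
# values (normal quantiles) plus two planted outliers at the end.
x <- c(qnorm(ppoints(1000)), 25, -30)

# Indices of the entries that "don't quite fit in".
outlier_idx <- which(flag_outliers(x))
outlier_idx
```

Because the median and MAD are barely affected by a few extreme values, the cutoff can be estimated from a subsample and then applied to every row, which is the step that needs the full dataset.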
I'm looking at data from experiments in the semiconductor industry.
It's not uncommon for us to have, e.g., parametric measurements
available for a large number of integrated circuits (even > 100,000),
and it would be nice to read them into R (maybe one set of
measurements at a time?).
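
One way "one set at a time" could look in a current version of R is to read the file through an open connection in fixed-size blocks and keep only running summaries. This is a sketch under assumptions: the file layout (whitespace-separated, one row per circuit, five numeric columns, no header), the temporary file, and the tiny block size are all made up for illustration.

```r
# Assumed layout: one row per circuit, one numeric column per parameter.
# A small temporary file stands in for the real measurement file.
path <- tempfile(fileext = ".txt")
write.table(matrix(1:50, ncol = 5), path,
            row.names = FALSE, col.names = FALSE)

con <- file(path, open = "r")
chunk_rows <- 4          # tiny for illustration; use thousands in practice
n_seen <- 0
col_sums <- numeric(5)

repeat {
  lines <- readLines(con, n = chunk_rows)   # next block of raw lines
  if (length(lines) == 0) break             # end of file reached
  chunk <- read.table(text = lines, header = FALSE)
  n_seen <- n_seen + nrow(chunk)
  col_sums <- col_sums + colSums(chunk)     # running per-column totals
}
close(con)

# Column means computed without holding all rows in memory at once.
col_means <- col_sums / n_seen
col_means
```

Only one block of rows is in memory at any time, so the same loop could apply an outlier rule to each block instead of accumulating sums.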
Thanks,
-tom
--
mailto:tov at ece.cmu.edu (Tom Vogels) Tel: (412) 268-6638 FAX: -3204