[R] Integer vs numeric

Henrik Bengtsson hb at stat.berkeley.edu
Wed Jan 30 14:26:47 CET 2008


On Jan 29, 2008 10:40 PM, Christophe Genolini <cgenolin at u-paris10.fr> wrote:
> x[c(2,4)] works as well

My point is that at the native-code level subsetting/enumeration
is done with integer indices, and coercing from double to integer is
always going to be less efficient than working directly with integers.
That's likely to be one of the rationales for 1:n being integer (in
addition to being smaller in size).
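
A minimal sketch of the type and size difference, assuming a standard R
session. The names n, ints and dbls are illustrative and not part of the
original discussion:

```r
n <- 1e6

# 1:n yields integer storage, while c(2, 4) yields doubles.
type_colon <- typeof(1:n)
type_c     <- typeof(c(2, 4))

# Allocate one vector of each storage type with the same length.
ints <- integer(n)   # 4 bytes per element
dbls <- numeric(n)   # 8 bytes per element

size_int <- as.numeric(object.size(ints))
size_dbl <- as.numeric(object.size(dbls))

# For large n the ratio approaches 2 (the small header is negligible).
size_ratio <- size_dbl / size_int
print(c(type_colon, type_c))
print(size_ratio)
```

Both kinds of index work for subsetting; the difference is in the storage
type and the size, not in what the user sees.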

Also, with the coercion as.integer(xs), where xs is a vector of doubles,
xs and its integer copy together take up *three* times the memory of the
integer vector alone (8 + 4 bytes per element versus 4), and the copy
just adds extra work for the garbage collector.
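
The memory arithmetic behind the coercion can be checked roughly as
follows: a double vector and its integer copy together occupy about three
times the size of the integer vector alone. This is only a sketch; the
names xs, idx and the byte variables are illustrative assumptions:

```r
xs  <- rep(1.5, 1e6)    # a vector of doubles, 8 bytes per element
idx <- as.integer(xs)   # a newly allocated integer copy, 4 bytes per element

bytes_xs  <- as.numeric(object.size(xs))
bytes_idx <- as.numeric(object.size(idx))

# While both vectors are alive, the total is about 3x the integer vector.
total_vs_int <- (bytes_xs + bytes_idx) / bytes_idx
print(total_vs_int)
```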

Finally, working with doubles is not precision safe (there are many
threads with various flavors on the same topic).  Example:

> xs <- seq(1,1.2,by=0.01);
> print(xs);
 [1] 1.00 1.01 1.02 1.03 1.04 1.05 1.06
 [8] 1.07 1.08 1.09 1.10 1.11 1.12 1.13
[15] 1.14 1.15 1.16 1.17 1.18 1.19 1.20

> ys <- as.integer(100*xs);
> print(ys);
 [1] 100 101 102 103 104 105 106 107 108
[10] 109 110 111 112 112 114 114 115 117
[19] 118 119 120 121 122 123 124 125 126
[28] 127 128 129 130

Pay attention to elements 13:18(!) - subsetting using doubles is not safe.
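
One common way to sidestep the truncation is to round before coercing.
The following is a sketch under that assumption, not the only possible
remedy; the names truncated and rounded are illustrative:

```r
xs <- seq(1, 1.2, by = 0.01)

# as.integer() truncates toward zero, so a value stored just below an
# integer drops to the integer beneath it.
truncated <- as.integer(100 * xs)

# Rounding to the nearest whole number first avoids that.
rounded <- as.integer(round(100 * xs))

# Positions where representation error changed the truncated result.
differs <- which(truncated != rounded)
print(differs)

ok <- identical(rounded, 100:120)
print(ok)
```

The same caution applies to indexing: a double index is truncated, not
rounded, when it is used for subsetting.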

/H

>
> Henrik Bengtsson wrote:
>
> > x[1:n]
> >
> > /H
> >
> > On Jan 29, 2008 5:07 AM,  <cgenolin at u-paris10.fr> wrote:
> >
> >> Seems strange to me to define an operator relative to a very special case.
> >> I have to admit that I do not use 1:1e7 every day :-)
> >>
> >> Wouldn't it be more appropriate to define a numeric a:b operator (that
> >> is, one preserving the initial class of a and b) and, in the specific
> >> cases that need optimization, change the type?
> >>
> >> for (i in as.integer(1:1e7))
> >>
> >> That might appear to be a minor point, but when using S4, as far as I
> >> know, if you define a class that can take either 1:3 or c(1,3,4), one
> >> is integer, the other numeric, and one of those will not be accepted
> >> by the class...
> >>
> >> Christophe
> >>
> >>
> >>
> >>
> >>> On 28-Jan-08 22:40:02, Peter Dalgaard wrote:
> >>>
> >>>> [...]
> >>>> AFAIR, space is/was more of an issue. If you do something like
> >>>>
> >>>> for (i in 1:1e7)
> >>>>     some.silly.simulation()
> >>>>
> >>>> then you have 40 MB sitting there doing nothing, and 80 MB if
> >>>> it had been floating point.
> >>>>
> >>> Hmmm ... there's something to be said for good old
> >>>
> >>>  for(i=1,i<=1e7,i++){....}
> >>>
> >>> As pointed out in ?"for", when you do
> >>>
> >>>  for(i in X){...}  #(e.g. X=(1:1e7))
> >>>
> >>> the object X is created (or is already there) in full
> >>> at the start and sits there, as you say doing nothing,
> >>> until you end the loop. Whereas the C code just keeps
> >>> track of i and of the condition.
> >>>
> >>> At least on a couple of my machines (64MB and 184MB RAM)
> >>> knocking out 40MB would inflict severe trauma! Let alone 80MB.
> >>> Mind you, the little one is no longer allowed to play with
> >>> big boys like R, though the other one is still used for
> >>> moderate-sized games.
> >>>
> >>> Would there be much of a time penalty in implementing
> >>> a 'for' loop, C-style, as
> >>>
> >>>  i<-1
> >>>  while(i<=1e7){
> >>>    ...
> >>>    i<-i+1
> >>>  }
> >>>
> >>> ??
> >>>
> >>> It looks as though there might be:
> >>>
> >>>  system.time(for(i in (1:1e7)) x<-cos(3) )
> >>>  #[1] 13.521  0.132 13.355  0.000  0.000
> >>>  system.time({i<-1;while(i<=1e7){x<-cos(3);i<-i+1}})
> >>>  #[1] 38.270  0.076 37.629  0.000  0.000
> >>>
> >>> which suggests that the latter is about 3 times as slow.
> >>> (And no, this wasn't done on either of my puny babes).
> >>>
> >>> (And this isn't the first time I've wished for an R
> >>> implementation of "++" as a CPU-level incrementation,
> >>> as opposed to the R-arithmetic implementation which
> >>> treats "adding 1 to a variable" as a full-dress
> >>> arithmetic parade!)
> >>>
> >>> Best wishes,
> >>> Ted.
> >>>
> >>> --------------------------------------------------------------------
> >>> E-Mail: (Ted Harding) <Ted.Harding at manchester.ac.uk>
> >>> Fax-to-email: +44 (0)870 094 0861
> >>> Date: 28-Jan-08                                       Time: 23:34:52
> >>> ------------------------------ XFMail ------------------------------
> >>>
> >>>
> >>


