[R] complexity of operations in R

Wed Jul 18 11:32:17 CEST 2012

On 18/07/2012 09:49, Rui Barradas wrote:
> Hello,
>
> Em 18-07-2012 09:06, Patrick Burns escreveu:
>> It looks to me like the following
>> should do what you want:
>>
>> f2 <- function(dotot) array(FALSE, c(dotot, 1))
>>
>> What am I missing?
>
> That matrix is even faster?

Depends on your unstated version of R .... Not so in current R (>= 
2.15.1 patched)

> f2 <- function(dotot) array(FALSE, c(dotot, 1))
> f3 <- function(dotot) matrix(FALSE, dotot, 1)

Using an integer constant for an integer helps if you do this often 
enough (1L, 1000L)

>
> t2 <- system.time(replicate(1e4, f2(1000)))
> t3 <- system.time(replicate(1e4, f3(1000)))
> rbind(t2, t3, t2/t3)
>
> Rui Barradas
>>
>> Pat
>>
>> On 17/07/2012 21:58, Johan Henriksson wrote:
>>> thanks for the link! I should read it through. that said, I didn't find
>>> any good general solution to the problem so here I post some attempts
>>> for general input. maybe someone knows how to speed this up. both my
>>> solutions are theoretically O(n) for creating a list of n elements. The
>>> function to improve is O(n^2) which should suck tremendously - but the
>>> slow execution of R probably blows up the constant factor of the smarter
>>> solutions.
>>>
>>> Array doubling comes close in speed for large lists but it would be
>>> great if it could be comparable for smaller lists. One hidden cost I see
>>> directly is that allocating a list in R is O(n), not O(1) (or close),
>>> since it always fills it with values. Is there a way around this? I
>>> guess by using C, one could just malloc() and leave the content
>>> undefined - but is there no better way?
>>>
>>> thanks,
>>> /Johan
>>>
>>>
>>> ################################
>>> # the function we wish to improve
>>>
>>> f<-function(dotot){
>>>    v<-matrix(ncol=1,nrow=0)
>>>    for(i in 1:dotot){
>>>      v<-rbind(v,FALSE)
>>>    }
>>>    return(v)
>>> }
>>>
>>> ##########################
>>> # first attempt: linked lists
>>>
>>> emptylist <- NA
>>>
>>> addtolist <- function(x,prev){
>>>    return(list(x,prev))
>>> }
>>>
>>> g<-function(dotot){
>>>    v<-emptylist
>>>    for(i in 1:dotot){
>>>      v<-addtolist(FALSE,v)
>>>    }
>>>    return(v)
>>> }
>>>
>>> ####################################
>>> # second attempt: array doubling
>>>
>>> emptyexpandlist<-list(nelem=0,l=matrix(ncol=1,nrow=0))
>>>
>>> addexpandlist<-function(x,prev){
>>>    if(nrow(prev$l)==prev$nelem){
>>>      nextsize<-max(nrow(prev$l),1)
>>>      prev$l<-rbind(prev$l,matrix(ncol=1,nrow=nextsize))
>>>    }
>>>    prev$nelem<-prev$nelem+1
>>>    prev$l[prev$nelem]<-x
>>>    return(prev)
>>> }
>>>
>>> compressexpandlist<-function(prev){
>>>    return(as.vector(prev$l[1:prev$nelem]))
>>> }
>>>
>>> h<-function(dotot){
>>>    v<-emptyexpandlist
>>>    for(i in 1:dotot){
>>>      v<-addexpandlist(FALSE,v)
>>>    }
>>>    return(compressexpandlist(v))
>>> }
>>>
>>> #########################################
>>>
>>> dotot=100000
>>> system.time(f(dotot))
>>> #system.time(g(dotot))
>>> system.time(h(dotot))
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Jul 17, 2012 at 8:42 PM, Patrick Burns <pburns at pburns.seanet.com
>>> <mailto:pburns at pburns.seanet.com>> wrote:
>>>
>>>     Johan,
>>>
>>>     If you don't know 'The R Inferno', it might
>>>     help a little.  Circle 2 has an example of
>>>     how to efficiently (relatively speaking) grow
>>>     an object if you don't know the final length.
>>>
>>>     http://www.burns-stat.com/__pages/Tutor/R_inferno.pdf
>>>     <http://www.burns-stat.com/pages/Tutor/R_inferno.pdf>
>>>
>>>     If you gave a simple example of how your code
>>>     looks now and what you want it to do, then you
>>>     might get some ideas of how to improve it.
>>>
>>>
>>>     Pat
>>>
>>>
>>>     On 17/07/2012 12:47, Johan Henriksson wrote:
>>>
>>>         Hello!
>>>         I am optimizing my code in R and for this I need to know a bit
>>>         more about
>>>         the internals. It would help tremendously if someone could link
>>>         me to a
>>>         page with O()-complexities of all the operations.
>>>
>>>         In this particular case, I need something like a linked list
>>>         with O(1)
>>>         insertLast/First ability. I can't preallocate a vector since I
>>>         do not know
>>>         the final size of the list ahead of time.
>>>
>>>         The classic array-doubling trick would give me O(1) amortized
>>>         time but it's
>>>         a bit messy. The other alternative I see would be to recursively
>>>         store
>>>         lists (elem, (elem, (elem, (...)))), which I take also would
>>>         work? But I'd
>>>         rather go for a standard R solution if there is one!
>>>
>>>         cheers,
>>>         /Johan
>>>
>>>
>>>     --
>>>     Patrick Burns
>>>     pburns at pburns.seanet.com <mailto:pburns at pburns.seanet.com>
>>>     twitter: @portfolioprobe
>>>     http://www.portfolioprobe.com/__blog
>>>     <http://www.portfolioprobe.com/blog>
>>>     http://www.burns-stat.com
>>>     (home of 'Some hints for the R beginner'
>>>     and 'The R Inferno')
>>>
>>>
>>>
>>>
>>>
>>> --
>>> --
>>> -----------------------------------------------------------
>>> Johan Henriksson, PhD
>>> Karolinska Institutet
>>> Ecobima AB - Custom solutions for life sciences
>>> http://www.ecobima.se <http://www.ecobima.com> http://mahogny.areta.org
>>> http://www.endrov.net
>>>
>>> <http://www.endrov.net>
>>
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595