[Rd] internal copying in R (soon to be released R-3.1.0)
Jens Oehlschlägel
Jens.Oehlschlaegel at truecluster.com
Sun Mar 2 18:37:59 CET 2014
Dear core group,
Which operation in R is guaranteed to produce a true copy of an atomic vector,
not just a second symbol pointing to the same shared memory?
y <- x[]
#?
y <- x
y[1] <- y[1]
#?
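One way to check these two candidates empirically is a tracemem() sketch like the one below. It assumes an R build with memory profiling enabled, which capabilities("profmem") reports; the vector length of 10 is arbitrary. Each "tracemem[...]" line that R prints marks a duplication of the traced vector. Which statements print such a line may differ between R versions, so this is a way to observe the behaviour, not a guarantee about it.

```r
# Sketch: watch for duplications of x while trying both candidate idioms.
x <- integer(10)                 # arbitrary small vector for illustration

if (capabilities("profmem")) {
  tracemem(x)                    # start reporting duplications of x
  y <- x[]                       # candidate 1: empty subscript
  y <- x                         # candidate 2: bind a second symbol ...
  y[1] <- y[1]                   # ... then assign into it
  untracemem(x)                  # stop tracing x
} else {
  # Without memory profiling nothing can be observed; still run the idioms.
  y <- x[]
  y <- x
  y[1] <- y[1]
}
```

Either way, y ends up with the same values as x; the sketch only shows when a copy was made, not whether the result is still shared.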
Is there any function that returns its argument as a non-shared atomic vector
but only copies it if the argument was shared?
Given an atomic vector x, what is the best official way to find out
whether other symbols share the vector's memory? Testing NAMED() < 2 doesn't
work because .Call sets sxpinfo_struct.named to 2. It even sets it to 2
if the argument to .Call was a never-named expression!?
> named(1:3)
[1] 2
And it seems to set it permanently, so pure read access can trigger
copy-on-modify:
> x <- integer(1e8)
> system.time(x[1]<-1L)
   user  system elapsed
      0       0       0
> system.time(x[1]<-2L)
   user  system elapsed
      0       0       0
Having called .Call now leads to an unnecessary copy on the next assignment:
> named(x)
[1] 2
> system.time(x[1]<-3L)
   user  system elapsed
   0.14    0.07    0.20
> system.time(x[1]<-4L)
   user  system elapsed
      0       0       0
This happens not only with user-written functions doing read access:
> is.unsorted(x)
[1] TRUE
> system.time(x[1]<-5L)
   user  system elapsed
   0.11    0.09    0.21
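The same effect can be looked for without relying on timings. The sketch below uses tracemem() instead of system.time(), again assuming an R build with memory profiling enabled; the vector length of 1e6 is an arbitrary smaller size chosen so the script runs quickly. A "tracemem[...]" line printed at the last assignment would indicate that the read-only call in between caused a copy. Whether it appears depends on the R version and on the front end used.

```r
# Sketch: does a read-only call force a copy on the next assignment?
x <- integer(1e6)                # arbitrary size, smaller than in the timings
x[1] <- 1L                       # warm-up assignment

if (capabilities("profmem")) {
  tracemem(x)                    # report any duplication of x from here on
  x[1] <- 2L                     # expected to be in place if x is unshared
  invisible(is.unsorted(x))      # read-only access to x
  x[1] <- 3L                     # a tracemem line here signals a copy
  untracemem(x)                  # stop tracing
} else {
  # Without memory profiling, run the same steps without tracing.
  x[1] <- 2L
  invisible(is.unsorted(x))
  x[1] <- 3L
}
```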
Why don't you simply give package authors read access to
sxpinfo_struct.named in .Call (without setting it to 2)? That would give
us more control and also save some unnecessary copying. I guess that once R
switches to reference counting, the preventive increment in .Call could not
be kept anyway.
Kind regards
Jens Oehlschlägel
P.S. Please cc me on replies, as I am not a member of r-devel.
P.P.S. The function named() was tentatively defined as follows:
named <- function(x)
  .Call("R_bit_named", x, PACKAGE="bit")
#include <R.h>
#include <Rinternals.h>
/* Return the NAMED() value of x as a length-one integer vector. */
SEXP R_bit_named(SEXP x){
  SEXP ret_;
  PROTECT( ret_ = allocVector(INTSXP,1) );
  INTEGER(ret_)[0] = NAMED(x);
  UNPROTECT(1);
  return ret_;
}
> version
               _
platform       x86_64-w64-mingw32
arch           x86_64
os             mingw32
system         x86_64, mingw32
status         Under development (unstable)
major          3
minor          1.0
year           2014
month          02
day            28
svn rev        65091
language       R
version.string R Under development (unstable) (2014-02-28 r65091)
nickname       Unsuffered Consequences