[Rd] Data frames and row names
Henrik Bengtsson
hb at stat.berkeley.edu
Tue Aug 15 03:16:04 CEST 2006
In R-devel v2.4.0 NEWS:
o The 'row.names' of a data frame may be stored internally as an
integer or character vector. This can result in considerably
more compact storage (and more logical row names from rbind)
when the row.names are 1:nrow(x). However, such data frames
are not compatible with earlier versions of R: this can be
ensured by supplying a character vector as 'row.names'.
This is great.
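As a rough illustration of the NEWS item, the following sketch compares how the row names end up being stored. It assumes a reasonably current R, and the variable names are made up for the example.

```r
# Automatic row names (1:nrow(x)) versus explicitly supplied character ones.
df_auto <- data.frame(a = 1:10)
df_chr  <- data.frame(a = 1:10, row.names = letters[1:10])

# attr() exposes the row names as an integer vector in the first case
# and as a character vector in the second.
auto_is_integer  <- is.integer(attr(df_auto, "row.names"))
chr_is_character <- is.character(attr(df_chr, "row.names"))
```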
If row.names were NULL whenever they equal 1:nrow(x), the storage
would be even more compact. I noticed that the number of rows is
inferred from the row names:
> dim.data.frame
function (x)
c(length(attr(x, "row.names")), length(x))
<environment: namespace:base>
but couldn't the number of rows be inferred from the first column, if
there are no row names? I realize that this would break the case with
zero-column data frames, e.g.
> df <- data.frame(a=1:10)
> df[,-1]
NULL data frame with 10 rows.
...but maybe there is a way around that too.
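One possible way around it, purely as a sketch: take the row count from the first column when there is one, and fall back to the "row.names" attribute only for zero-column data frames. The function name dim_sketch is hypothetical, and this is not how base R implements dim.data.frame. It also assumes a zero-column data frame would still carry its row count in the row names.

```r
# Hypothetical helper, not part of base R: infer the dimensions of a
# data frame without consulting row names when a column is available.
dim_sketch <- function(x) {
  nc <- length(x)
  nr <- if (nc > 0L) {
    # NROW() handles plain vectors as well as matrix-valued columns.
    NROW(.subset2(x, 1L))
  } else {
    # Zero-column case: there is no column to measure, so fall back
    # to whatever the row names record.
    length(attr(x, "row.names"))
  }
  c(nr, nc)
}

df <- data.frame(a = 1:10)
dims_full  <- dim_sketch(df)        # one column, ten rows
dims_empty <- dim_sketch(df[, -1])  # zero columns, row count from row names
```

The fallback branch is exactly the case where dropping row names entirely would lose information, so some record of the row count would still be needed there.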
Cheers
/H