[Rd] Data frames and row names

Henrik Bengtsson hb at stat.berkeley.edu
Tue Aug 15 09:05:31 CEST 2006


On 8/14/06, Prof Brian Ripley <ripley at stats.ox.ac.uk> wrote:
> On Mon, 14 Aug 2006, Henrik Bengtsson wrote:
>
> > In R-devel v2.4.0 NEWS:
> >
> >     o The 'row.names' of a data frame may be stored internally as an
> >       integer or character vector.  This can result in considerably
> >       more compact storage (and more logical row names from rbind)
> >       when the row.names are 1:nrow(x).  However, such data frames
> >       are not compatible with earlier versions of R: this can be
> >       ensured by supplying a character vector as 'row.names'.
> >
> > This is great.
> >
> > With row.names == NULL for 1:nrow(x) the storage would be even more
> > compact.
>
> A few bytes more compact.  Some day you may get up to the next few lines
> of NEWS which say
>
>         The internal storage of row.names = 1:n just records 'n' for
>         efficiency with very long vectors.
>
> (BTW, this is four-month-old news, hence my 'some day' comment.)

What counts as a very long vector?  I would really like to see this
for short vectors too, because in my case I would have half a million
data frames with 10-20 rows each (microarray data), and that adds up
to 30-70 MB of memory.
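
A rough way to check this kind of estimate is sketched below.  It
compares one small data frame with automatic row names against the same
data frame with character row names, then scales the difference to half
a million data frames.  The variable names are made up for illustration,
and object.size() counts every string in each object separately, so the
result is only an approximate upper bound and will vary with platform
and R version.

```r
# Hypothetical sizes, taken from the use case described above.
n_rows   <- 20L    # rows per data frame (10-20 in my case)
n_frames <- 5e5    # number of data frames

# Same data, once with automatic row names, once with character row names.
df_auto <- data.frame(x = as.numeric(seq_len(n_rows)))
df_chr  <- data.frame(x = as.numeric(seq_len(n_rows)),
                      row.names = as.character(seq_len(n_rows)))

# Per-data-frame overhead attributable to storing the row names as strings.
size_auto <- as.numeric(object.size(df_auto))
size_chr  <- as.numeric(object.size(df_chr))
per_frame <- size_chr - size_auto

# Scale up to the full collection, in megabytes.
total_mb <- n_frames * per_frame / 2^20
cat("approx. overhead per data frame (bytes):", per_frame, "\n")
cat("approx. total overhead (MB):", round(total_mb, 1), "\n")
```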

Just curious: if you store only 'n', how do you tell whether the
row.names are n or 1:n?
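
For what it is worth, later R releases expose this through
.row_names_info() in base.  As far as I understand it, automatic row
names 1:n are kept there as the length-two integer vector c(NA, -n), so
the leading NA is what separates them from a one-row data frame whose
row name is n.  A minimal sketch under that assumption follows; it is
not necessarily how 2.4.0 itself stores them.

```r
# Automatic row names: the internal attribute is the compact form.
df_auto <- data.frame(a = 1:10)
internal_auto <- .row_names_info(df_auto, 0L)  # type 0: raw internal attribute
info_auto     <- .row_names_info(df_auto)      # type 1: negative means automatic
print(internal_auto)
print(info_auto)

# Character row names are stored in full, so the count is positive.
df_chr   <- data.frame(a = 1:10, row.names = letters[1:10])
info_chr <- .row_names_info(df_chr)
print(info_chr)

# A subset no longer has row names 1:n, so they are stored explicitly.
df_sub <- df_auto[c(2, 4, 6), , drop = FALSE]
print(.row_names_info(df_sub, 0L))

# The public accessors hide the difference: both report 10 rows.
stopifnot(nrow(df_auto) == 10L, nrow(df_chr) == 10L)
```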

Thanks.

/H

>
>
> >  I noticed that the number of rows is inferred from row
> > names:
> >
> > > dim.data.frame
> > function (x)
> > c(length(attr(x, "row.names")), length(x))
> > <environment: namespace:base>
> >
> > but couldn't the number of rows be inferred from the first column, if
> > there are no row names?  I realize that this would break the case with
> > zero-column data frames, e.g.
> >
> > > df <- data.frame(a=1:10)
> > > df[,-1]
> > NULL data frame with 10 rows.
> >
> > ...but maybe there is a way around that too.
>
> Yes, see above.
>
> --
> Brian D. Ripley,                  ripley at stats.ox.ac.uk
> Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
> University of Oxford,             Tel:  +44 1865 272861 (self)
> 1 South Parks Road,                     +44 1865 272866 (PA)
> Oxford OX1 3TG, UK                Fax:  +44 1865 272595
>


