[R] Strange data frame
    McGehee, Robert 
    Robert.McGehee at geodecapital.com
       
    Fri Apr 22 01:02:32 CEST 2005
    
    
  
Hello, 
I'm playing around with the pls package and found a data set (NIR) whose
structure I don't understand. Forgive me if this is a basic question; I
suspect it is, since I am less experienced with some aspects of
modeling.
My problem: the NIR data frame in pls does not seem to be a typical data
frame because, while it is a list, its variables do not appear to be of
equal length. Furthermore, I have no idea how to reproduce such a structure.
But let's look at the NIR data...
> require(pls)
> data(NIR)
> class(NIR)
[1] "data.frame"
> str(NIR)
`data.frame':	28 obs. of  3 variables:
 $ X    : num [1:28, 1:268] 3.07 3.07 3.08 3.08 3.10 ...
  ..- attr(*, "dimnames")=List of 2
  .. ..$ : NULL
  .. ..$ : NULL
 $ y    : num  100.0  80.2  79.5  60.8  60.0 ...
 $ train: logi  TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE ...
> class(NIR$X)
[1] "matrix"
> class(NIR$y)
[1] "numeric"
> length(NIR$X)
[1] 7504
> length(NIR$y)
[1] 28
OK, so it looks to me as though NIR is a data frame (i.e. "a list
of variables of the same length with unique row names") that has a matrix
of length 7504 as one variable and a numeric vector of length 28 as
another, which seems to contradict the definition of a data frame.
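One possible reading of the two lengths, sketched on small made-up data
rather than NIR itself: length() on a matrix counts cells, while the number
of rows is what lines up with the vector. The names X and y below are
illustrative stand-ins, and the idea that rows are what matter for a data
frame is an assumption of this sketch, not something shown by the str()
output above.

```r
# Toy stand-ins (made up for illustration): a 4 x 3 matrix and a length-4 vector.
X <- matrix(seq_len(12), nrow = 4, ncol = 3)
y <- c(10, 20, 30, 40)

length(X)   # number of cells in the matrix: 4 * 3
NROW(X)     # number of rows in the matrix
NROW(y)     # NROW() treats a plain vector as a one-column matrix

# The same arithmetic with the NIR dimensions reported by str():
28 * 268    # rows times columns of NIR$X
```

Under that reading, 7504 is simply 28 rows times 268 columns, so the matrix
and the vector agree on 28 observations.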
Moreover, despite my best efforts, I'm unable to put any of my own data
into this structure, because the data.frame() and as.data.frame() functions
remove the matrix structure, i.e.
> data.frame(y = NIR$y, X = NIR$X)    ## or
> as.data.frame(list(y = NIR$y, X = NIR$X))
both return a different animal altogether.
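To make the "different animal" concrete, here is a minimal sketch on made-up
data (X and y are illustrative stand-ins, not the NIR variables). It shows
data.frame() splitting a matrix into separate columns, and, as one option
from base R that may be worth trying, wrapping the matrix in I() so that it
is kept as a single column. Whether NIR itself was built this way is not
something I know.

```r
# Toy stand-ins (made up for illustration): a 4 x 3 matrix and a length-4 vector.
X <- matrix(seq_len(12), nrow = 4, ncol = 3)
y <- c(10, 20, 30, 40)

# data.frame() splits the matrix into one data frame column per matrix column.
flat <- data.frame(y = y, X = X)
names(flat)

# Wrapping the matrix in I() asks data.frame() to keep it as a single column.
nested <- data.frame(y = y, X = I(X))
names(nested)
dim(nested$X)   # the column is still a 4 x 3 matrix
```

The first call gives four columns (y plus three pieces of X); the second
gives two, with X still carrying its dimensions.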
Lastly, this particular structure is useful because it lets the pls authors
write models concisely, such as
mvr(y ~ X, data = NIR[NIR$train, ])
instead of what I imagine would be a more complicated alternative if
they didn't have a data frame holding a matrix and a vector. Any
pointers to something I overlooked are appreciated.
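The convenience of the y ~ X form can be sketched without pls installed. The
example below swaps in lm() from base R for mvr(), and uses random made-up
data; toy, X, y and train are illustrative names only. It assumes the matrix
is attached to the data frame by assignment, which is just one way to get a
matrix column.

```r
set.seed(1)

# Made-up data: 6 observations, 2 predictors held in one matrix.
X <- matrix(rnorm(12), nrow = 6, ncol = 2)
y <- rnorm(6)

toy <- data.frame(y = y, train = c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE))
toy$X <- X   # assignment stores the whole matrix as a single column

# Row subsetting keeps the matrix rows aligned with y.
train_rows <- toy[toy$train, ]
dim(train_rows$X)

# lm() stands in for mvr() here; the formula names the matrix as one term.
fit <- lm(y ~ X, data = train_rows)
length(coef(fit))   # intercept plus one coefficient per column of X
```

The point of the sketch is only that a single symbol in the formula can
refer to every column of the matrix, and that subsetting rows of the data
frame subsets the matrix along with the response.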
Best,
Robert
    
    