[R] Format of a data frame
Thomas L Jones, PhD
jones3745 at verizon.net
Mon Oct 22 05:02:31 CEST 2007
The goal is to smooth a scatterplot by fitting a gam with a LOESS (locally
weighted regression) term. There are 156 points, with x taking the integer
values 1, 2, and so on up to x = 156. The y values are random counts with a
Poisson distribution, or something close to it.
After reading in the data, I was able to generate a model, named mod, as
follows:
library(gam)  # assumption: lo() comes from the gam package
mod <- gam(y ~ lo(x), family = poisson, x = TRUE)
Next, I want to look at some values of the fitted curve, specifically at
x = 1, x = 2, and x = 3. Looking up predict.gam, I see the following:
Usage
predict.gam(object, newdata, type, dispersion, se.fit = FALSE, na.action,
            terms, ...)
One of the arguments of the function is named newdata. I see:
newdata   A data frame containing the values at which predictions are
          requested. [snip] Only those predictors, referred to in the
          right side of the formula, need be present by name in newdata.
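For illustration, a data frame matching that description might be built as in the sketch below. It assumes the only predictor in the model formula is named x. The object name newdata is just a convenient label chosen here, not something the help text requires.

```r
# One column per predictor on the right side of the formula; here only x.
# The column name must match the name used in the formula (assumed to be x).
newdata <- data.frame(x = c(1, 2, 3))

str(newdata)    # compact summary of the structure
names(newdata)  # column names
dim(newdata)    # number of rows, number of columns
```

Under that assumption the data frame has a single column and one row per value of x at which a prediction is wanted.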
I am having difficulty figuring out the format of the data frame. For
example, how many columns should it have? Should it have a column for the
three values of x? Probably there is a rather standard format for data
frames, but I am having trouble looking it up. Perhaps someone could point
me to the place in the documentation where this is discussed.
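A minimal sketch of the whole sequence, under stated assumptions: the gam package (which supplies lo()) is installed, and simulated Poisson counts stand in for the real data, which is not shown in this post. The names dat, newdata and fit, and the shape of the simulated curve, are illustrative only.

```r
library(gam)  # assumption: the gam package, which provides lo()

# Simulated stand-in for the real data: 156 points, Poisson counts.
set.seed(1)
dat <- data.frame(x = 1:156)
dat$y <- rpois(nrow(dat), lambda = exp(1 + sin(dat$x / 25)))

# Same model as in the post, with the data passed explicitly.
mod <- gam(y ~ lo(x), family = poisson, data = dat)

# newdata: a data frame with one column, named like the predictor.
newdata <- data.frame(x = c(1, 2, 3))

# type = "response" asks for fitted means rather than values on the log scale.
fit <- predict(mod, newdata = newdata, type = "response")
print(fit)
```

General background on data frames is in the help page ?data.frame and in the "Lists and data frames" chapter of the manual "An Introduction to R".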
Tom Jones