[R] How to avoid overfitting in gam(mgcv)

Ariyo Kanno 10dimensioner at gmail.com
Wed Oct 3 07:55:04 CEST 2007


Dear listers,

I'm using gam() from the mgcv package for semi-parametric regression on
small, noisy datasets (10 to 200 observations), and I'm running into
overfitting.

Simon N. Wood's book (Generalized Additive Models: An Introduction with
R) suggests avoiding overfitting by inflating the effective degrees of
freedom in the GCV score with an increased "gamma" value (e.g. 1.4).
In my case, however, this did not change the results significantly.
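
A minimal sketch of this kind of comparison, for readers who want to
reproduce it. The simulated data, the sample size, and the names dat,
fit_default and fit_gamma are assumptions made for illustration only;
they are not my real datasets or results.

```r
library(mgcv)

# Simulated small, noisy dataset (assumption: stands in for real data)
set.seed(1)
n <- 50
dat <- data.frame(x = runif(n))
dat$y <- sin(2 * pi * dat$x) + rnorm(n, sd = 0.5)

# Smoothing parameter chosen by GCV with the default gamma = 1
fit_default <- gam(y ~ s(x), data = dat, method = "GCV.Cp")

# Same model, but each effective degree of freedom counts 1.4 times
# in the GCV score, which should favour smoother fits
fit_gamma <- gam(y ~ s(x), data = dat, method = "GCV.Cp", gamma = 1.4)

# Total effective degrees of freedom (intercept included) of each fit
edf_default <- sum(fit_default$edf)
edf_gamma   <- sum(fit_gamma$edf)
print(c(default = edf_default, gamma_1.4 = edf_gamma))
```

If the two totals are close, gamma has had little effect on the chosen
amount of smoothing for that dataset.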

The only way I've found to suppress overfitting is to set the basis
dimension "k" to very low values (3 to 5). However, I don't think this
is reasonable, because knot selection then becomes an important issue.
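
The effect of lowering "k" can be sketched as follows. Again the data
are simulated, and the choice of a cubic regression spline basis
(bs = "cr", which places knots explicitly) and the names fit_k10 and
fit_k4 are assumptions for illustration, not my actual setup.

```r
library(mgcv)

# Simulated small, noisy dataset (assumption: stands in for real data)
set.seed(2)
n <- 30
dat <- data.frame(x = runif(n))
dat$y <- sin(2 * pi * dat$x) + rnorm(n, sd = 0.5)

# k is an upper limit on the flexibility of the smooth; the penalty
# chosen by GCV decides how much of it is actually used
fit_k10 <- gam(y ~ s(x, k = 10, bs = "cr"), data = dat, method = "GCV.Cp")
fit_k4  <- gam(y ~ s(x, k = 4,  bs = "cr"), data = dat, method = "GCV.Cp")

# Total effective degrees of freedom (intercept included);
# with k = 4 this cannot exceed 4
edf_k10 <- sum(fit_k10$edf)
edf_k4  <- sum(fit_k4$edf)
print(c(k10 = edf_k10, k4 = edf_k4))
```

With k = 4 the curve is built from only four knots, which is why their
placement starts to matter.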

Is there any other way to avoid overfitting when analyzing small datasets?

Thank you for your help in advance,
Ariyo Kanno

--
Ariyo Kanno
First-year doctoral student at
Institute of Environmental Studies,
The University of Tokyo


