[R] GLM results different from GAM results without smoothing terms

Sun Jan 6 18:20:47 CET 2008

On Thursday 03 January 2008 13:54, Prof Brian Ripley wrote:
> > fit1 <- glm(factor(x1)~factor(Round)+x2,family=binomial(link="probit"))
> > fit2 <- gam(factor(x1)~factor(Round)+x2,family=binomial(link="probit"))
> > all.equal(fitted(fit1), fitted(fit2))
>
> [1] TRUE
>
> so the fits to the data are the same: your error was in over-interpreting
> the parameters in the presence of non-identifiability.
>
So, coming back to the original question: mgcv::gam uses an SVD approach to
rank deficiency in this case (so the minimum norm parameter vector is chosen
from among all those giving the best fit), while glm uses a pivoted QR
approach, which effectively constrains the redundant parameters to zero (they
are reported as NA).
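
A minimal sketch of the difference, using a small least-squares example
rather than the probit fit above. This is not the code that glm or mgcv
actually run; the names X, y, beta.qr, beta.qr0 and beta.svd are made up for
illustration, and only base R is used. Both coefficient vectors give the same
fitted values, but the SVD one has the smaller norm:

```r
## Rank-deficient design: the third column is the sum of the first two
set.seed(1)
n  <- 20
x1 <- rnorm(n)
x2 <- rnorm(n)
X  <- cbind(x1 = x1, x2 = x2, x3 = x1 + x2)
y  <- 2 * x1 - x2 + rnorm(n)

## Pivoted QR (as used by lm/glm): the redundant coefficient comes back as NA
beta.qr  <- lm.fit(X, y)$coefficients
beta.qr0 <- ifelse(is.na(beta.qr), 0, beta.qr)   # NA treated as zero

## SVD: drop the (numerically) zero singular value, which yields the
## minimum norm least-squares solution
s    <- svd(X)
keep <- s$d > sqrt(.Machine$double.eps) * s$d[1]
beta.svd <- drop(s$v[, keep, drop = FALSE] %*%
                 (crossprod(s$u[, keep, drop = FALSE], y) / s$d[keep]))

## Different parameters, same fit
print(beta.qr)
print(beta.svd)
all.equal(drop(X %*% beta.qr0), drop(X %*% beta.svd))
c(norm.qr = sum(beta.qr0^2), norm.svd = sum(beta.svd^2))
```

The individual coefficients are not comparable between the two approaches;
only the fitted values (and identifiable contrasts) are.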

> On Thu, 3 Jan 2008, Daniel Malter wrote:
> > Thanks much for your response. My apologies for not putting sample code
> > in the first place. Here it comes:
> >
> > Round=rep(1:10,each=10)
> > x1=rbinom(100,1,0.3)
> > x2=rep(rnorm(10,0,1),each=10)
> >
> > summary(glm(factor(x1)~factor(Round)+x2,family=binomial(link="probit")))
> >
> > library(mgcv)
> > summary(gam(factor(x1)~factor(Round)+x2,family=binomial(link="probit")))
> >
> > Cheers,
> > Daniel
> >
> > -------------------------
> > cuncta stricte discussurus
> > -------------------------
> >
> > -----Original Message-----
> > From: Prof Brian Ripley [mailto:ripley at stats.ox.ac.uk]
> > Sent: Thursday, January 03, 2008 2:13 AM
> > To: Daniel Malter
> > Cc: r-help at stat.math.ethz.ch
> > Subject: Re: [R] GLM results different from GAM results without smoothing
> > terms
> >
> > On Wed, 2 Jan 2008, Daniel Malter wrote:
> >> Hi, I am fitting two models, a generalized linear model and a
> >> generalized additive model, to the same data. The R help states that "A
> >> generalized additive model (GAM) is a generalized linear model (GLM)
> >> in which the linear predictor is given by a user specified sum of
> >> smooth functions of the covariates plus a conventional parametric
> >> component of the linear predictor." I am fitting the GAM without
> >> smooth functions and would have expected the parameter estimates to be
> >> equal to those of the GLM.
> >
> >> I am fitting the following model:
> >>
> >> reg.glm=glm(YES~factor(RoundStart)+DEP+SPD+S.S+factor(LOST),
> >>             family=binomial(link="probit"))
> >> reg.gam=gam(YES~factor(RoundStart)+DEP+SPD+S.S+factor(LOST),
> >>             family=binomial(link="probit"))
> >>
> >> DEP, SPD, S.S, and LOST are invariant across the observations within
> >> the same RoundStart. Therefore, I would expect to get NAs for these
> >> parameter estimates.
> >
> > So your design matrix is rank-deficient and there is an identifiability
> > problem.
> >
> >> I get NAs in GLM, but I get estimates in GAM. Can anyone explain why
> >> that is?
> >
> > Because there is more than one way to handle rank deficiency.  There are
> > two different 'gam' functions in contributed packages for R (and none in
> > R itself), so we need more details: see the footer of this message. In
> > glm() the NA estimates are treated as zero for computing predictions.
> >
> >> Thanks much,
> >> Daniel
> >>
> >> https://stat.ethz.ch/mailman/listinfo/r-help
> >> PLEASE do read the posting guide
> >> http://www.R-project.org/posting-guide.html
> >> and provide commented, minimal, self-contained, reproducible code.
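
The rank deficiency in the quoted sample data can be checked directly. This
is only a sketch: the seed and the names X, fit, b, b0 and eta are additions
for illustration, and it uses base R only (no mgcv). It counts the columns of
the design matrix against its rank, then confirms that glm's linear predictor
is what you get by setting the NA coefficient to zero:

```r
set.seed(42)   # added for reproducibility; not in the original post
Round <- rep(1:10, each = 10)
x1 <- rbinom(100, 1, 0.3)
x2 <- rep(rnorm(10, 0, 1), each = 10)   # constant within each Round

## x2 lies in the span of the intercept and the Round dummies
X <- model.matrix(~ factor(Round) + x2)
c(columns = ncol(X), rank = qr(X)$rank)   # 11 columns, rank 10

fit <- glm(factor(x1) ~ factor(Round) + x2,
           family = binomial(link = "probit"))
b <- coef(fit)
names(b)[is.na(b)]                        # the aliased term

## Replace NA by zero and rebuild the linear predictor by hand
b0  <- ifelse(is.na(b), 0, b)
eta <- drop(X %*% b0)
all.equal(unname(eta), unname(predict(fit, type = "link")))
```

If some Round happens to contain only zeros or only ones, glm will also warn
about fitted probabilities of 0 or 1; that is a separate issue from the
aliasing shown here.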

-- 
> Simon Wood, Mathematical Sciences, University of Bath, Bath, BA2 7AY UK
> +44 1225 386603  www.maths.bath.ac.uk/~sw283