[R-sig-ME] Model selection in GAMM

Mari Laine mslain at utu.fi
Wed Mar 27 13:00:32 CET 2013


Hi to all,

Thanks for your help; I think I'll try to avoid AIC comparison, then.

   - Mari Laine -


> On Mon, Mar 18, 2013 at 4:47 PM, Joshua Wiley <jwiley.psych at gmail.com> wrote:
> 
> Sorry, I missed the "temporal structures" bit from the OP.  Apologies
> for the noise.
> 
> On Mon, Mar 18, 2013 at 4:46 PM, Joshua Wiley <jwiley.psych at gmail.com> wrote:
> > A bit off from the other topics, but back to the original, if you use
> > gamm4, it uses lme4 (lmer/glmer) for the mixed model.  One implication
> > is that you can get true likelihoods using the Laplace approximation
> > or numerical integration.  So at least for comparing models with
> > different parametric effects, it would seem you could use that
> > approach.  I am uncertain for the overall model, because you get both
> > a gam and mer object.
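[Editor's note: the gamm4 route described above can be sketched as follows. This is a sketch, not a tested recipe; the package calls are real, but `dat`, `y`, `x1`, `x2`, and `g` are hypothetical names, not from the thread.]

```r
## Assumes the 'gamm4' package and a hypothetical data frame 'dat' with
## response y, covariates x1 and x2, and grouping factor g (illustrative).
library(gamm4)

## Two candidate models differing only in their parametric (fixed) terms:
m1 <- gamm4(y ~ x1 + s(x2), random = ~ (1 | g), data = dat, family = poisson)
m2 <- gamm4(y ~ s(x2),      random = ~ (1 | g), data = dat, family = poisson)

## gamm4 fits the mixed model with lme4, so the $mer components carry a true
## (Laplace-approximated) likelihood; AIC comparison is then defensible for
## models that differ in their parametric effects:
AIC(m1$mer, m2$mer)
```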
> >
> > On Mon, Mar 18, 2013 at 6:29 AM, Ben Bolker <bbolker at gmail.com> wrote:
> >> Andrew Robinson <A.Robinson at ...> writes:
> >>
> >>>
> >>> On Sun, Mar 17, 2013 at 07:44:28PM +0000, Ben Bolker wrote:
> >>> > Mari Laine <mslain at ...> writes:
> >>> >
> >>> >
> >>> > > I'm using GAMM (mgcv::gamm) to fit non-Gaussian models with several
> >>> > > fixed factors and also a temporal structure. Additionally, I'm
> >>> > > fitting non-linearity with smoothers. I have several correlated main
> >>> > > effect candidates, and I would like to compare their GAMMs to see
> >>> > > which one of the models best fits the data. I would also need to
> >>> > > tune the models by dropping unnecessary smoothers (linear ones) and
> >>> > > non-significant fixed variables.
> >>> >
> >>> > > After going through some statistical books, I'm under the impression
> >>> > > that one should not use AIC comparison of $lme's for GAMM models --
> >>> > > is this correct? Could someone give instructions on model selection
> >>> > > in GAMM, or refer me to a book or some other source of information
> >>> > > on this matter?
> >>> >
> >>> >   To my knowledge there are two issues here:
> >>> >
> >>> > (1) GAMM uses penalized quasi-likelihood.  According to some
> >>> > statisticians (including Brian Ripley, who wrote the original PQL
> >>> > code in MASS::glmmPQL, which might be what GAMM relies on -- I don't
> >>> > remember, it might incorporate its own PQL code), one shouldn't use
> >>> > likelihood-based approaches (including AIC) with PQL algorithms,
> >>> > because they don't estimate a true likelihood (others say it's OK as
> >>> > long as you make sure to scale the likelihood to get a
> >>> > quasi-likelihood before combining it with the penalty term to get a
> >>> > QIC).
> >>>
> >>> Hi Ben,
> >>>
> >>> I know that you're reporting third-party opinions, and you're not
> >>> necessarily advocating this position yourself, but I wonder if you can
> >>> provide some more information - even a link or a citation to them?
> >>
> >>   Hmm.  I may be in over my head here, so *please* treat anything
> >> I say below as false until proven true ...
> >>
> >>>
> >>> I'm a bit confused about how scaling the maximized penalized
> >>> likelihood (MPL) can deliver something that can be treated as though
> >>> it were a maximized likelihood.
> >>>
> >>> To my way of thinking, a point of maximizing the penalized likelihood
> >>> is that you get an estimate with better statistical properties than
> >>> the MLE.
> >>
> >>   It's been a long time since I looked at this, and I don't understand
> >> it well enough, but it was my understanding that the quantity maximized
> >> by PQL is an *approximation* to the marginal likelihood used in all
> >> other GLMM applications (i.e. the conditional likelihood of the data
> >> given fixed values of the random effects, weighted by the probability
> >> of those RE values given the distribution of the REs, integrated over
> >> all possible values of the REs) -- one that is less accurate than
> >> Laplace approximation and Gauss-Hermite quadrature, but an
> >> approximation to the same quantity nevertheless.
> >>
> >>   So I thought the idea was not (in this case) to get an estimate
> >> with better properties than the MLE, but to get an estimate of the
> >> marginal likelihood at all ...
> >>
> >>> It is overwhelmingly likely that this MPL estimate will be different
> >>> from the MLE.  It's hard for me to imagine a way that the PL function
> >>> can be scaled so that the MPLE can be treated as though it were an
> >>> MLE.  If the MPLE won't be the same as the MLE, then the L can't be
> >>> maximized at the MPLE.  Further, the PL function will be a different
> >>> shape at the optimum (I intuit) than the L would be at its optimum.
> >>
> >>     For one quick practical counterexample, glmmPQL generally gives
> >> _similar_ answers to glmer/glmmADMB etc. (i.e. methods based on
> >> Laplace or better approximations) for large data sets and those
> >> where the counts per individual sample are large ...
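[Editor's note: the comparison Ben mentions can be sketched like this. The function calls are real, but `dat`, `y`, `x`, and `g` are hypothetical names, and this is an illustrative sketch rather than a definitive check.]

```r
## Assumes MASS, nlme, and lme4 are installed, plus a hypothetical data
## frame 'dat' with binary response y, covariate x, and grouping factor g.
library(MASS)   # provides glmmPQL (fits via nlme::lme)
library(lme4)   # provides glmer (Laplace approximation by default)

fit_pql <- glmmPQL(y ~ x, random = ~ 1 | g, family = binomial, data = dat)
fit_lap <- glmer(y ~ x + (1 | g), family = binomial, data = dat)

## With large samples and many observations per group, the fixed-effect
## estimates typically agree closely, consistent with both methods
## targeting (approximations to) the same marginal likelihood:
cbind(PQL = nlme::fixef(fit_pql), Laplace = lme4::fixef(fit_lap))
```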
> >>>
> >>> So, is there theory to suggest that the properties of the MLE that are
> >>> relied upon by the various measures of information are retained by the
> >>> MPLE?  And if so, even then, wouldn't those properties depend on the
> >>> nature of the penalization?
> >>
> >>    I would love to know the answer.  I hope someone more knowledgeable,
> >> or with much more time to spend figuring this out, comes forward.
> >>
> >>   Ben
> >>
> >>
> >>>
> >>> > (2) As I recall it's a little tricky to figure out which components
> >>> > of a GAMM call contain which kinds of information about the fit.
> >>> > In particular it's not clear whether the likelihood/AIC reported
> >>> > in the lme component of the fit really reflect an appropriate
> >>> > (quasi)likelihood/IC of the full model; I believe there's a lot
> >>> > of detail on this in the ?gamm help page: in particular,
> >>> >
> >>> > > For example,
> >>> > > unlike 'glmmPQL' from 'MASS' it will return the complete 'lme'
> >>> > > object from the working model at convergence of the PQL iteration,
> >>> > > including the `log likelihood', even though this is not the
> >>> > > likelihood of the fitted GAMM.
> >>> >
> >>> > which suggests you shouldn't use that log-likelihood ...
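[Editor's note: concretely, the caveat quoted from ?gamm looks like this in code. A sketch only; `dat`, `y`, `x`, and `g` are hypothetical names.]

```r
## Assumes mgcv and a hypothetical data frame 'dat' with count response y,
## covariate x, and grouping factor g (illustrative names).
library(mgcv)
m <- gamm(y ~ s(x), random = list(g = ~ 1), data = dat, family = poisson)

## m$lme is the complete working linear mixed model from the last PQL
## iteration; its reported log-likelihood is NOT the likelihood of the
## fitted GAMM, so quantities like these should not go into AIC comparisons:
logLik(m$lme)
AIC(m$lme)
```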
> >>> >
> >>> >   I would look for further information in Simon Wood's excellent
> >>> > book on generalized additive models.
> >>> >
> >>> > _______________________________________________
> >>> > R-sig-mixed-models at r-project.org mailing list
> >>> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >>>
> >>
> >
> >
> >
> > --
> > Joshua Wiley
> > Ph.D. Student, Health Psychology
> > University of California, Los Angeles
> > http://joshuawiley.com/
> > Senior Analyst - Elkhart Group Ltd.
> > http://elkhartgroup.com
> 


