[R-sig-ME] lmer() with 'na.action=na.exclude'; error with summary()

Martin Maechler maechler at stat.math.ethz.ch
Tue Sep 9 16:02:20 CEST 2014


>>>>> John Maindonald <john.maindonald at anu.edu.au>
>>>>>     on Mon, 8 Sep 2014 23:31:14 +0000 writes:

    > On 9 Sep 2014, at 2:50, Ben Bolker <bbolker at gmail.com>
    > wrote:
    >> John Maindonald <john.maindonald at ...> writes:
    >> 
    >>> 
    >>> The following demonstrates the issue:
    >>> 
    >>>> library(DAAG)
    >>>> science.lmer <- lmer(like ~ sex + PrivPub + (1 | school) +
    >>> + (1 | school:class), data = science,
    >>> + na.action=na.exclude)
    >>>> summary(science.lmer)
    >>> Linear mixed model fit by REML ['lmerMod']
    >> 
    >> [snip]
    >> 
    >>> Scaled residuals: Error in quantile.default(resids) :
    >>> missing values and NaN's not allowed if 'na.rm' is FALSE
    >>>> ## Suppress details of residuals
    >> 
    >> 
    >>>> summary(science.lmer, show.resids=FALSE)
    >> 
    >> This is a confusion between the arguments of the summary
    >> method (summary.merMod) and the *print method*
    >> (print.summary.merMod).  We should add a warning to
    >> summary that says it is discarding unused arguments (the
    >> ... in the function definition is only there for
    >> compatibility with the summary() generic method).
    >> 
    >> print(summary(science.lmer), show.resids=FALSE)
    >> 
    >> works fine.
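
The mechanism can be seen in a toy S3 class. This is only a sketch:
'toyfit' and its methods are made-up names for illustration, not lme4
code. The summary method accepts '...' and silently drops show.resids;
only the print method looks at that argument.

```r
# Hypothetical toy class; stands in for a fitted model with NA residuals.
fit <- structure(list(resid = c(-1.2, 0.3, NA, 0.9)), class = "toyfit")

# summary method: '...' exists only for compatibility with the generic,
# so an extra argument such as show.resids is swallowed without comment.
summary.toyfit <- function(object, ...) {
  structure(list(resid = object$resid), class = "summary.toyfit")
}

# print method: this is where show.resids is actually honoured.
print.summary.toyfit <- function(x, show.resids = TRUE, ...) {
  if (show.resids) {
    cat("Residual quantiles:\n")
    print(quantile(x$resid))  # quantile() stops if x$resid contains NA
  }
  cat("(rest of summary)\n")
  invisible(x)
}

# show.resids is ignored here, and nothing has failed yet.
s <- summary(fit, show.resids = FALSE)

# Printing with the default show.resids = TRUE hits the NA and fails.
res_default <- try(print(s), silent = TRUE)

# Passing the argument to print() skips the residual block.
res_ok <- print(s, show.resids = FALSE)
```

At the prompt, summary(fit, show.resids = FALSE) is auto-printed with
the print method's defaults, which is why the error appears to come
from summary().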

    > That is a subtlety that I had not contemplated.  It is not
    > in forming the summary object, but in printing it that the
    > problem is evident.

    >> I doubt that there is an intention for summary.merMod()
    >> to throw an error
    >>> lmer() has been called with 'na.action=na.exclude'.  It
    >>> should certainly not throw an error with the argument
    >>> 'show.resids=FALSE'.
    >> 
    >> Now I'm wondering whether the correct behaviour when
    >> there are NAs in the (extended) residuals is
    >> 
    >> 1 omit NAs from the residual quantile calculation
    >> (easiest)

       > I see no reason not to choose (1). At the time we prepared
       > the 3rd edition of 'Data Analysis and Graphics using R',
       > this was the behaviour. My understanding of
       > 'na.action=na.exclude' has been that the result is as for
       > 'na.action=na.omit', except that residuals match up with
       > the original observations, with NAs inserted as necessary.
       > In other words, the change from 'na.action=na.omit' to
       > 'na.action=na.exclude' is all about what appears when one
       > explicitly calculates and returns residuals, not about
       > what happens when summary information (quantiles, not the
       > residuals themselves!) is printed.

       > This strategy also simplifies documenting and describing
       > what is done.

    >> 2 return NA values for all quantiles of the residuals

    >> 3 return the quantiles plus a statement of the number of NAs.
    >> 
    >> Any good reason not to just do #1?
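
For concreteness, the three options can be sketched in base R on a
made-up residual vector. The vector 'res' and the result names q1, q2,
q3 are assumptions for illustration only; this is not what lme4 does
internally.

```r
# Hypothetical padded residuals, as na.exclude would return them.
res <- c(-1.5, -0.2, NA, 0.4, 1.1)

# The reported failure: quantile() refuses NAs unless na.rm = TRUE.
err <- tryCatch(quantile(res), error = function(e) conditionMessage(e))

# Option 1: omit NAs from the quantile calculation.
q1 <- quantile(res, na.rm = TRUE)

# Option 2: return NA for every quantile when any residual is missing.
q2 <- if (anyNA(res)) {
  setNames(rep(NA_real_, length(q1)), names(q1))
} else {
  quantile(res)
}

# Option 3: quantiles of the non-missing residuals plus the NA count.
q3 <- list(quantiles = quantile(res, na.rm = TRUE),
           n_missing = sum(is.na(res)))
```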

We should follow the behavior of lm() in all such cases of doubt,
unless we have very, very strong reasons not to.
AFAIR, lm() was to a large extent *the* example and motivator
for the na.action semantics when they were introduced (in S).
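
A quick way to check what lm() does, on made-up data (the data frame
'd' and the object names are assumptions for illustration). As far as
I can tell, summary.lm() works from the unpadded residuals stored in
the fit, so the printed quantiles ignore the excluded rows, while
residuals() pads with NA.

```r
# Hypothetical data with two missing responses (rows 3 and 7).
d <- data.frame(
  x = 1:10,
  y = c(2.1, 3.9, NA, 8.2, 9.8, 12.1, NA, 16.3, 17.9, 20.2)
)

fit_omit <- lm(y ~ x, data = d, na.action = na.omit)
fit_excl <- lm(y ~ x, data = d, na.action = na.exclude)

# Same fit either way; only the extractor output differs.
r_omit <- residuals(fit_omit)  # one value per row actually used
r_excl <- residuals(fit_excl)  # one value per original row, NA-padded

# Summarising and printing the na.exclude fit does not fail.
s_excl <- summary(fit_excl)
printed <- capture.output(print(s_excl))
```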

Martin

    > John Maindonald email: john.maindonald at anu.edu.au phone :
    > +61 2 (6125)3473 fax : +61 2(6125)5549 Centre for
    > Mathematics & Its Applications, Room 1194, John Dedman
    > Mathematical Sciences Building (Building 27) Australian
    > National University, Canberra ACT 0200.


