[R-sig-ME] sums of squares and F values in anova

Wed Sep 24 23:14:31 CEST 2014

On 14-09-24 04:21 PM, Alexandra Kuznetsova wrote:
> Dear lme4 authors,
> 
> I have a question regarding the calculation of sums of squares in
> anova for lmerMod objects. I know that there have been some
> discussions of how they are calculated, but it still remains unclear
> to me.
> 
> As far as I understand, they are calculated much as they are for lm
> objects, that is, by transforming Y into the orthogonal Q space and
> then computing sums of squares for the independent effects.
> 
> I have found some explanations in your JSS paper "Fitting linear
> mixed effects models using lme4" (in equations 67 and 68). Would it
> be possible to comment further on these equations? And what about
> the partial (type 3) sums of squares - is there a way to calculate
> them in the same way?
> 
> Hope my questions were clear! Thank you in advance!

  I'm not sure exactly what your questions are.  Could you please
clarify (sorry!) what you mean about partial sums of squares?  Here's an
example showing that the results of lme4's anova.merMod and base R's
anova.lm do in fact agree for a model with the random effects variance
forced to (almost) zero.  At the risk of further muddying the water,
I'll point out that car::Anova(fit,type="II") and
car::Anova(fit,type="III") give two different answers for this problem,
neither of which matches the computation below ...

===================
fit <- lm(sr ~ ., data = LifeCycleSavings)
anova(fit)

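## construct an lmer model with near-zero random-effects variance
LC2 <- transform(LifeCycleSavings,f=factor(1:2)) ## bogus grouping factor
library("lme4")
form <- sr ~ pop15 + pop75 + dpi + ddpi + (1|f)  ## hack: spell out the terms
## to avoid (Error in terms.formula(formula(x, fixed.only = TRUE)) :
##   '.' in formula and no 'data' argument)

lmod <- lFormula(form, data=LC2)            ## parsed model frame and RE terms
d2 <- lmer(form, data=LC2,devFunOnly=TRUE)  ## objective function only, no fit
## evaluate the objective (REML criterion by default) at theta = 1e-5
llik <- d2(1e-5)
## build a merMod object at that theta, bypassing the optimizer
fit2 <- mkMerMod(environment(d2),opt=list(par=1e-5,
                           fval=llik,
                           feval=1,
                           conv=0,
                           message=NULL),
                         lmod$reTrms, fr = lmod$fr)
all.equal(coef(fit),fixef(fit2))  ## fixed effects match the lm coefficients
anova(fit2)   ## practically equal to anova(fit) above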
==================

For those following along, the paper is at
http://arxiv.org/abs/1406.5823 (or http://arxiv.org/pdf/1406.5823v1 for
a direct link to the PDF), and the raw LaTeX for the specific section is:

===============
To understand how these quantities are computed, let $\bm R_i$ contain
the rows of $\bm R_X$ (Equation~\ref{eq:blockCholeskyDecomp}) associated
with the $i$th fixed-effects term.  Then the sum of squares for term
$i$ is,
\begin{equation}
  \label{eq:SS}
  SS_i = \widehat{\bm\beta}\trans\bm R_i\trans \bm R_i \widehat{\bm\beta}
\end{equation}
If $DF_i$ is the number of columns in $\bm R_i$, then the
$F$~statistic for term $i$ is,
\begin{equation}
  \label{eq:Fstat}
  F_i = \frac{SS_i}{\widehat{\sigma}^2 DF_i}
\end{equation}
===============
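
As a rough illustration of the two equations, here is a minimal sketch
for the ordinary lm fit only (not for a merMod object). It assumes that
the triangular factor R from lm's QR decomposition can stand in for
R_X, that the residual variance from lm stands in for sigma-hat
squared, and it reads DF_i as the number of coefficients (rows of R_i)
belonging to term i. The object names (R, beta, asgn, Rb, SS, DF,
sigma2, Fval) are made up for this example.

```r
## Sequential sums of squares for an lm fit, computed from the
## triangular factor R of the QR decomposition of the model matrix.
fit  <- lm(sr ~ pop15 + pop75 + dpi + ddpi, data = LifeCycleSavings)

R    <- qr.R(fit$qr)               # p x p upper-triangular factor
beta <- coef(fit)[colnames(R)]     # coefficients in the column order of R
asgn <- fit$assign                 # term index per coefficient (0 = intercept)

## Row j of R times beta; rows are grouped by the term they belong to.
Rb <- drop(R %*% beta)

## SS_i = beta' R_i' R_i beta = sum of squared entries of R_i %*% beta
## (the intercept, term 0, is dropped)
SS <- as.numeric(tapply(Rb^2, asgn, sum)[-1])
DF <- tabulate(asgn)               # coefficients per term; zeros are ignored

## F_i = SS_i / (sigma^2 * DF_i), using the lm residual variance
sigma2 <- sum(residuals(fit)^2) / df.residual(fit)
Fval   <- SS / (sigma2 * DF)

## Compare with the sequential table from anova.lm
## (the last row of that table is the residual line)
aov_tab <- anova(fit)
all.equal(SS,   aov_tab[["Sum Sq"]][1:4])
all.equal(Fval, aov_tab[["F value"]][1:4])
```

Because each row of R only involves the current and later columns, the
resulting sums of squares are sequential: they depend on the order of
the terms in the formula.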

> 
> Alexandra Kuznetsova