[R-sig-ME] sums of squares and F values in anova
Farrar, David
Farrar.David at epa.gov
Tue Sep 30 17:18:01 CEST 2014
John,
Regarding K-R etc. it's good to know that I can get up-to-date statistical tests on lmer() output. I can use this in a current analysis. For a literature citation, I can say that statistical tests were conducted using the car package. However, current documentation indicates that Anova can be used with lmer(), and that pbkrtest is recommended along with car. One may therefore infer what is likely going on. It might be good to see a more direct statement in ?Anova.
Any thoughts about literature citation?
fwiw I'm a fan of car and the associated text.
I may also do some parametric bootstrapping.
David Farrar,
-----Original Message-----
From: r-sig-mixed-models-bounces at r-project.org [mailto:r-sig-mixed-models-bounces at r-project.org] On Behalf Of John Fox
Sent: Wednesday, September 24, 2014 6:04 PM
To: Ben Bolker
Cc: r-sig-mixed-models at r-project.org; Alexandra Kuznetsova
Subject: Re: [R-sig-ME] sums of squares and F values in anova
Dear Ben,
anova() computes sequential ("type I") tests, so one wouldn't for correlated Xs expect the result to be the same as the "type II" or "type III" tests computed by Anova(), which, however, for an additive model should be the same as each other. You *would* expect the last test in the sequence produced by anova() to be the same as the corresponding type-II or -III test.
In the case of LMMs fit by lmer(), Anova() computes Wald F-tests using the Kenward-Roger coefficient covariance matrix and Satterthwaite df; for the data/model in your example, that should produce a small difference in the F-statistics.
Here's what I get for your example:
------------- snip ----------
> anova(fit)
Analysis of Variance Table
Response: sr
Df Sum Sq Mean Sq F value Pr(>F)
pop15 1 204.12 204.118 14.1157 0.0004922 ***
pop75 1 53.34 53.343 3.6889 0.0611255 .
dpi 1 12.40 12.401 0.8576 0.3593551
ddpi 1 63.05 63.054 4.3605 0.0424711 *
Residuals 45 650.71 14.460
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> anova(fit2) ## practically equal to anova(fit) above
Analysis of Variance Table
Df Sum Sq Mean Sq F value
pop15 1 204.118 204.118 14.1157
pop75 1 53.343 53.343 3.6889
dpi 1 12.401 12.401 0.8576
ddpi 1 63.054 63.054 4.3605
> Anova(fit)
Anova Table (Type II tests)
Response: sr
Sum Sq Df F value Pr(>F)
pop15 147.01 1 10.1666 0.002603 **
pop75 35.24 1 2.4367 0.125530
dpi 1.89 1 0.1309 0.719173
ddpi 63.05 1 4.3605 0.042471 *
Residuals 650.71 45
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> Anova(fit, type=3)
Anova Table (Type III tests)
Response: sr
Sum Sq Df F value Pr(>F)
(Intercept) 218.16 1 15.0867 0.0003338 ***
pop15 147.01 1 10.1666 0.0026030 **
pop75 35.24 1 2.4367 0.1255298
dpi 1.89 1 0.1309 0.7191732
ddpi 63.05 1 4.3605 0.0424711 *
Residuals 650.71 45
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> Anova(fit2)
Analysis of Deviance Table (Type II Wald chisquare tests)
Response: sr
Chisq Df Pr(>Chisq)
pop15 10.1666 1 0.00143 **
pop75 2.4367 1 0.11852
dpi 0.1309 1 0.71748
ddpi 4.3605 1 0.03678 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> Anova(fit2, test="F")
Loading required package: pbkrtest
Loading required package: MASS
Analysis of Deviance Table (Type II Wald F tests with Kenward-Roger df)
Response: sr
F Df Df.res Pr(>F)
pop15 10.1519 1 44.031 0.00265 **
pop75 2.4326 1 44.036 0.12599
dpi 0.1245 1 44.811 0.72588
ddpi 4.2179 1 44.600 0.04589 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
> Anova(fit2, test="F", type=3)
Analysis of Deviance Table (Type III Wald F tests with Kenward-Roger df)
Response: sr
F Df Df.res Pr(>F)
(Intercept) 14.9643 1 44.560 0.0003538 ***
pop15 10.1519 1 44.031 0.0026503 **
pop75 2.4326 1 44.036 0.1259919
dpi 0.1245 1 44.811 0.7258840
ddpi 4.2179 1 44.600 0.0458898 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
----------------- snip ----------
These results are as expected.
Best,
John
On Wed, 24 Sep 2014 17:14:31 -0400
Ben Bolker <bbolker at gmail.com> wrote:
> On 14-09-24 04:21 PM, Alexandra Kuznetsova wrote:
> > Dear lme4 authors,
> >
> > I have a question regarding the calculation of sums of squares in
> > anova for lmerMod object. I know that there were some discussions
> > regarding how they are calculated (but still remains unclear to
> > me..).
> >
> > As far as I understand the way they are calculated is similar to the
> > way they are calculated in lm objects, that is transforming Y into
> > orthogonal Q space, and then computing sums of squares for the
> > independent effects.
> >
> > I have found in your JSS paper "Fitting linear mixed effects models
> > using lme4" some explanations (in equations 67 and 68). Would it be
> > possible to give some more comments on these equations? And what
> > about the partial (type 3) sums of squares - is there a way to
> > calculate them using the same way?
> >
> > Hope my questions were clear! Thank you in advance!
>
> I'm not sure exactly what your questions are. Could you please
> clarify (sorry!) what you mean about partial sums of squares? Here's
> an example showing that the results of lme4's anova.merMod and base R's:
> anova.lm do in fact agree for a model with the random effects variance
> forced to (almost) zero. At the risk of further muddying the water,
> I'll point out that car::Anova(fit,type="II") and
> car::Anova(fit,type="III") give two different answers for this
> problem, neither of which matches the computation below ...
>
> ===================
> fit <- lm(sr ~ ., data = LifeCycleSavings)
> anova(fit)
>
>
> ## construct lmer model with near-zero variance
> LC2 <- transform(LifeCycleSavings,f=factor(1:2)) ## bogus
> library("lme4")
> form <- sr ~ pop15 + pop75 + dpi + ddpi + (1|f) ## hack ## to avoid
> (Error in terms.formula(formula(x, fixed.only = TRUE)) :
> ## '.' in formula and no 'data' argument)
>
> lmod <- lFormula(form, data=LC2)
> d2 <- lmer(form, data=LC2,devFunOnly=TRUE) llik <- d2(1e-5)
> fit2 <- mkMerMod(environment(d2),opt=list(par=1e-5,
> fval=llik,
> feval=1,
> conv=0,
> message=NULL),
> lmod$reTrms, fr = lmod$fr)
> all.equal(coef(fit),fixef(fit2))
> anova(fit2) ## practically equal to anova(fit) above
> ==================
>
> For those following along, the paper is at
> http://arxiv.org/abs/1406.5823 (or http://arxiv.org/pdf/1406.5823v1
> for a direct link to the PDF), and the raw LaTeX for the specific section is:
>
> ===============
> To understand how these quantities are computed, let $\bm R_i$ contain
> the rows of $\bm R_X$ (Equation~\ref{eq:blockCholeskyDecomp})
> associated with the $i$th fixed-effects term. Then the sum of squares
> for term $i$ is, \begin{equation}
> \label{eq:SS}
> SS_i = \widehat{\bm\beta}\trans\bm R_i\trans \bm R_i
> \widehat{\bm\beta} \end{equation} If $DF_i$ is the number of columns
> in $\bm R_i$, then the $F$~statistic for term $i$ is, \begin{equation}
> \label{eq:Fstat}
> F_i = \frac{SS_i}{\widehat{\sigma}^2 DF_i} \end{equation}
>
>
> >
> > Alexandra Kuznetsova _______________________________________________
> > R-sig-mixed-models at r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
> >
>
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
_______________________________________________
R-sig-mixed-models at r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
More information about the R-sig-mixed-models
mailing list