[R-sig-ME] Fwd: RE: [R] Help with lsmeans
Ben Bolker
bbolker at gmail.com
Tue Sep 23 00:21:43 CEST 2014
(Discussion of possible interest to others on r-sig-mixed-models about
running lsmeans::ref.grid for large data sets ...)
-------- Forwarded Message --------
Subject: RE: [R] Help with lsmeans
Date: Mon, 22 Sep 2014 22:05:54 +0000
From: Lenth, Russell V <russell-lenth at uiowa.edu>
To: Ben Bolker <bbolker at gmail.com>, Dan Dillon <dgdillon at gmail.com>,
Søren Højsgaard <sorenh at math.aau.dk>
All:
Thanks for calling this to my attention.
There **is** a way to disable that time-consuming step… just set the
*lsmeans* option disable.pbrtest to TRUE. Here’s Ben’s example, run both
ways, showing the timing and the covariance matrix obtained:
> system.time(rr <- ref.grid(m2))
user system elapsed
13.39 1.66 15.05
> rr at V
(Intercept) x
(Intercept) 9.900968e-05 -0.0001482768
x -1.482768e-04 0.0002955296
> lsm.options(disable.pbkrtest=TRUE)
> system.time(rrd <- ref.grid(m2))
user system elapsed
0.11 0.00 0.10
> rrd at V
(Intercept) x
(Intercept) 9.900036e-05 -0.0001482582
x -1.482582e-04 0.0002954926
In the second result, the V matrix is just the result of vcov(mm)instead
of vcovAdj(mm). Note also that you can also set the degrees of freedom
to any desired value in the lsmeans or ref.grid call:
> lsmeans(m2, "x", at = list(x = c(.3,.5,.7)), options = list(df = 24))
x lsmean SE df lower.CL upper.CL
0.3 0.009617796 0.006053079 24 -0.002875145 0.02211074
0.5 0.006692900 0.004961380 24 -0.003546885 0.01693269
0.7 0.003768004 0.006019155 24 -0.008654921 0.01619093
Confidence level used: 0.95
(Setting df to NAwill produce asymptotic results.)
Hope this helps.
Russ
*From:*Ben Bolker [mailto:bbolker at gmail.com]
*Sent:* Monday, September 22, 2014 3:06 PM
*To:* Dan Dillon; Søren Højsgaard; Lenth, Russell V
*Subject:* Re: [R] Help with lsmeans
PS: smaller example, showing the same thing (and the equivalent just
with pbkrtest)
library("lme4")
library("lsmeans")
set.seed(101)
dd <- data.frame(y=rnorm(4e4),x=runif(4e4),f=gl(n=200,k=200))
m2 <- lmer(y~x+(1|f),data=dd)
m0 <- lmer(y~1+(1|f),data=dd)
rr <- ref.grid(m2)
library("pbkrtest")
kk <- KRmodcomp(m2,m0)
On Mon, Sep 22, 2014 at 11:09 AM, Dan Dillon <dgdillon at gmail.com
<mailto:dgdillon at gmail.com>> wrote:
Hi Ben. Apologies if I should be sending this to the list instead of
directly you, but I am still having trouble with lsmeans and wonder
if you can help. I have attached my data and list my code below. In
a nutshell, lsmeans works fine when I direct it towards a logistic
model where my DV is accuracy (coded 0 or 1), but it hangs for
hours, uses up all the application memory on my machine, and does
not generate a result when I direct it towards a linear model where
my DV is RT (response time).
I took your advice and tried working with subsets of the data, and
found that lsmeans can handle the RT model if the data set is small
enough. I have data from four sites: US, UK, JP, and SA. If I run a
model on RT data from three cases from the US, lsmeans runs for ~1
minute and generates a result. If I scale up and use all the data
from just the US and UK sites, lsmeans works for 20+ minutes but
gives a result. However, if I throw all the RT data at it, lsmeans
hangs for hours and uses up all my application memory without
generating a result. By contrast, if I use all the data but accuracy
is the DV (i.e., logistic model), I get an answer in seconds.
If you have any other ideas or thoughts here, I would greatly
appreciate it. I am new to R and lme4, and I come from a tradition
(experimental psychology) where many reviewers will want to see
follow-up test for interactions. lsmeans looks like the right tool
for the job, so I'd love to get it running for both my accuracy and
RT data.
Thanks for any help you can provide--code below.
Dan Dillon
# The data are from a flanker task, where stimuli are congruent or
incongruent (CON or INC). I have two groups of subjects.
flk = read.csv('my_data.csv')
flk$group <- as.factor(flk$group)
fm.acc <- glmer(accuracy ~ site + group*stimulus + (1|subject),
family = binomial, data = flk)
acc.rg <- ref.grid(fm.acc) # Works fine
print(acc.rg) # Works fine
lsmeans(fm.acc, ~group|stimulus) # Works fine
flk$accuracy <- as.factor(flk$accuracy) # I want to include accuracy
as a factor in the RT model
fm.rt <- lmer(rt ~ site + group*stimulus*accuracy + (1|subject),
data = flk) # Works fine, model results look good
rt.rg <- ref.grid(fm.rt) # Does not work, hangs for hours.
On Tue, Aug 26, 2014 at 5:07 PM, Ben Bolker <bbolker at gmail.com
<mailto:bbolker at gmail.com>> wrote:
Dan Dillon <dgdillon <at> gmail.com <http://gmail.com>> writes:
>
> Colleagues:
>
[snip]
>
> My data are from a behavioral experiment in which two groups
of subjects
> complete 200+ trials of a task with two conditions. Each
subject is tested
> in one of four separate locations. I record accuracy (0 or 1)
and response
> time (RT) on each trial--these are the DVs for the two
regressions. Thus,
> my dataframe has columns "location", "group", "subject", "trial",
> "condition", "accuracy", and "RT".
>
> The regression model for accuracy looks like this:
>
> acc.fm <http://acc.fm> = glmer(accuracy ~ location +
group*condition + (1|subject),
> family=binomial, data=my_data)
>
> The results look as expected and I'm using lsmeans to do some
follow-up
> analyses. For example, to compare accuracy by group and
condition, I'm
> doing this:
>
> acc.lsm <- lsmeans(acc.fm <http://acc.fm>, ~group|condition)
>
> pairs(acc.lsm)
>
>
[snip]
> Here is my model for the RT data
> (RT is a continuous variable so no logistic regression here):
>
> rt.fm <http://rt.fm> = lmer(rt ~ location +
group*condition*accuracy + (1|subject),
> data=my_data)
>
> The results from this regression look fine, but if I try this
. . .
>
> rt.lsm <- lsmeans(rt.fm <http://rt.fm> ~ group|condition)
>
> . . . or if I try to specify a reference grid like this . . .
>
> rt.rg <- ref.grid(rt.fm <http://rt.fm>)
>
> . . . my machine hangs.
>
[snip]
It's a little hard to say without a reproducible example, and
this question would probably be slightly more appropriate for
r-sig-mixed-models at r-project.org
<mailto:r-sig-mixed-models at r-project.org> (although I can't
actually tell
for sure whether it is an lme4-specific problem or a more general
ls.means::ref.grid question), but: how big a reference is ref.grid()
trying to construct? Is it fairly high-resolution/high-dimensional?
I would probably try some experiments with small subsets of your
data
to see how the results scale.
______________________________________________
R-help at r-project.org <mailto:R-help at r-project.org> mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
More information about the R-sig-mixed-models
mailing list