[R-sig-ME] Nested random effect or unbalanced design?
Leppänen Ilkka
ilkka.j.leppanen at aalto.fi
Thu Sep 4 15:34:16 CEST 2014
Dear list,
I am using lmer to fit a mixed-effects model, but I am unsure whether my design has nested random effects or is simply unbalanced.
I have an experiment in which a physiological variable A is measured repeatedly on each subject. At each measurement, the subject sees one specific level of a variable B. The theory is that higher B gives higher A.
The trick is that levels of B are drawn randomly from some distribution for each subject. These draws are not reproducible in other experiments, so B should be incorporated somehow as a random effect.
Now if I want to know how B affects A as a random effect, I start from the model
(M1) A ~ (1|B) + time + (1|subject)
But in this model the random effect B has a very low variance. So I specify B as a fixed effect with the model
(M2) A ~ B + time + (1|subject)
This model is desirable because I am mostly interested in stating how B affects A as a fixed effect, not as a nuisance.
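For concreteness, here is a minimal sketch of how (M1) and (M2) could be fitted with lme4. The data frame `d` and everything simulated in it (30 subjects, 10 measurements each, a true B slope of 0.5) are assumptions made up for illustration, not the experimental data.

```r
library(lme4)

set.seed(1)
n_subj <- 30   # hypothetical number of subjects
n_meas <- 10   # hypothetical number of repeated measurements

d <- data.frame(
  subject = factor(rep(seq_len(n_subj), each = n_meas)),
  time    = rep(seq_len(n_meas), times = n_subj)
)
# Levels of B drawn at random for every measurement (assumed distribution).
d$B <- sample(1:5, nrow(d), replace = TRUE)
subj_int <- rnorm(n_subj, sd = 1)   # simulated subject intercepts
d$A <- 2 + 0.5 * d$B + 0.1 * d$time +
  subj_int[as.integer(d$subject)] + rnorm(nrow(d), sd = 1)

# (M1): B as a random intercept; lmer treats the grouping variable as a factor.
m1 <- lmer(A ~ time + (1 | B) + (1 | subject), data = d)

# (M2): B as a fixed (linear) effect.
m2 <- lmer(A ~ B + time + (1 | subject), data = d)

print(VarCorr(m1))   # variance components, including the one for B
print(fixef(m2))     # fixed-effect estimates, including the slope of B
```

Note that (M1) treats the levels of B as unordered groups, whereas (M2) estimates a single linear slope for B, so the two models answer somewhat different questions.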
However, if I understand nesting correctly, (M2) is not sufficient because subjects are nested in B. As a toy example, B has levels 1,2,3,4,5 and there are five repeated measurements. Subject 1 sees B={1,2,3,2,2}, subject 2 sees B={4,2,1,1,2}, subject 3 sees B={1,2,3,4,5}, and subject 4 sees only one level B={4,4,4,4,4}. A cross-tabulation of this looks like
> xtabs(~ B+subject, sparse=T)
5 x 4 sparse Matrix of class "dgCMatrix"
  S1 S2 S3 S4
1  1  2  1  .
2  3  2  1  .
3  1  .  1  .
4  .  1  1  5
5  .  .  1  .
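The toy example can be reproduced with the short sketch below. The data frame name `toy` and the `time` column are assumptions added for illustration. The cross-tabulation is dense here, so it shows zeros where the sparse version shows dots.

```r
# Toy data from the example: four subjects, five repeated measurements each.
toy <- data.frame(
  subject = factor(rep(paste0("S", 1:4), each = 5)),
  time    = rep(1:5, times = 4),
  B       = c(1, 2, 3, 2, 2,   # S1
              4, 2, 1, 1, 2,   # S2
              1, 2, 3, 4, 5,   # S3
              4, 4, 4, 4, 4)   # S4
)

# Cross-tabulation of B level by subject.
tab <- xtabs(~ B + subject, data = toy)
print(tab)

# Number of distinct levels of B that each subject saw.
levels_seen <- colSums(tab > 0)
print(levels_seen)
```

The last line makes the imbalance explicit: S3 sees every level of B, while S4 sees only one.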
Therefore, because each subject sees their unique set of levels of B, I would use the model with nested random effects
(M3) A ~ B + time + (1|subject) + (1|B:subject)
Here the variance of the random effect B:subject is very small relative to the variance of B or the residual variance. A likelihood ratio test finds no significant difference between (M3) and the simpler model (M2).
If I decide to go with (M2), I try to keep the random-effects structure maximal by adding by-subject random slopes for B
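A minimal sketch of the (M2) versus (M3) comparison follows. The simulated data frame `d` and the helper column `Bf` (an explicit factor copy of B, used only in the grouping term) are assumptions for illustration, and the data contain no true B:subject effect.

```r
library(lme4)

set.seed(2)
n_subj <- 30   # hypothetical number of subjects
n_meas <- 10   # hypothetical number of repeated measurements

d <- data.frame(
  subject = factor(rep(seq_len(n_subj), each = n_meas)),
  time    = rep(seq_len(n_meas), times = n_subj)
)
d$B  <- sample(1:5, nrow(d), replace = TRUE)
d$Bf <- factor(d$B)   # factor copy of B for the grouping term
subj_int <- rnorm(n_subj, sd = 1)
d$A <- 2 + 0.5 * d$B + 0.1 * d$time +
  subj_int[as.integer(d$subject)] + rnorm(nrow(d), sd = 1)

m2 <- lmer(A ~ B + time + (1 | subject), data = d)
m3 <- lmer(A ~ B + time + (1 | subject) + (1 | Bf:subject), data = d)

# Likelihood ratio test; anova() refits both models with ML before comparing.
lrt <- anova(m2, m3)
print(lrt)
p_value <- lrt[["Pr(>Chisq)"]][2]
```

Because the null hypothesis puts the B:subject variance on the boundary of its parameter space (zero), the p-value from this test tends to be conservative.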
(M4) A ~ B + time + (1+B|subject)
But here, the random intercept and slope are highly or perfectly correlated. My interpretation of this situation is that the by-subject random slopes are unidentifiable because of the nesting; for example, S4 in the toy example cannot have a "slope" for B.
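The sketch below shows one way to inspect this, again on simulated data. The data frame `d` is an assumption for illustration. Its first subject is forced to see a single level of B, like S4 in the toy example, and the data are generated without any true random slopes.

```r
library(lme4)

set.seed(3)
n_subj <- 30   # hypothetical number of subjects
n_meas <- 10   # hypothetical number of repeated measurements

d <- data.frame(
  subject = factor(rep(seq_len(n_subj), each = n_meas)),
  time    = rep(seq_len(n_meas), times = n_subj)
)
d$B <- sample(1:5, nrow(d), replace = TRUE)
d$B[d$subject == "1"] <- 4   # mimic S4: this subject only ever sees B = 4
subj_int <- rnorm(n_subj, sd = 1)
d$A <- 2 + 0.5 * d$B + 0.1 * d$time +
  subj_int[as.integer(d$subject)] + rnorm(nrow(d), sd = 1)

# (M4): by-subject random intercepts and random slopes for B.
m4 <- lmer(A ~ B + time + (1 + B | subject), data = d)

print(VarCorr(m4))            # includes the intercept-slope correlation
singular <- isSingular(m4)    # TRUE if the fit is on the boundary
print(singular)

# Subjects who saw only one level of B contribute no within-subject
# information about a slope for B.
n_levels <- sapply(split(d$B, d$subject), function(x) length(unique(x)))
print(names(n_levels)[n_levels == 1])
```

A correlation estimate at or near plus or minus one, or `isSingular()` returning TRUE, indicates that the data do not support the full random-effects structure of (M4).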
How should I proceed? Should I just forget about the fact that B is a random effect and pretend that my design is unbalanced?
Thank you in advance,
Ilkka Leppänen
Aalto University