[R-sig-ME] Influence of the random effects on fixed effect estimates in mixed models and interpretation of fixed effects in relation to random effects.
Tom Wilding
Tom.Wilding at sams.ac.uk
Mon Sep 8 17:11:36 CEST 2014
Dear All
I have previously asked this question on StackExchange with no feedback thus far.
http://stats.stackexchange.com/questions/112030/why-and-how-does-the-inclusion-of-random-effects-in-mixed-models-influence-the-f
I would like to repeat this question here, as further reading has not turned up any answers. My question is about the influence that the random terms have on the fixed-effect estimates (e.g. the intercept), and how to interpret the intercept when different random terms (e.g. random intercept vs. random slope) are included in the model.
The following code can be run to illustrate my question:
library(lme4)     # glmer()
library(faraway)  # epilepsy data set
data(epilepsy)
log(mean(epilepsy$seizures))  # expected intercept in an intercept-only model = 2.5544
(Ep1a <- glm(seizures ~ 1, family = poisson, data = epilepsy))  # intercept = 2.554, as expected
(Ep1b <- glmer(seizures ~ 1 + (1 | id), family = poisson, data = epilepsy))  # intercept = 2.214
My understanding is that including the random term (id) tells the model that the measurements are repeated within subject (in this case). I can understand that this allows for the non-independence of the data: there are fewer than n=295 independent data points. But why does the fixed-effect intercept decrease? Is the decrease in this case because the model has 'more confidence' in the observations from those 'id' levels that were lower than the mean? If so, is this because the variance equals the mean in a Poisson distribution?
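One way to frame this is a small simulation, sketched below in base R. The values of n_id, n_rep, beta0 and sigma are made up for illustration and are not estimates from the epilepsy data. If the counts are Poisson given a normally distributed subject effect on the log scale, the log of the raw mean should sit near beta0 + sigma^2/2 rather than beta0, which would be one candidate explanation for the lower intercept in the mixed model.

```r
# Sketch only: all parameter values below are assumptions for illustration.
set.seed(1)
n_id  <- 2000   # hypothetical number of subjects
n_rep <- 5      # hypothetical repeated measures per subject
beta0 <- 2.0    # assumed fixed intercept (log scale)
sigma <- 0.9    # assumed SD of the subject random intercepts (log scale)

u  <- rnorm(n_id, mean = 0, sd = sigma)   # one random intercept per subject
id <- rep(seq_len(n_id), each = n_rep)    # subject index for each observation
y  <- rpois(n_id * n_rep, lambda = exp(beta0 + u[id]))

log_raw_mean <- log(mean(y))              # what an intercept-only glm would estimate
marginal     <- beta0 + sigma^2 / 2       # log of the population-averaged mean

print(c(beta0 = beta0,
        log_raw_mean = log_raw_mean,
        beta0_plus_half_var = marginal))
```

Here beta0 describes a subject whose random effect is zero, whereas log(mean(y)) averages over all subjects on the count scale, so the two need not agree.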
I note the following from this website: http://www.danielezrajohnson.com/glasgow_workshop.R, in relation to a model unrelated to the one I've specified above (I suggest you search for "average speaker"):
"...this model has random effects for speaker and word. The fixed effects reported are for a sort of average speaker and word. However, word, especially, tends to be a very skewed variable. There will always be a few very common words, that may favor or disfavor the response. The mixed model largely counteracts this weighting."
In my real example (for more details see the StackExchange question, link above), all the coefficients are considerably lower (2-3 units on the log scale) than the corresponding mean values for those factor combinations as seen in the raw data. I'm struggling to justify this, but in attempting to do so I've run some simulations (albeit using nlme - clunky code, which plots the raw data and various models, is available from me). In a simulation where there are two random 'sites' (I appreciate that this is many fewer than 'allowed'), where there is a random slope effect, and where an intercept-only model allowing a random slope is fitted, the fixed-effect intercept term is the Y-axis value where the lines for these two 'sites' meet (i.e. cross). I had anticipated it to be the mean of the site intercepts at a predictor value of zero. This means that if the two random slopes happen to run nearly parallel, the intercept term output by the model can be 'way off': the individual regression lines cross at some distance from the mean value of the data set. I'm not sure what this means for a more realistic 10 or more sites (or the >300 I have in my real data set), but I note that my real data are zero-inflated, and I wonder whether additional weight is given to those sites characterised by low counts, possibly because the variance associated with low values is also low?
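To put numbers on the 'nearly parallel' case, here is a minimal arithmetic sketch in base R. The intercepts and slopes are made up for illustration and are not taken from the simulations or the real data; it shows only how far the crossing point of two such lines can sit from the mean of their intercepts at zero, and makes no claim about what nlme or lme4 actually estimate.

```r
# Hypothetical per-site lines y = a + b * x (illustrative values only)
a <- c(site1 = 10, site2 = 12)       # intercepts at x = 0
b <- c(site1 = 1.00, site2 = 0.95)   # nearly parallel slopes

# The lines cross where a1 + b1 * x = a2 + b2 * x
x_cross <- (a[["site2"]] - a[["site1"]]) / (b[["site1"]] - b[["site2"]])
y_cross <- a[["site1"]] + b[["site1"]] * x_cross

mean_intercept <- mean(a)            # the value I had anticipated for the intercept

print(c(x_cross = x_cross, y_cross = y_cross, mean_intercept = mean_intercept))
```

With these values the lines cross at roughly x = 40, y = 50, while the mean of the two intercepts at x = 0 is 11; the closer the slopes, the further away the crossing point moves.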
What is represented by the intercept (and other terms in a factorial model) in relation to the random effects? Any pointers on this would be much appreciated.
Thanks
Tom.