[R-sig-ME] Is a mixed effects appropriate?

Sat Mar 16 19:08:25 CET 2013

Ross Ahmed <rossahmed at ...> writes:

> 
> I am looking at differences in dates of maximum counts of geese at 3 sites
> in the UK. I am testing to see if the date of maximum count is different
> between 3 sites.
> 
> My data look like these created in R:
> 
>   df <- data.frame(day=c(sample(70:80, 10), sample(75:85, 10),
>                          sample(80:90, 10)),
>                    year=rep(2000:2009, 3),
>                    site=paste('site', sort(rep(1:3, 10))))
> 
> Head of dataframe:
> 
>   day year   site
> 1  78 2000 site 1
> 2  76 2001 site 1
> 3  71 2002 site 1
> 4  73 2003 site 1
> 5  75 2004 site 1
> 6  74 2005 site 1
> 
> The variable 'day' is the day number on which the maximum count of geese was
> recorded in that year. So in row 1, the maximum count was recorded 78
> days from the 1st Jan in that year.
> 
> I considered carrying out a simple ANOVA with day as response variable and
> site as predictor variable. However I've become aware that this would
> violate the assumption of independence. Is there a mixed effects model that
> is able to handle the dependence of the data? Alternatively, would some sort
> of time series analysis be more appropriate here?
> 

I would say that

library(nlme)
anova(lme(day~site, random=~1|year, data=df))

would be a reasonable model: it tests for differences among sites while
allowing for random variation among years.
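For anyone who wants to run this end to end, here is a minimal sketch.
It assumes the made-up data from the question; the fixed seed and the
object name fit1 are assumptions added only so the example is
reproducible and easy to refer to.

```r
library(nlme)  # lme() comes from the nlme package

set.seed(1)  # assumption: fixed seed so the made-up data are reproducible
df <- data.frame(day  = c(sample(70:80, 10), sample(75:85, 10),
                          sample(80:90, 10)),
                 year = rep(2000:2009, 3),
                 site = paste("site", sort(rep(1:3, 10))))

# Fixed effect of site, random intercept for each year
fit1 <- lme(day ~ site, random = ~ 1 | year, data = df)

anova(fit1)    # F test for differences among sites
VarCorr(fit1)  # estimated among-year and residual variation
```

With simulated data like these the among-year variance may well be
estimated as close to zero, since no year effect was built in.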

I do think it would make sense to look for trends across time as well:

anova(lme(day~site*year, random=~1|year, data=df))

  (There is both a fixed effect and a random effect of year in this
model, but that should be OK: the random effect treats year as a
categorical grouping variable, while the fixed effect treats it as a
numeric (continuous) covariate. Just make sure year is stored as a
number and not as a factor.)
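To see the two roles of year side by side, here is a minimal sketch. It
again assumes the made-up data from the question, with a seed added for
reproducibility; the name fit2 is just a label for this example.

```r
library(nlme)

set.seed(1)  # assumption: fixed seed so the made-up data are reproducible
df <- data.frame(day  = c(sample(70:80, 10), sample(75:85, 10),
                          sample(80:90, 10)),
                 year = rep(2000:2009, 3),
                 site = paste("site", sort(rep(1:3, 10))))

# The fixed-effect part needs year to be numeric, not a factor
is.numeric(df$year)

# year enters twice: as a numeric covariate (linear trend, and a
# site-specific trend via site:year) and as the grouping variable
# for the random intercept
fit2 <- lme(day ~ site * year, random = ~ 1 | year, data = df)

anova(fit2)  # rows for site, year, and the site:year interaction
```

The site:year row is the one to look at if the question is whether the
trend over time differs among sites.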

You should definitely make sure to look at graphical representations
of the data! But obviously your made-up data set doesn't
have anything to see in it ...
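
As a starting point for the graphics, something like the following
sketch would do. It uses only base R; the axis labels and the seed are
assumptions, and the data are again the made-up ones from the question.

```r
set.seed(1)  # assumption: fixed seed so the made-up data are reproducible
df <- data.frame(day  = c(sample(70:80, 10), sample(75:85, 10),
                          sample(80:90, 10)),
                 year = rep(2000:2009, 3),
                 site = paste("site", sort(rep(1:3, 10))))
df$site <- factor(df$site)

# One line per site: day of maximum count against year
interaction.plot(x.factor = df$year, trace.factor = df$site,
                 response = df$day, trace.label = "site",
                 xlab = "year", ylab = "day of maximum count")

# Distribution of the day of maximum count within each site
boxplot(day ~ site, data = df, ylab = "day of maximum count")
```

With real data, the first plot shows whether sites move together from
year to year, which is what the random effect of year is meant to
capture.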