[R] Understanding the intercept value in a multiple linear regression with categorical values
Joao Azevedo
joao.c.azevedo at gmail.com
Fri Jul 27 14:16:10 CEST 2012
Hi!
Thanks for the link. I've already stumbled upon that explanation. I'm
able to understand how the coding schemes are applied in the supplied
examples, but they only use a single explanatory variable. My problem
is with understanding the model when there are multiple categorical
explanatory variables.
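
To make the question concrete, here is a minimal sketch that compares the
intercept of the two-variable model with the quantities it might correspond
to. It assumes R's default treatment contrasts; the variable names
(fit_add, fit_int, ref_cell and so on) are only illustrative.

```r
# Additive model; assumes options("contrasts") is still at its default,
# so both factors use treatment contrasts with "A" and "L" as references.
fit_add <- lm(breaks ~ wool + tension, data = warpbreaks)
intercept_add <- coef(fit_add)[["(Intercept)"]]

# The model's prediction for the reference level of both factors.
ref_cell <- data.frame(
  wool    = factor("A", levels = levels(warpbreaks$wool)),
  tension = factor("L", levels = levels(warpbreaks$tension))
)
pred_ref <- unname(predict(fit_add, newdata = ref_cell))

# Candidate means computed directly from the data.
mean_A       <- with(warpbreaks, mean(breaks[wool == "A"]))
mean_L       <- with(warpbreaks, mean(breaks[tension == "L"]))
mean_A_or_L  <- with(warpbreaks, mean(breaks[wool == "A" | tension == "L"]))
mean_A_and_L <- with(warpbreaks, mean(breaks[wool == "A" & tension == "L"]))
grand_mean   <- mean(warpbreaks$breaks)

# Same data with the wool:tension interaction added.
fit_int <- lm(breaks ~ wool * tension, data = warpbreaks)
intercept_int <- coef(fit_int)[["(Intercept)"]]

print(c(
  intercept_additive    = intercept_add,
  predicted_A_L         = pred_ref,       # same as the additive intercept
  observed_mean_A_and_L = mean_A_and_L,   # matches the interaction intercept
  observed_mean_A_or_L  = mean_A_or_L,
  intercept_interaction = intercept_int
))

# warpbreaks is balanced (9 observations per wool/tension cell), so the
# additive fit for a cell is row mean + column mean - grand mean.
print(mean_A + mean_L - grand_mean)
```

So the additive intercept is the fitted value for wool "A" with tension "L",
which need not equal the observed mean of that cell unless the interaction
is included.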
--
Joao.
On Fri, Jul 27, 2012 at 1:04 PM, Jean V Adams <jvadams at usgs.gov> wrote:
> Joao,
>
> There's a very thorough explanation at
> http://www.ats.ucla.edu/stat/r/library/contrast_coding.htm
>
> Jean
>
>
> Joao Azevedo <joao.c.azevedo at gmail.com> wrote on 07/27/2012 06:32:31 AM:
>
>>
>> Hi!
>>
>> I'm failing to understand the value of the intercept in a
>> multiple linear regression with categorical variables. Taking the
>> "warpbreaks" data set as an example, when I do:
>>
>> > lm(breaks ~ wool, data=warpbreaks)
>>
>> Call:
>> lm(formula = breaks ~ wool, data = warpbreaks)
>>
>> Coefficients:
>> (Intercept)        woolB
>>      31.037       -5.778
>>
>> I understand that the intercept is the mean value of breaks when
>> wool equals "A", and that adding the "woolB" coefficient to the
>> intercept gives the mean value of breaks when wool equals "B".
>> However, if I also include the tension variable in the model, I'm
>> unable to figure out the meaning of the intercept:
>>
>> > lm(breaks ~ wool + tension, data=warpbreaks)
>>
>> Call:
>> lm(formula = breaks ~ wool + tension, data = warpbreaks)
>>
>> Coefficients:
>> (Intercept)        woolB     tensionM     tensionH
>>      39.278       -5.778      -10.000      -14.722
>>
>> I thought it would be the mean value of breaks when either wool equals
>> "A" or tension equals "L", but that isn't true for this dataset.
>>
>> Any clues on interpreting the value of the intercept?
>>
>> Thanks!
>>
>> --
>> Joao.