[R] degree of freedom GLM

Sun Jul 8 01:06:19 CEST 2012

Please use 'Reply All' in your responses. Others may be better
able to help than I.

See comments inline below.

On 2012-07-06 04:46, Jennifer Kaiser wrote:
>>You probably intended all the variables that are of
>>type "integer" (e.g. FAHRL_C) to be _factors_. My guess
>>is that, for ease of data entry, you coded these with
>>integers 1-7.
>
> Yes, that helped. Thank you very much!
> But I have another question.
> Now I use this:
>
>> Tabelle <- read.csv("C:\\Users\\Public\\Documents\\Bachelorarbeit\\eingaben-drittel\\eingabe8_positiv-123.csv" , header = T , sep=";")
>>
>> sb_ek_ber <-   Tabelle$sb_ek_ber
>> ALTERKAU_C <- as.factor(Tabelle$ALTERKAU_C)
>> JE_gewichtet <- Tabelle$JE_gewichtet
>> Alter_Jüngster_C_inkl_AlterNutz <- as.factor(Tabelle$Alter_Jüngster_C_inkl_AlterNutz)
>> NUTZKREIS  <- as.factor(Tabelle$NUTZKREIS)
>> RKL_U12 <-  as.factor(Tabelle$RKL_U12)
>> SF_Sonder_aufgefüllt <- as.factor(Tabelle$SF_Sonder_aufgefüllt)
>> schw_drittel_c <- as.factor(Tabelle$schw_drittel_c)
>>
>> Tabelle2 <- data.frame(sb_ek_ber, ALTERKAU_C, JE_gewichtet, Alter_Jüngster_C_inkl_AlterNutz,
>>                        NUTZKREIS, RKL_U12, SF_Sonder_aufgefüllt, schw_drittel_c)

You could have saved yourself a bit of effort by using
the 'colClasses' argument to read.csv.
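
A minimal sketch of what that could look like. The temporary file and
its three data rows are made up for illustration; only the column names
are taken from your data. A named 'colClasses' vector converts the
listed columns and leaves the others to the default type detection:

```r
# Hypothetical stand-in for the real semicolon-separated file.
csv_lines <- c(
  "sb_ek_ber;ALTERKAU_C;NUTZKREIS;JE_gewichtet",
  "0.001;1;1;0.384",
  "0.001;2;2;3.952",
  "0.002;3;2;2.810"
)
tmp <- tempfile(fileext = ".csv")
writeLines(csv_lines, tmp)

# Read the coded columns directly as factors; unnamed columns keep
# the default conversion (numeric here).
Tabelle <- read.csv(tmp, header = TRUE, sep = ";",
                    colClasses = c(ALTERKAU_C = "factor",
                                   NUTZKREIS  = "factor"))
str(Tabelle)
```

With this, the separate as.factor() calls and the second data frame
should not be needed.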

>> ypoi <- glm(formula= sb_ek_ber ~1+ ALTERKAU_C + Alter_Jüngster_C_inkl_AlterNutz +  NUTZKREIS+ RKL_U12  ,data=Tabelle2 , family = poisson(link=log))
>
>> drop1(ypoi, test="Chisq")
> Single term deletions
>
> Model:
> sb_ek_ber ~ 1 + ALTERKAU_C + Alter_Jüngster_C_inkl_AlterNutz +
>      NUTZKREIS + RKL_U12
>                                 Df   Deviance AIC       LRT  Pr(>Chi)
> <none>                             1.5513e+10 Inf
> ALTERKAU_C                       7 1.5604e+10 Inf  91365338 < 2.2e-16 ***
> Alter_Jüngster_C_inkl_AlterNutz  9 1.5754e+10 Inf 240862295 < 2.2e-16 ***
> NUTZKREIS                        3 1.5588e+10 Inf  74676698 < 2.2e-16 ***
> RKL_U12                         12 1.5599e+10 Inf  86395303 < 2.2e-16 ***
> ---
> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> There were 50 or more warnings (use warnings() to see the first 50)
>
> So what I get is a one-way analysis of variance for each individual
> characteristic.
> But I want an analysis of the whole GLM with all the
> characteristics and the change in the df.

So why are you using drop1?
Why not anova(ypoi)?
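
For illustration only, here is a small sketch on simulated data (the
data frame 'd' and its columns are hypothetical, not your data). The
sequential table has one row per term, and each factor uses
(number of levels - 1) degrees of freedom:

```r
set.seed(1)
n <- 200
# Two made-up factors with 3 and 4 levels.
d <- data.frame(
  f3 = factor(sample(1:3, n, replace = TRUE)),
  f4 = factor(sample(1:4, n, replace = TRUE))
)
# Poisson response whose mean depends on f3 only.
d$y <- rpois(n, lambda = exp(0.2 + 0.3 * (d$f3 == "2")))

fit <- glm(y ~ f3 + f4, data = d, family = poisson(link = "log"))

# Sequential analysis of deviance: terms are added first to last.
tab <- anova(fit, test = "Chisq")
print(tab)
```

Note that the terms are tested in the order given in the formula, so
the table can change if you reorder them.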

If you want "type II" anova, look at the Anova()
function (NB: capital 'A') in the car package.
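
A hedged sketch of such a call, on made-up data and assuming the car
package is installed (the data frame 'd' and the factors 'f3' and 'f4'
are hypothetical):

```r
set.seed(1)
n <- 200
d <- data.frame(
  f3 = factor(sample(1:3, n, replace = TRUE)),
  f4 = factor(sample(1:4, n, replace = TRUE))
)
d$y <- rpois(n, lambda = exp(0.2 + 0.3 * (d$f3 == "2")))
fit <- glm(y ~ f3 + f4, data = d, family = poisson(link = "log"))

if (requireNamespace("car", quietly = TRUE)) {
  # Type II likelihood-ratio tests: each term is tested after all
  # the other main effects, so the formula order does not matter.
  tab2 <- car::Anova(fit, type = "II", test.statistic = "LR")
  print(tab2)
} else {
  message("Package 'car' is not installed; skipping the type II table.")
}
```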

Peter Ehlers

>
> It would be really great if you could help me.
>
>
>
>
>
>
> ------------------------------------------------------------------------
> *From:* Peter Ehlers <ehlers at ucalgary.ca>
> *To:* Jennifer Kaiser <jennifer.kaiser1988 at yahoo.de>
> *CC:* "r-help at r-project.org" <r-help at r-project.org>
> *Sent:* 0:46 Tuesday, 3 July 2012
> *Subject:* Re: [R] degree of freedom GLM
>
> On 2012-07-02 02:37, Jennifer Kaiser wrote:
>  > Hi,
>  > I have a problem with the df.
>  > I read in a big csv file.
>  >
>  > Tabelle <-
> read.csv("C:\\Users\\Public\\Documents\\Bachelorarbeit\\eingabe8_durchnummeriert.csv"
> , header = T , sep=";")
>  >
>  >
>  > then I try this:
>  >
>  >> ygamma <- glm(Tabelle$sb_ek_ber ~1+ Tabelle$FAHRL_C +
> Tabelle$NUTZKREIS + Tabelle$schw_drittel_c  , family = Gamma)
>  >
>  >>  anova(ygamma, test="Chisq")
>  >
>  > Analysis of Deviance Table
>  >
>  > Model: Gamma, link: inverse
>  >
>  > Response: Tabelle$sb_ek_ber
>  >
>  > Terms added sequentially (first to last)
>  >
>  >
>  >                        Df Deviance Resid. Df Resid. Dev  Pr(>Chi)
>  > NULL                                   1236805   35451551
>  > Tabelle$FAHRL_C         1    33987   1236804   35417564 0.0018493 **
>  > Tabelle$NUTZKREIS       1    48903   1236803   35368661 0.0001880 ***
>  > Tabelle$schw_drittel_c  1    47328   1236802   35321334 0.0002388 ***
>  > ---
>  > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
>  >
>  >> str(Tabelle)
>  > 'data.frame':  1236806 obs. of  9 variables:
>  >  $ Alter_Jüngster_C_inkl_AlterNutz: int  1 1 1 1 1 1 1 1 1 1 ...
>  >  $ ALTERKAU_C                     : int  1 2 2 1 3 3 3 4 1 1 ...
>  >  $ FAHRL_C                        : int  1 2 1 3 4 3 3 1 5 1 ...
>  >  $ NUTZKREIS                      : int  1 2 2 2 2 2 2 1 1 2 ...
>  >  $ RKL_U12                        : int  1 1 1 2 3 4 4 3 5 6 ...
>  >  $ SF_Sonder_aufgefüllt           : int  1 2 3 4 4 4 4 5 6 7 ...
>  >  $ schw_drittel_c                 : int  1 2 3 4 3 3 3 3 1 1 ...
>  >  $ sb_ek_ber                      : num  0.001 0.001 0.001 0.001 0.001 0.001 0.001 0.001 0.001 0.001 ...
>  >  $ JE_gewichtet                   : num  0.384 3.952 3.952 2.81 3.952 ...
>  >
>  > I don't understand why the df are always 1.
>
> You probably intended all the variables that are of
> type "integer" (e.g. FAHRL_C) to be _factors_. My guess
> is that, for ease of data entry, you coded these with
> integers 1-7.
>
> You'll have to tell R that you want factors:
>
>    Tabelle$FAHRL_C <- factor(Tabelle$FAHRL_C)
>
> etc.
>
> Peter Ehlers
>
>  >
>  > it would be great if you could help me.
>  >
>
>