[R] degree of freedom GLM
Peter Ehlers
ehlers at ucalgary.ca
Sun Jul 8 01:06:19 CEST 2012
Please use 'Reply All' in your responses. Others may be better
able to help than I.
See comments inline below.
On 2012-07-06 04:46, Jennifer Kaiser wrote:
>>You probably intended all the variables that are of
>>type "integer" (e.g. FAHRL_C) to be _factors_. My guess
>>is that, for ease of data entry, you coded these with
>>integers 1-7.
>
> Yes, that helped. Thank you very much!
> But I have another question.
> Now I use this:
>
>> Tabelle <- read.csv("C:\\Users\\Public\\Documents\\Bachelorarbeit\\eingaben-drittel\\eingabe8_positiv-123.csv" , header = T , sep=";")
>>
>> sb_ek_ber <- Tabelle$sb_ek_ber
>> ALTERKAU_C <- as.factor(Tabelle$ALTERKAU_C)
>> JE_gewichtet <- Tabelle$JE_gewichtet
>> Alter_Jüngster_C_inkl_AlterNutz <- as.factor(Tabelle$Alter_Jüngster_C_inkl_AlterNutz)
>> NUTZKREIS <- as.factor(Tabelle$NUTZKREIS)
>> RKL_U12 <- as.factor(Tabelle$RKL_U12)
>> SF_Sonder_aufgefüllt <- as.factor(Tabelle$SF_Sonder_aufgefüllt)
>> schw_drittel_c <- as.factor(Tabelle$schw_drittel_c)
>>
>> Tabelle2 <- data.frame(sb_ek_ber, ALTERKAU_C, JE_gewichtet, Alter_Jüngster_C_inkl_AlterNutz, NUTZKREIS, RKL_U12,
>>                        SF_Sonder_aufgefüllt, schw_drittel_c)
You could have saved yourself a bit of effort by using
the 'colClasses' argument to read.csv.
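A minimal sketch of what that could look like, using a small made-up CSV in place of the real file (the rows and the reduced set of columns are invented for illustration; only the column names come from this thread):

```r
# Hypothetical miniature of the CSV; the real file has about 1.2 million rows.
tmp <- tempfile(fileext = ".csv")
writeLines(c("sb_ek_ber;ALTERKAU_C;NUTZKREIS",
             "0.001;1;1",
             "0.250;2;2",
             "1.500;3;2"), tmp)

# A named colClasses vector is matched against the column names:
# the listed columns are read as factors, all others are left to
# read.csv's usual type detection.
Tabelle2 <- read.csv(tmp, header = TRUE, sep = ";",
                     colClasses = c(ALTERKAU_C = "factor",
                                    NUTZKREIS  = "factor"))
str(Tabelle2)
```

This does the conversion at read time, so the separate as.factor() calls and the rebuilt data frame are not needed.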
>> ypoi <- glm(formula= sb_ek_ber ~1+ ALTERKAU_C + Alter_Jüngster_C_inkl_AlterNutz + NUTZKREIS+ RKL_U12 ,data=Tabelle2 , family = poisson(link=log))
>
>> drop1(ypoi, test="Chisq")
> Single term deletions
>
> Model:
> sb_ek_ber ~ 1 + ALTERKAU_C + Alter_Jüngster_C_inkl_AlterNutz +
> NUTZKREIS + RKL_U12
>                                 Df   Deviance AIC       LRT  Pr(>Chi)
> <none>                             1.5513e+10 Inf
> ALTERKAU_C                       7 1.5604e+10 Inf  91365338 < 2.2e-16 ***
> Alter_Jüngster_C_inkl_AlterNutz  9 1.5754e+10 Inf 240862295 < 2.2e-16 ***
> NUTZKREIS                        3 1.5588e+10 Inf  74676698 < 2.2e-16 ***
> RKL_U12                         12 1.5599e+10 Inf  86395303 < 2.2e-16 ***
> ---
> Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> There were 50 or more warnings (use warnings() to see the first 50)
>
> So what I get is a one-way analysis of variance for each individual
> characteristic.
> But I want an analysis of the whole GLM with all the
> characteristics and the change in the df.
So why are you using drop1?
Why not anova(ypoi)?
If you want "type II" anova, look at the Anova()
function (NB: capital 'A') in the car package.
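A rough sketch of the difference, on simulated data (the data and the fit below are made up; only the variable names echo this thread):

```r
set.seed(1)
# Simulated stand-in: two factors with 4 and 3 levels, Poisson counts.
d <- data.frame(
  ALTERKAU_C = factor(rep(1:4, times = 50)),
  NUTZKREIS  = factor(rep(1:3, length.out = 200))
)
d$y <- rpois(200, lambda = 2)

fit <- glm(y ~ ALTERKAU_C + NUTZKREIS, data = d,
           family = poisson(link = "log"))

# Sequential analysis of deviance: terms are added one at a time,
# and the table shows Df and Resid. Df at each step.
tab <- anova(fit, test = "Chisq")
print(tab)

# drop1 instead compares the full model with each single term removed.
print(drop1(fit, test = "Chisq"))

# Type II tests need the car package (not loaded here):
# car::Anova(fit)
```

With factors, each term uses (number of levels - 1) degrees of freedom, which is the "change of the df" that shows up in the anova() table.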
Peter Ehlers
>
> It would be really great if you could help me.
>
> ------------------------------------------------------------------------
> *From:* Peter Ehlers <ehlers at ucalgary.ca>
> *To:* Jennifer Kaiser <jennifer.kaiser1988 at yahoo.de>
> *CC:* "r-help at r-project.org" <r-help at r-project.org>
> *Sent:* 0:46 Tuesday, 3 July 2012
> *Subject:* Re: [R] degree of freedom GLM
>
> On 2012-07-02 02:37, Jennifer Kaiser wrote:
> > Hi,
> > I have a problem with the df.
> > I read in a big csv file.
> >
> > Tabelle <- read.csv("C:\\Users\\Public\\Documents\\Bachelorarbeit\\eingabe8_durchnummeriert.csv", header = T, sep=";")
> >
> >
> > then I try this:
> >
> >> ygamma <- glm(Tabelle$sb_ek_ber ~1+ Tabelle$FAHRL_C + Tabelle$NUTZKREIS + Tabelle$schw_drittel_c , family = Gamma)
> >
> >> anova(ygamma, test="Chisq")
> >
> > Analysis of Deviance Table
> >
> > Model: Gamma, link: inverse
> >
> > Response: Tabelle$sb_ek_ber
> >
> > Terms added sequentially (first to last)
> >
> >
> >                        Df Deviance Resid. Df Resid. Dev  Pr(>Chi)
> > NULL                                 1236805   35451551
> > Tabelle$FAHRL_C         1    33987   1236804   35417564 0.0018493 **
> > Tabelle$NUTZKREIS       1    48903   1236803   35368661 0.0001880 ***
> > Tabelle$schw_drittel_c  1    47328   1236802   35321334 0.0002388 ***
> > ---
> > Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> >
> >> str(Tabelle)
> > 'data.frame': 1236806 obs. of 9 variables:
> > $ Alter_Jüngster_C_inkl_AlterNutz: int 1 1 1 1 1 1 1 1 1 1 ...
> > $ ALTERKAU_C : int 1 2 2 1 3 3 3 4 1 1 ...
> > $ FAHRL_C : int 1 2 1 3 4 3 3 1 5 1 ...
> > $ NUTZKREIS : int 1 2 2 2 2 2 2 1 1 2 ...
> > $ RKL_U12 : int 1 1 1 2 3 4 4 3 5 6 ...
> > $ SF_Sonder_aufgefüllt : int 1 2 3 4 4 4 4 5 6 7 ...
> > $ schw_drittel_c : int 1 2 3 4 3 3 3 3 1 1 ...
> > $ sb_ek_ber : num 0.001 0.001 0.001 0.001 0.001 0.001 0.001 0.001 0.001 0.001 ...
> > $ JE_gewichtet : num 0.384 3.952 3.952 2.81 3.952 ...
> >
> > I don't understand why the df are always 1.
>
> You probably intended all the variables that are of
> type "integer" (e.g. FAHRL_C) to be _factors_. My guess
> is that, for ease of data entry, you coded these with
> integers 1-7.
>
> You'll have to tell R that you want factors:
>
> Tabelle$FAHRL_C <- factor(Tabelle$FAHRL_C)
>
> etc.
>
> Peter Ehlers
>
> >
> > it would be great if you could help me.
> >
>
>