[R] GAM, GLM, Logit, infinite or missing values in 'x'

Prof Brian Ripley ripley at stats.ox.ac.uk
Tue Jan 8 09:18:51 CET 2008


On Tue, 8 Jan 2008, Anders Schwartz Corr wrote:

>
> Hi,
>
> I'm running gam (mgcv version 1.3-29) and glm (logit) (stats, R 2.6.1) on
> the same models/data, and I got error messages for the gam() model and
> warnings for the glm() model.
>
> R-help suggested that the glm() warning messages are due to the model
> perfectly predicting binary output. Perhaps the model overfits the data? I
> inspected my data and it was not immediately obvious to me (though I guess
> it will be to some of the more pointed of you) how this would be the case.

Only the clairvoyant, given that you didn't supply the data.  But this 
concept of complete/partial separation is well-known in certain fields 
(more in AI than in statistics).  See my PRNN book (Pattern Recognition and 
Neural Networks) for a comprehensive account, and

@Book{Santner.Duffy.89,
   author       = "T. J. Santner and D. E. Duffy",
   title        = "The Statistical Analysis of Discrete Data",
   publisher    = "Springer-Verlag",
   address      = "New York",
   year         = "1989",
   ISBN         = "0-387-97018-5",
   comment      = "Reference from MASS",
}

for a statistical book that covers it.
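
To see what complete separation looks like in practice, here is a minimal 
sketch on made-up data (the variables x and y are invented for illustration 
and have nothing to do with the poster's data).  A single covariate splits 
the 0s from the 1s perfectly, so the likelihood has no finite maximum and 
glm() typically gives the same kind of warning as quoted below.

```r
# Toy data with complete separation: y is 0 whenever x <= 5 and 1 otherwise.
# (Invented data, purely illustrative.)
x <- 1:10
y <- as.integer(x > 5)

# Collect warnings instead of printing them, so they can be inspected.
msgs <- character(0)
fit <- withCallingHandlers(
  glm(y ~ x, family = binomial),
  warning = function(w) {
    msgs <<- c(msgs, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

print(msgs)                 # the warnings glm.fit raised
print(range(fitted(fit)))   # fitted probabilities pushed towards 0 and 1
print(table(x > 5, y))      # empty off-diagonal cells reveal the separation
```

With many covariates the separating direction may be a linear combination 
rather than a single variable, so a table like this one will not always 
reveal it.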

> The gam() errors vanish when I delete one covariate (it doesn't matter
> which one). Can I write a loop into the code such that if an error is
> returned (is.error() doesn't seem to exist unfortunately) then I pare off

See ?try : try comes very close to is.error.
See also ?tryCatch.
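
A rough sketch of how try() might be used for this, under some assumptions: 
the helper fit_dropping_terms() and the stand-in fitter fussy_glm() are 
hypothetical names made up here, the data are simulated, and dropping the 
last term first is an arbitrary choice.  The same pattern should work with 
gam() passed as fit_fun, but that has not been checked against the model in 
question.

```r
# Hypothetical helper: refit with one fewer term each time the fit errors.
# fit_fun is any fitting function called as fit_fun(formula, data = ..., ...).
fit_dropping_terms <- function(fit_fun, response, term_labels, data, ...) {
  while (length(term_labels) > 0) {
    form <- reformulate(term_labels, response = response)
    fit <- try(fit_fun(form, data = data, ...), silent = TRUE)
    # inherits(fit, "try-error") plays the role of the missing is.error()
    if (!inherits(fit, "try-error")) {
      return(list(fit = fit, terms = term_labels))
    }
    message("Fit failed; dropping term: ", term_labels[length(term_labels)])
    term_labels <- term_labels[-length(term_labels)]
  }
  stop("no model could be fitted")
}

# Stand-in fitter (an assumption for the demo): fails with more than two terms.
fussy_glm <- function(formula, data, ...) {
  if (length(attr(terms(formula), "term.labels")) > 2) {
    stop("infinite or missing values in 'x'")
  }
  glm(formula, data = data, ...)
}

set.seed(1)
d <- data.frame(y = rbinom(50, 1, 0.5),
                a = rnorm(50), b = rnorm(50), c = rnorm(50))

res <- fit_dropping_terms(fussy_glm, "y", c("a", "b", "c"), d,
                          family = binomial)
print(res$terms)
```

Because the error is caught inside the helper, enclosing loops carry on 
rather than breaking.  tryCatch(expr, error = function(e) NULL) is an 
alternative if you prefer to test for NULL instead of the "try-error" class.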

> one of the covariates and rerun the gam()? That would be ideal. I could
> set options(error = f()) in which f() reruns the gam with
> one fewer covariate until it works, but the gam is in a bunch of loops
> that would break given the error and I would like to figure out another
> option.
>
> My glm and gam models are below. Any suggestions are very much
> appreciated.
>
> Best,
>
> Anders
>
>> form.logit
> outbinary ~ a_norm_total2 + I(a_norm_total2^2) + prop + igoprop +
>     gpconc + ter + open + igototal + cinc.nmc + demsOnumstat +
>     diversity + cincOter + polity2
>
>> form.glogit
> outbinary ~ s(a_norm_total2) + s(prop) + s(prop, by = a_norm_total2) +
>     igoprop + gpconc + ter + open + igototal + cinc.nmc + demsOnumstat +
>     diversity + cincOter + polity2
>
> GAM error message:
> avt.2glogit<-gam(form.glogit, data=dataS, na.action=na.omit,family=binomial)
> Error in eigen(hess1, symmetric = TRUE) :
>   infinite or missing values in 'x'
> Calls: gam -> gam.outer -> newton -> eigen
>
> GLM warnings:
> There were 29 warnings (use warnings() to see them)
>> warnings()
> Warning messages:
> 1: In glm.fit(x = X, y = Y, weights = weights, start = start,  ... :
>   fitted probabilities numerically 0 or 1 occurred
> 2: In glm.fit(x = X, y = Y, weights = weights, start = start,  ... :
>   fitted probabilities numerically 0 or 1 occurred
> 3: In glm.fit(x = X, y = Y, weights = weights, start = start,  ... :
>   fitted probabilities numerically 0 or 1 occurred
> 4: In glm.fit(x = X, y = Y, weights = weights, start = start,  ... :
>   fitted probabilities numerically 0 or 1 occurred

-- 
Brian D. Ripley,                  ripley at stats.ox.ac.uk
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595