[Rd] named arguments in formula and terms
Achim Zeileis
Achim.Zeileis at uibk.ac.at
Mon Mar 13 16:26:16 CET 2017
Martin, thanks for the follow-up!
On Mon, 13 Mar 2017, Martin Maechler wrote:
> Dear Achim,
>
>>>>>> Achim Zeileis <Achim.Zeileis at r-project.org>
>>>>>> on Fri, 10 Mar 2017 15:02:38 +0100 writes:
>
> > Hi, we came across the following unexpected (for us)
> > behavior in terms.formula: When determining whether a term
> > is duplicated, only the order of the arguments in function
> > calls seems to be checked but not their names. Thus the
> > terms f(x, a = z) and f(x, b = z) are deemed to be
> > duplicated and one of the terms is thus dropped.
>
> R> attr(terms(y ~ f(x, a = z) + f(x, b = z)), "term.labels")
> > [1] "f(x, a = z)"
>
> > However, changing the arguments or the order of arguments
> > keeps both terms:
>
> R> attr(terms(y ~ f(x, a = z) + f(x, b = zz)), "term.labels")
> > [1] "f(x, a = z)" "f(x, b = zz)"
> R> attr(terms(y ~ f(x, a = z) + f(b = z, x)), "term.labels")
> > [1] "f(x, a = z)" "f(b = z, x)"
>
> > Is this intended behavior or needed for certain terms?
>
> > We came across this problem when setting up certain smooth
> > regressors with different kinds of patterns. As a trivial
> > simplified example we can generate the same kind of
> > problem with rep(). Consider the two dummy variables rep(x
> > = 0:1, each = 4) and rep(x = 0:1, times = 4). With the
> > response y = 1:8 I get:
>
> R> lm((1:8) ~ rep(x = 0:1, each = 4) + rep(x = 0:1, times = 4))
>
> > Call:
> > lm(formula = (1:8) ~ rep(x = 0:1, each = 4) + rep(x = 0:1, times = 4))
> >
> > Coefficients:
> >            (Intercept)  rep(x = 0:1, each = 4)
> >                    2.5                     4.0
>
> > So while the model is identified because the two
> > regressors are not the same, terms.formula does not
> > recognize this and drops the second regressor. What I
> > would have wanted can be obtained by switching the
> > arguments:
>
> R> lm((1:8) ~ rep(each = 4, x = 0:1) + rep(x = 0:1, times = 4))
>
> > Call:
> > lm(formula = (1:8) ~ rep(each = 4, x = 0:1) + rep(x = 0:1, times = 4))
> >
> > Coefficients:
> >             (Intercept)   rep(each = 4, x = 0:1)  rep(x = 0:1, times = 4)
> >                       2                        4                        1
>
> > Of course, here I could avoid the problem by setting up
> > proper factors etc. But to me this looks like a potential
> > bug in terms.formula...
>
> I agree that there is a bug.
OK, good. I just wasn't sure whether I had missed some documentation
somewhere that this is intended behavior.
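
For illustration, a minimal sketch of the workaround alluded to in the
original message: set up the regressors before they reach the formula, so
that each term is a plain variable name and no function-call arguments have
to be compared. The names x1 and x2 are placeholders introduced here, not
part of the original report.

```r
# Response and the two dummy regressors from the rep() example
y  <- 1:8
x1 <- rep(x = 0:1, each = 4)   # 0 0 0 0 1 1 1 1
x2 <- rep(x = 0:1, times = 4)  # 0 1 0 1 0 1 0 1

# Both terms are plain symbols, so neither can be mistaken for a duplicate
fit <- lm(y ~ x1 + x2)
attr(terms(fit), "term.labels")
coef(fit)
```

The coefficients should agree with the 2, 4, 1 obtained above by switching
the argument order.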
> According to https://www.r-project.org/bugs.html
> I have generated an R bugzilla account for you so you can report
> it there (for "book keeping", posterity, etc).
Thanks, I had already looked at that but waited for feedback on this list
first.
> > Thanks in advance for any insights, Z
>
> and thank *you* (and Nikolaus ?) for the report!
No problem. Niki found the problem and I came up with the simplified
example. In any case, I just posted a slightly modified version of my
e-mail as #17235 on Bugzilla:
https://bugs.R-project.org/bugzilla/show_bug.cgi?id=17235
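
In case it is useful for checking other formulas, here is a rough sketch of
how one might detect a silently dropped term. count_summands() is a
hypothetical helper written only for this illustration (it is not part of
base R), and it assumes a purely additive right-hand side without
interactions, "-" or ".". Whether a term is actually dropped depends on the
R version in use.

```r
# Hypothetical helper: count the top-level summands of an expression
count_summands <- function(e) {
  if (is.call(e) && identical(e[[1L]], as.name("+")) && length(e) == 3L) {
    count_summands(e[[2L]]) + count_summands(e[[3L]])
  } else {
    1L
  }
}

fml <- y ~ f(x, a = z) + f(x, b = z)

# terms() works symbolically, so f, x and z need not exist
n_written <- count_summands(fml[[3L]])                 # summands as written
n_kept    <- length(attr(terms(fml), "term.labels"))   # terms that survived

# TRUE would indicate that terms() treated some term as a duplicate
n_kept < n_written
```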
Thanks & best wishes,
Z
> Best regards,
> Martin
>
>