[Rd] Support for user defined unary functions

Fri Mar 17 23:41:34 CET 2017

I can see that allowing a user-defined unary prefix operator can be useful.
We want to make sure its precedence and associative behavior are
convenient for a variety of envisioned uses, as we won't get a chance
to change them after the language construct is introduced.

An example of precedence getting in the way is when you would like
to define a matrix-exponentiation operator, %^%.  It will have the same
precedence as %% so
   - x %^% 2
will be equivalent to
   (-x) %^% 2
and not to
   - (x %^% 2)

[A long time ago someone wanted new postfix operators (e.g., bang for
factorial).  Is that desire still around?]

Bill Dunlap
TIBCO Software
wdunlap tibco.com

On Fri, Mar 17, 2017 at 3:02 PM, Jim Hester <james.f.hester at gmail.com> wrote:
> The unquoting discussion is IMHO separate from this proposal and as
> you noted probably better served by a native operator with different
> precedence.
>
> I think the main benefit to providing user defined prefix operators is
> it allows package authors to experiment with operator ideas and gauge
> community interest. The current situation means any novel unary
> semantics either need to co-opt existing unary operators or propose
> changes to the R parser, neither of which is ideal for
> experimentation.
>
> The user defined pipe operator (%>%), now used by > 300 packages, is
> an example that giving package authors the power to experiment can
> produce beneficial ideas for the community.
>
> On Fri, Mar 17, 2017 at 9:46 AM, Gabriel Becker <gmbecker at ucdavis.edu> wrote:
>> Jim,
>>
>> One more note about precedence. It prevents a solution like the one you
>> proposed from solving all of the problems you cited. By my reckoning, a
>> "What comes next is for NSE" unary operator needs an extremely low
>> precedence, because it needs to greedily grab "everything" (or a large
>> amount) that comes after it. Normal-style unary operators, on the other
>> hand, explicitly don't want that.
>>
>> From what I can see, your patch provides support for the latter but not the
>> former.
>>
>> That said I think there are two issues here. One is can users define unary
>> operators. FWIW my opinion on that is roughly neutral to slightly positive.
>> The other issue is can we have quasi quotation of the type that Hadley and
>> Lionel need in the language. This could be solved without allowing
>> user-defined unary specials, and we would probably want it to be, as I doubt
>> ~ %!%x + %!%y + z is  particularly aesthetically appealing to most (it isn't
>> to me). I'd propose coopting unary @ for that myself. After off list
>> discussions with Jonathan Carrol and with Michael Lawrence I think it's
>> doable, unambiguous, and even imo pretty intuitive for an "unquote"
>> operator.
>>
>> Best,
>> ~G
>>
>> On Fri, Mar 17, 2017 at 5:10 AM, Jim Hester <james.f.hester at gmail.com>
>> wrote:
>>>
>>> I agree there is no reason they _need_ to be the same precedence, but
>>> I think SPECIALS are already have the proper precedence for both unary
>>> and binary calls. Namely higher than all the binary operators (except
>>> for `:`), but lower than the other unary operators. Even if we gave
>>> unary specials their own precedence I think it would end up in the
>>> same place.
>>>
>>>     `%l%` <- function(x) tail(x, n = 1)
>>>     %l% 1:5
>>>     #> [1] 5
>>>     %l% -5:-10
>>>     #> [1] -10
>>>
>>> On Thu, Mar 16, 2017 at 6:57 PM, William Dunlap <wdunlap at tibco.com> wrote:
>>> > I am biased against introducing new syntax, but if one is
>>> > experimenting with it one should make sure the precedence feels right.
>>> > I think the unary and binary minus-sign operators have different
>>> > precedences so I see no a priori reason to make the unary and binary
>>> > %xxx% operators to be the same.
>>> > Bill Dunlap
>>> > TIBCO Software
>>> > wdunlap tibco.com
>>> >
>>> >
>>> > On Thu, Mar 16, 2017 at 3:18 PM, Michael Lawrence
>>> > <lawrence.michael at gene.com> wrote:
>>> >> I guess this would establish a separate "namespace" of symbolic prefix
>>> >> operators, %*% being an example in the infix case. So you could have
>>> >> stuff
>>> >> like %?%, but for non-symbolic (spelled out stuff like %foo%), it's
>>> >> hard to
>>> >> see the advantage vs. foo(x).
>>> >>
>>> >> Those examples you mention should probably be addressed (eventually) in
>>> >> the
>>> >> core language, and it looks like people are already able to experiment,
>>> >> so
>>> >> I'm not sure there's a significant impetus for this change.
>>> >>
>>> >> Michael
>>> >>
>>> >>
>>> >> On Thu, Mar 16, 2017 at 10:51 AM, Jim Hester <james.f.hester at gmail.com>
>>> >> wrote:
>>> >>
>>> >>> I used the `function(x)` form to explicitly show the function was
>>> >>> being called with only one argument, clearly performance implications
>>> >>> are not relevant for these examples.
>>> >>>
>>> >>> I think of this mainly as a gap in the tooling we provide users and
>>> >>> package authors. R has native prefix `+1`, functional `f(1)` and infix
>>> >>> `1 + 1` operators, but we only provide a mechanism to create user
>>> >>> defined functional and infix operators.
>>> >>>
>>> >>> One could also argue that the user defined infix operators are also
>>> >>> ugly and could be replaced by `f(a, b)` calls as well; beauty is in
>>> >>> the eye of the beholder.
>>> >>>
>>> >>> The unquote example [1] shows one example where this gap in tooling
>>> >>> caused authors to co-opt existing unary exclamation operator, this
>>> >>> same gap is part of the reason the formula [2] and question mark [3]
>>> >>> operators have been used elsewhere in non standard contexts.
>>> >>>
>>> >>> If the language provided package authors with a native way to create
>>> >>> unary operators like it already does for the other operator types
>>> >>> these machinations would be unnecessary.
>>> >>>
>>> >>> [1]: https://github.com/hadley/rlang/blob/master/R/tidy-unquote.R#L17
>>> >>> [2]: https://cran.r-project.org/package=ensurer
>>> >>> [3]: https://cran.r-project.org/package=types
>>> >>>
>>> >>> On Thu, Mar 16, 2017 at 1:04 PM, Gabriel Becker <gmbecker at ucdavis.edu>
>>> >>> wrote:
>>> >>> > Martin,
>>> >>> >
>>> >>> > Jim can speak directly to his motivations; I don't claim to be able
>>> >>> > to do
>>> >>> > so. That said, I suspect this is related to a conversation on
>>> >>> > twitter
>>> >>> about
>>> >>> > wanting an infix "unquote" operator in the context of the
>>> >>> > non-standard
>>> >>> > evaluation framework Hadley Wickham and Lionel Henry (and possibly
>>> >>> others)
>>> >>> > are working on.
>>> >>> >
>>> >>> > They're currently using !!! and !! for things related to this, but
>>> >>> > this
>>> >>> > effectively requires non-standard parsing, as ~!!x is interpreted as
>>> >>> > ~(`!!`(x)) rather than ~(!(!(x)) as the R parser understands it.
>>> >>> > Others
>>> >>> and
>>> >>> > I pointed out this was less than desirable, but if something like it
>>> >>> > was
>>> >>> > going to happen it would hopefully happen in the language
>>> >>> > specification,
>>> >>> > rather than in a package (and also hopefully not using !!
>>> >>> > specifically).
>>> >>> >
>>> >>> > Like you, I actually tend to prefer the functional form myself in
>>> >>> > most
>>> >>> > cases. There are functional forms that would work for the above case
>>> >>> (e.g.,
>>> >>> > something like the .() that DBI uses), but that's probably off topic
>>> >>> here,
>>> >>> > and not a decision I'm directly related to anyway.
>>> >>> >
>>> >>> > Best,
>>> >>> > ~G
>>> >>> >
>>> >>> >
>>> >>> >
>>> >>> > On Thu, Mar 16, 2017 at 9:51 AM, Martin Maechler
>>> >>> > <maechler at stat.math.ethz.ch> wrote:
>>> >>> >>
>>> >>> >> >>>>> Jim Hester <james.f.hester at gmail.com>
>>> >>> >> >>>>>     on Thu, 16 Mar 2017 12:31:56 -0400 writes:
>>> >>> >>
>>> >>> >>     > Gabe,
>>> >>> >>     > The unary functions have the same precedence as normal
>>> >>> >> SPECIALS
>>> >>> >>     > (although the new unary forms take precedence over binary
>>> >>> SPECIALS).
>>> >>> >>     > So they are lower precedence than unary + and -. Yes, both of
>>> >>> >> your
>>> >>> >>     > examples are valid with this patch, here are the results and
>>> >>> quoted
>>> >>> >>     > forms to see the precedence.
>>> >>> >>
>>> >>> >>     > `%chr%` <- function(x) as.character(x)
>>> >>> >>
>>> >>> >>   [more efficient would be     `%chr%` <- as.character]
>>> >>> >>
>>> >>> >>     > `%identical%` <- function(x, y) identical(x, y)
>>> >>> >>     > quote("100" %identical% %chr% 100)
>>> >>> >>     > #>  "100" %identical% (`%chr%`(100))
>>> >>> >>
>>> >>> >>     > "100" %identical% %chr% 100
>>> >>> >>     > #> [1] TRUE
>>> >>> >>
>>> >>> >>     > `%num%` <- as.numeric
>>> >>> >>     > quote(1 + - %num% "5")
>>> >>> >>     > #> 1 + -(`%num%`("5"))
>>> >>> >>
>>> >>> >>     > 1 + - %num% "5"
>>> >>> >>     > #> [1] -4
>>> >>> >>
>>> >>> >>     > Jim
>>> >>> >>
>>> >>> >> I'm sorry to be a bit of a spoiler to "coolness", but
>>> >>> >> you may know that I like to  applaud Norm Matloff for his book
>>> >>> >> title "The Art of R Programming",
>>> >>> >> because for me good code should also be beautiful to some extent.
>>> >>> >>
>>> >>> >> I really very much prefer
>>> >>> >>
>>> >>> >>        f(x)
>>> >>> >> to    %f% x
>>> >>> >>
>>> >>> >> and hence I really really really cannot see why anybody would
>>> >>> >> prefer
>>> >>> >> the ugliness of
>>> >>> >>
>>> >>> >>            1 + - %num% "5"
>>> >>> >> to
>>> >>> >>            1 + -num("5")
>>> >>> >>
>>> >>> >> (after setting  num <- as.numeric )
>>> >>> >>
>>> >>> >> Martin
>>> >>> >>
>>> >>> >>
>>> >>> >>     > On Thu, Mar 16, 2017 at 12:01 PM, Gabriel Becker
>>> >>> >> <gmbecker at ucdavis.edu> wrote:
>>> >>> >>     >> Jim,
>>> >>> >>     >>
>>> >>> >>     >> This seems cool. Thanks for proposing it. To be concrete, he
>>> >>> >> user-defined
>>> >>> >>     >> unary operations would be of the same precedence (or just
>>> >>> slightly
>>> >>> >> below?)
>>> >>> >>     >> built-in unary ones? So
>>> >>> >>     >>
>>> >>> >>     >> "100" %identical% %chr% 100
>>> >>> >>     >>
>>> >>> >>     >> would work and return TRUE under your patch?
>>> >>> >>     >>
>>> >>> >>     >> And  with %num% <- as.numeric, then
>>> >>> >>     >>
>>> >>> >>     >> 1 + - %num% "5"
>>> >>> >>     >>
>>> >>> >>     >> would also be legal (though quite ugly imo) and work?
>>> >>> >>     >>
>>> >>> >>     >> Best,
>>> >>> >>     >> ~G
>>> >>> >>     >>
>>> >>> >>     >> On Thu, Mar 16, 2017 at 7:24 AM, Jim Hester
>>> >>> >> <james.f.hester at gmail.com>
>>> >>> >>     >> wrote:
>>> >>> >>     >>>
>>> >>> >>     >>> R has long supported user defined binary (infix) functions,
>>> >>> >> defined
>>> >>> >>     >>> with `%fun%`. A one line change [1] to R's grammar allows
>>> >>> >> users
>>> >>> to
>>> >>> >>     >>> define unary (prefix) functions in the same manner.
>>> >>> >>     >>>
>>> >>> >>     >>> `%chr%` <- function(x) as.character(x)
>>> >>> >>     >>> `%identical%` <- function(x, y) identical(x, y)
>>> >>> >>     >>>
>>> >>> >>     >>> %chr% 100
>>> >>> >>     >>> #> [1] "100"
>>> >>> >>     >>>
>>> >>> >>     >>> %chr% 100 %identical% "100"
>>> >>> >>     >>> #> [1] TRUE
>>> >>> >>     >>>
>>> >>> >>     >>> This seems a natural extension of the existing
>>> >>> >> functionality and
>>> >>> >>     >>> requires only a minor change to the grammar. If this change
>>> >>> seems
>>> >>> >>     >>> acceptable I am happy to provide a complete patch with
>>> >>> >> suitable
>>> >>> >> tests
>>> >>> >>     >>> and documentation.
>>> >>> >>     >>>
>>> >>> >>     >>> [1]:
>>> >>> >>     >>> Index: src/main/gram.y
>>> >>> >>     >>>
>>> >>> >> ===================================================================
>>> >>> >>     >>> --- src/main/gram.y     (revision 72358)
>>> >>> >>     >>> +++ src/main/gram.y     (working copy)
>>> >>> >>     >>> @@ -357,6 +357,7 @@
>>> >>> >>     >>> |       '+' expr %prec UMINUS           { $$ =
>>> >>> >> xxunary($1,$2);
>>> >>> >>     >>> setId( $$, @$); }
>>> >>> >>     >>> |       '!' expr %prec UNOT             { $$ =
>>> >>> >> xxunary($1,$2);
>>> >>> >>     >>> setId( $$, @$); }
>>> >>> >>     >>> |       '~' expr %prec TILDE            { $$ =
>>> >>> >> xxunary($1,$2);
>>> >>> >>     >>> setId( $$, @$); }
>>> >>> >>     >>> +       |       SPECIAL expr                    { $$ =
>>> >>> >> xxunary($1,$2);
>>> >>> >>     >>> setId( $$, @$); }
>>> >>> >>     >>> |       '?' expr                        { $$ =
>>> >>> >> xxunary($1,$2);
>>> >>> >>     >>> setId( $$, @$); }
>>> >>> >>     >>>
>>> >>> >>     >>> |       expr ':'  expr                  { $$ =
>>> >>> >>     >>> xxbinary($2,$1,$3);      setId( $$, @$); }
>>> >>> >>     >>>
>>> >>> >>     >>> ______________________________________________
>>> >>> >>     >>> R-devel at r-project.org mailing list
>>> >>> >>     >>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> >>> >>     >>
>>> >>> >>     >>
>>> >>> >>     >>
>>> >>> >>     >>
>>> >>> >>     >> --
>>> >>> >>     >> Gabriel Becker, PhD
>>> >>> >>     >> Associate Scientist (Bioinformatics)
>>> >>> >>     >> Genentech Research
>>> >>> >>
>>> >>> >>     > ______________________________________________
>>> >>> >>     > R-devel at r-project.org mailing list
>>> >>> >>     > https://stat.ethz.ch/mailman/listinfo/r-devel
>>> >>> >
>>> >>> >
>>> >>> >
>>> >>> >
>>> >>> > --
>>> >>> > Gabriel Becker, PhD
>>> >>> > Associate Scientist (Bioinformatics)
>>> >>> > Genentech Research
>>> >>>
>>> >>> ______________________________________________
>>> >>> R-devel at r-project.org mailing list
>>> >>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>> >>>
>>> >>
>>> >>         [[alternative HTML version deleted]]
>>> >>
>>> >> ______________________________________________
>>> >> R-devel at r-project.org mailing list
>>> >> https://stat.ethz.ch/mailman/listinfo/r-devel
>>>
>>> ______________________________________________
>>> R-devel at r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>
>>
>>
>>
>> --
>> Gabriel Becker, PhD
>> Associate Scientist (Bioinformatics)
>> Genentech Research