[Rd] Suggestions for 'diff.default'
Suharto Anggono Suharto Anggono
suharto_anggono at yahoo.com
Tue Jan 29 04:32:33 CET 2013
--- On Mon, 28/1/13, Suharto Anggono Suharto Anggono <suharto_anggono at yahoo.com> wrote:
> From: Suharto Anggono Suharto Anggono <suharto_anggono at yahoo.com>
> Subject: Suggestions for 'diff.default'
> To: R-devel at lists.R-project.org
> Date: Monday, 28 January, 2013, 5:31 PM
> I have suggestions for function
> 'diff.default' in R.
>
>
> Suggestion 1: If the input is a matrix, always return a matrix,
> even if the result is empty.
>
> What happens in R 2.15.2:
>
> > rbind(1:2) # matrix
>      [,1] [,2]
> [1,]    1    2
> > diff(rbind(1:2)) # not matrix
> integer(0)
> > sessionInfo()
> R version 2.15.2 (2012-10-26)
> Platform: i386-w64-mingw32/i386 (32-bit)
>
> locale:
> [1] LC_COLLATE=English_United States.1252
> [2] LC_CTYPE=English_United States.1252
> [3] LC_MONETARY=English_United States.1252
> [4] LC_NUMERIC=C
> [5] LC_TIME=English_United States.1252
>
> attached base packages:
> [1] stats     graphics  grDevices utils     datasets  methods   base
>
>
> The documentation for 'diff' says, "If 'x' is a matrix then
> the difference operations are carried out on each column
> separately."
> If the result is empty, I expect that the result still has
> as many columns as the input.
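
As a small illustration of the shape asked for here (a sketch only, not a change to 'diff.default' itself): empty row subsetting with 'drop = FALSE' keeps the column dimension, whereas 'x[0L]' discards it. The names 'm', 'flat' and 'kept' are chosen just for this example.

```r
m <- rbind(1:2)                  # 1 x 2 integer matrix
flat <- m[0L]                    # empty vector, the 'dim' attribute is lost
kept <- m[0L, , drop = FALSE]    # empty matrix that still has 2 columns
dim(flat)                        # no dimensions left
dim(kept)                        # zero rows, two columns
```
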
>
>
> Suggestion 2: Make 'diff.default' applicable more generally by
> (a) not performing 'unclass';
> (b) generalizing (changing)
>     ismat <- is.matrix(x)
> to become
>     ismat <- length(dim(x)) == 2L
>
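
To make the difference between the two tests in point (b) concrete, here is a small sketch. The objects 'df', 'mat' and 'vec' are made up for illustration; the point is only that a data frame has a two-element 'dim' without being a matrix.

```r
df  <- data.frame(a = 1:3, b = c(2, 4, 8))
mat <- as.matrix(df)
vec <- 1:3

# Current test: TRUE only for real matrices
c(df = is.matrix(df), mat = is.matrix(mat), vec = is.matrix(vec))

# Proposed test: TRUE for anything with exactly two dimensions
two_dim <- function(x) length(dim(x)) == 2L   # hypothetical helper name
c(df = two_dim(df), mat = two_dim(mat), vec = two_dim(vec))
```
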
>
> If suggestion 1 is applied and 'unclass' is not wanted (that is,
> point (a) in suggestion 2 is also applied), then
>
>     if (lag * differences >= xlen)
>         return(x[0L])
>
> can be changed to
>
>     if (lag * differences >= xlen)
>         return(
>             if (ismat) x[0L, , drop = FALSE] - x[0L, , drop = FALSE] else
>             x[0L] - x[0L])
>
> This handles classes for which the subtraction (minus) operation
> changes the class of the result.
Sorry, I wasn't careful enough. To obtain the correct class for the result, differencing should be done as many times as specified by argument 'differences'.
I consider the case of
    diff(as.POSIXct(c("2012-01-01", "2012-02-01"), tz="UTC"), d=2)
versus
    diff(diff(as.POSIXct(c("2012-01-01", "2012-02-01"), tz="UTC")))
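
The two calls can be compared side by side with a sketch like the following. The names 'x', 'once' and 'twice' are illustrative, and 'differences' is spelled out instead of the abbreviated 'd'. No output is shown because the attributes of the empty results (for example the units) may depend on the R version.

```r
x <- as.POSIXct(c("2012-01-01", "2012-02-01"), tz = "UTC")

once  <- diff(x, differences = 2)   # one call asking for two differences
twice <- diff(diff(x))              # differencing applied twice by hand

# Both results are empty; compare their class and remaining attributes
str(once)
str(twice)
identical(attributes(once), attributes(twice))
```
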
To be safe, maybe just compute as usual, even when it is known that the end result will be empty. It can be done like this.
    empty <- integer()
    if (ismat)
        for (i in seq_len(differences))
            r <- if (lag >= nrow(r))
                r[empty, , drop = FALSE] - r[empty, , drop = FALSE] else
                ...
    else
        for (i in seq_len(differences))
            r <- if (lag >= length(r))
                r[empty] - r[empty] else
                ...
With this approach, 'xlen' is no longer needed.
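
For concreteness, here is one way the whole idea could look as a standalone function. This is only a sketch: 'diff_sketch' is a hypothetical name, not part of base R, and the non-empty branches (left as "..." above) are assumed to be the usual lagged subtraction. It applies points (a) and (b) of suggestion 2 and always runs the loop 'differences' times.

```r
# Hypothetical helper, not base R: 'diff.default' with the suggestions applied.
diff_sketch <- function(x, lag = 1L, differences = 1L) {
    if (length(lag) != 1L || length(differences) > 1L ||
        lag < 1L || differences < 1L)
        stop("'lag' and 'differences' must be integers >= 1")
    ismat <- length(dim(x)) == 2L    # suggestion 2(b)
    r <- x                           # suggestion 2(a): no 'unclass'
    empty <- integer()
    for (i in seq_len(differences)) {
        n <- if (ismat) nrow(r) else length(r)
        if (ismat) {
            if (lag >= n) {
                # empty result, but still one subtraction per difference
                r <- r[empty, , drop = FALSE] - r[empty, , drop = FALSE]
            } else {
                # assumed lagged subtraction on rows
                r <- r[-seq_len(lag), , drop = FALSE] -
                    r[seq_len(n - lag), , drop = FALSE]
            }
        } else {
            if (lag >= n) {
                r <- r[empty] - r[empty]
            } else {
                # assumed lagged subtraction on elements
                r <- r[-seq_len(lag)] - r[seq_len(n - lag)]
            }
        }
    }
    r
}

dim(diff_sketch(rbind(1:2)))                        # columns are kept
diff_sketch(c(1, 4, 9, 16), differences = 2)        # ordinary numeric input
diff_sketch(as.Date(c("2012-01-01", "2012-02-01"))) # class comes from "-"
```

Because the class of the result comes entirely from the "-" method of the input, no special case for the empty result is needed.
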
>
> Otherwise, if 'unclass' is wanted, maybe the handling of
> empty result can be moved to be after 'unclass', to be
> consistent with non-empty result.
>
>
> If point (a) in suggestion 2 is applied, 'diff.default' can
> handle input of class "Date" and "POSIXt". If, in addition,
> point (b) in suggestion 2 is also applied, 'diff.default'
> can handle a data frame as input.
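
The data frame case relies on row subsetting with 'drop = FALSE' and on "-" both working on data frames. A minimal sketch of that single differencing step, with a made-up data frame 'df':

```r
df <- data.frame(a = c(1, 4, 9), b = c(2, 4, 8))
n <- nrow(df)

# One lag-1 difference, done the same way as for a matrix
d <- df[-1L, , drop = FALSE] - df[-n, , drop = FALSE]
d
is.data.frame(d)
```
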
>