[R] Unexpected behavior of "apply" when FUN=sample
(Ted Harding)
Ted.Harding at wlandres.net
Tue May 14 12:07:50 CEST 2013
On 14-May-2013 09:46:32 Duncan Murdoch wrote:
> On 13-05-14 4:52 AM, Luca Nanetti wrote:
>> Dear experts,
>>
>> I wanted to signal a peculiar, unexpected behaviour of 'apply'.
>> It is not a bug, it is per spec, but it is so counterintuitive
>> that I thought it could be interesting.
>>
>> I have an array, let's say "test", dim=c(7,5).
>>
>>> test <- array(1:35, dim=c(7, 5))
>>> test
>>
>>      [,1] [,2] [,3] [,4] [,5]
>> [1,]    1    8   15   22   29
>> [2,]    2    9   16   23   30
>> [3,]    3   10   17   24   31
>> [4,]    4   11   18   25   32
>> [5,]    5   12   19   26   33
>> [6,]    6   13   20   27   34
>> [7,]    7   14   21   28   35
>>
>> I want a new array in which the content of each row (column) is
>> permuted, differently per row (per column).
>>
>> Let's start with the columns, i.e. the second MARGIN of the array:
>>> test.m2 <- apply(test, 2, sample)
>>> test.m2
>>
>>      [,1] [,2] [,3] [,4] [,5]
>> [1,]    1   10   18   23   32
>> [2,]    7    9   16   25   30
>> [3,]    6   14   17   22   33
>> [4,]    4   11   15   24   34
>> [5,]    2   12   21   28   31
>> [6,]    5    8   20   26   29
>> [7,]    3   13   19   27   35
>>
>> Perfect. That was exactly what I wanted: the content of each column is
>> shuffled, and differently for each column.
>> However, if I do the same with the rows (MARGIN = 1), the output is
>> transposed!
>>
>>> test.m1 <- apply(test, 1, sample)
>>> test.m1
>>
>>      [,1] [,2] [,3] [,4] [,5] [,6] [,7]
>> [1,]    1    2    3    4    5   13   21
>> [2,]   22   30   17   18   19   20   35
>> [3,]   15   23   24   32   26   27   14
>> [4,]   29   16   31   25   33   34   28
>> [5,]    8    9   10   11   12    6    7
>> In other words, I wanted to permute the content of the rows of "test", and
>> I expected to see in the output, well, the shuffled rows as rows, not as
>> columns!
>>
>> I would respectfully suggest making this behavior more explicit in the
>> documentation.
>
> It is already very explicit: "If each call to FUN returns a vector of
> length n, then apply returns an array of dimension c(n, dim(X)[MARGIN])
> if n > 1." In your first case, sample is applied to columns, and
> returns length 7 results, so the shape of the final result is c(7, 5).
> In the second case it is applied to rows, and returns length 5 results,
> so the shape is c(5, 7).
>
> Duncan Murdoch
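
The rule Duncan quotes can be checked directly by looking at the dimensions of the two results. This is only a minimal sketch using the same "test" array as above; the names by_col and by_row are introduced here purely for illustration.

```r
test <- array(1:35, dim = c(7, 5))

# sample() is called once per column and returns 7 values each time,
# so the result has dimension c(7, 5)
by_col <- apply(test, 2, sample)
dim(by_col)

# sample() is called once per row and returns 5 values each time,
# so the result has dimension c(5, 7): each shuffled row becomes a column
by_row <- apply(test, 1, sample)
dim(by_row)
```
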
And the (quite simple) practical implication of what Duncan points out is:
test <- array(1:35, dim=c(7, 5))
test
#      [,1] [,2] [,3] [,4] [,5]
# [1,]    1    8   15   22   29
# [2,]    2    9   16   23   30
# [3,]    3   10   17   24   31
# [4,]    4   11   18   25   32
# [5,]    5   12   19   26   33
# [6,]    6   13   20   27   34
# [7,]    7   14   21   28   35
# To permute the contents of each row (keeping rows as rows):
t(apply(t(test), 2, sample))
#      [,1] [,2] [,3] [,4] [,5]
# [1,]   22   29    8   15    1
# [2,]   30   16   23    2    9
# [3,]   10   31   24    3   17
# [4,]   11    4   25   32   18
# [5,]   26    5   12   33   19
# [6,]   27   34   20   13    6
# [7,]   35   28   14    7   21
which looks right!
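
For anyone who wants more than "looks right", here is a small sketch of a check that every row of the result holds the same values as the corresponding row of "test". It also tries the shorter form t(apply(test, 1, sample)), which should give the same kind of result, since applying over the columns of t(test) is the same as applying over the rows of test. The helper same_row_contents and the names perm_a and perm_b are made up for this example, and set.seed() is there only so the run can be repeated.

```r
set.seed(1)  # assumption: any seed will do, used only for repeatability

test <- array(1:35, dim = c(7, 5))

# Ted's form: transpose, shuffle the columns, transpose back
perm_a <- t(apply(t(test), 2, sample))

# Shorter form: shuffle the rows, then transpose the c(5, 7) result back
perm_b <- t(apply(test, 1, sample))

# Hypothetical helper: TRUE if each row of a holds the same values
# as the corresponding row of b, in any order
same_row_contents <- function(a, b) {
  identical(dim(a), dim(b)) &&
    all(apply(a, 1, sort) == apply(b, 1, sort))
}

same_row_contents(perm_a, test)
same_row_contents(perm_b, test)
```
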
Ted.
-------------------------------------------------
E-Mail: (Ted Harding) <Ted.Harding at wlandres.net>
Date: 14-May-2013 Time: 11:07:46
This message was sent by XFMail