[R] permutations of a binary matrix with fixed margins
Charles C. Berry
cberry at tajo.ucsd.edu
Tue Oct 2 19:05:42 CEST 2007
On Tue, 2 Oct 2007, Paul Johnson wrote:
> Jérôme,
>
> As a first attempt, how about the function below. It works (or not) by
> randomly sorting the rows and columns, then searching the table for
> "squares" with the corners = matrix(c(1,0,0,1),ncol=2) and subtracting them
> from 1 to give matrix(c(0,1,1,0),ncol=2) (and vice versa). Randomized
> matrices can be produced as a chain where each permutation is seeded with
> the previous one.
>
> Potential problems:
> 1. Does it really explore all possible permutations?
Yes. If you randomize. Not sure about 'only if you randomize'.
> Does it do it in an unbiased way?
No. If you mean are all choices equiprobable. So, you either need to add
importance weights or do a rejection step.
Zaman and Simberloff (2002, Environmental and Ecological Statistics 9,
405--421) go through this and propose methods for the problem.
They give an example of a binary matrix with margins (2, 1, 1) and (1, 2,
1). There are five matrices that satisfy the constraints that the
margins are as specified and the cells are either zero or one. Study of
that simple setup is revealing of the problems encountered in developing
sampling schemes for the general problem.
I recall they have a C program for the sampler they propose. Probably the
path of least resistance is to get hold of that.
> 2. Related to above: there is potential autocorrelation (although I haven't
> found it with my data set), so might need some dememorisation steps.
> 3. It's slow and dememorisation makes it slower.
> 4. It isn't clear to me whether it needs the added stochastic element, i.e.
> squares are only flipped if "runif(1)<0.5". In practice it seems to work
> without it (as far as I can tell, i.e. it isn't autocorrelated using my data
> set).
>
> I suspect there's a much better way of doing this, which might take the
> margins as an input, and therefore wouldn't be directly derived from any
> particular matrix structure.
>
I thought that, too. You can see my _retraction_ here:
http://finzi.psych.upenn.edu/R/Rhelp02a/archive/98897.html
The difficulty is that under the constraints and the relevant null
hypothesis all matrices are equiprobable. It is easy to generate random
matrices satisfying all constraints, but not the null.
Brute force combinatorics is not attractive either. The number of binary
matrices can be very large for problems of practical size. As an example,
if you have a 20 by 20 binary table with two margins of rep(10,20), there
are many more than 2^100 possible tables. There is a straightforward way
of enumerating them, but it is not computationally feasible.
HTH,
Chuck
> Paul
>
> ###############################################################
>
> # function to permute binary matrix keeping margins fixed.
> # the input "x" is the binary matrix to be permuted
>
> permute.struct<-
> function(x)
> {
> x<-x[rownames(x)[sample(1:nrow(x))],colnames(x)[sample(1:ncol(x))]]
> pattern<-c(1,0,0,1)
> for(i in 1:(nrow(x)-1))
> for(j in 1:(ncol(x)-1))
> for(ii in (i+1):nrow(x))
> for(jj in (j+1):ncol(x))
> {
> query<-c(x[c(i,ii),c(j,jj)])
> if((all(query-pattern==0) || all(query==1-pattern)) &&
> runif(1)<0.5)
> x[c(i,ii),c(j,jj)] <- 1 - x[c(i,ii),c(j,jj)]
> }
> x
> }
>
> ###############################################################
>
> --------------------------------------------------------
> From: Mathieu Jérôme <jerome.mathieu_at_snv.jussieu.fr>
> Date: Thu 05 Apr 2007 - 13:34:01 GMT
>
>
> Dear R Users,
> How can I randomize a matrix (with contains only 0 and 1) with the
> constraint that margin totals (columns and rows sums) are kept constant?
> Some have called this "swap permutation" or something like that.
>
> The principle is to find bloc of
> 10
> 01
> and change it for
> 01
> 10
> there can be several rows or columns between these numbers. It probably
> exists in R, but I can't find the function. I am aware of permcont, but it
> works only for 2x2 matrices
>
> thanks in advance
> Jerome
>
> --
> Jérôme Mathieu
> Laboratory of soil tropical ecology
> UPMC
> jerome.mathieu at snv.jussieu.fr
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
Charles C. Berry (858) 534-2098
Dept of Family/Preventive Medicine
E mailto:cberry at tajo.ucsd.edu UC San Diego
http://famprevmed.ucsd.edu/faculty/cberry/ La Jolla, San Diego 92093-0901
More information about the R-help
mailing list