Finding a pattern in a binary matrix with R-CodePudding

I have a nxn symetrical binary matrix and I want to find the largest rectangle (area) with 0 at the top-left and bottom-right corners and 1 at the top-right and bottom-left corner. If I just do it with loops, checking all the rectangles from the biggest to the smallest it takes "days" for n=100. Does anyone have an idea to do it efficiently?

Thanks a lot !

CodePudding user response：

thanks for your answers. Matrices I use are adjacency matrices of random Erdos-Renyi graphs. But one can take any random symetrical binary matrix to test it. Until now, I use 4 nested loops :

switch<-function(Mat)
{
n=nrow(Mat) 
for (i in 1:(n-1)) { 
    for(j in seq(n,i 1,by=-1)) {
        for(k in 1:(n-1)) { 
            if ((k==i)||(k==j) || (Mat[i,k]==1)||(Mat[j,k]==0)) next 
            for(l in seq(n,k 1,by=-1)) { 
                if ((l==i)||(l==j)|| (Mat[i,l]==0)||(Mat[j,l]==1)) next 
                return(i,j,k,l)
            }
        }
    }
}

CodePudding user response：

Here's an approach that you can try for now. It doesn't require symmetry, and it treats all nonzero elements like ones for efficiency.

It loops over the ones, assuming that there are fewer ones than zeros. It can easily be modified to handle the reverse case in which there are actually more ones than zeros.

It isn't optimal, since it loops over all of the ones even if the largest box is identified early. But it is still fast for n = 100, requiring less than half of a second on my machine even when ones and zeros occur in roughly equal proportion:

f <- function(X) {
    if (!is.logical(X)) {
        storage.mode(X) <- "logical"
    }
    J <- which(X, arr.ind = TRUE, useNames = FALSE)
    ii <- J[, 1L]
    jj <- J[, 2L]
    nmax <- 0L
    res <- NULL
    for (k in seq_along(ii)) {
        i0 <- ii[k]
        j0 <- jj[k]
        ok <- ii < i0 & jj > j0
        if (any(ok)) {
            i1 <- ii[ok]
            j1 <- jj[ok]
            ok <- !(X[i0, j1] | X[i1, j0])
            if (any(ok)) {
                i1 <- i1[ok]
                j1 <- j1[ok]
                n <- (i0 - i1   1L) * (j1 - j0   1L)
                w <- which.max(n)
                if (n[w] > nmax) {
                    nmax <- n[w]
                    res <- c(i0 = i0, j0 = j0, i1 = i1[w], j1 = j1[w])
                }
            }
        }
    }
    res
}

mkX <- function(n) {
    X <- matrix(sample(0:1, n * n, TRUE), n, n)
    X[upper.tri(X)] <- t(X)[upper.tri(X)]
    X
}

set.seed(1L)
X <- mkX(6L)
X
##      [,1] [,2] [,3] [,4] [,5] [,6]
## [1,]    0    1    0    0    1    0
## [2,]    1    0    1    1    0    0
## [3,]    0    1    0    1    1    1
## [4,]    0    1    1    0    0    0
## [5,]    1    0    1    0    0    1
## [6,]    0    0    1    0    1    0

f(X)
## i0 j0 i1 j1 
##  5  1  1  5

Y <- mkX(100L)
microbenchmark::microbenchmark(f(Y))
## Unit: milliseconds
##  expr     min       lq     mean   median       uq      max neval
##  f(Y) 310.139 318.3363 327.8116 321.4109 326.5088 391.9081   100