How to reduce the size of a binary matrix and preserve all the ones in Python?

I have a 648 * 2340 matrix that contains ones and zeros, but mostly zeros. I would like to reduce it to 216 * 780, which is 9 times smaller in terms of the number of elements. To do that, I need to divide the big matrix into 3 * 3 blocks, each of which collapses into a single element. The value of that element should be 1 if the 3 * 3 block contains at least one 1, and 0 otherwise. What approaches are there for this? Thanks.

CodePudding user response：

Use a sparse matrix representation. This is a representation of a matrix in which only the entries containing non-zero values (1 in your case) are stored.

sparseMatrix = [[0,0,1,0,1],[0,0,1,1,0],[0,0,0,0,0],[0,1,1,0,0]]
 
# initialize size as 0
size = 0
 
for i in range(4):
    for j in range(5):
        if (sparseMatrix[i][j] != 0):
            size += 1
 
# number of columns in compactMatrix(size) should
# be equal to number of non-zero elements in sparseMatrix
rows, cols = (3, size)
compactMatrix = [[0 for i in range(cols)] for j in range(rows)]
 
k = 0
for i in range(4):
    for j in range(5):
        if (sparseMatrix[i][j] != 0):
            compactMatrix[0][k] = i
            compactMatrix[1][k] = j
            compactMatrix[2][k] = sparseMatrix[i][j]
            k += 1
 
for i in compactMatrix:
    print(i)

For more information, see https://en.wikipedia.org/wiki/Sparse_matrix
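
The same coordinate layout (row indices, column indices, values) could probably be built without explicit loops if NumPy is available. The following is only a sketch under that assumption; the names `sparse_matrix` and `compact_matrix` are illustrative and mirror the variables above.

```python
import numpy as np

# Same example matrix as above, stored as a NumPy array.
sparse_matrix = np.array([[0, 0, 1, 0, 1],
                          [0, 0, 1, 1, 0],
                          [0, 0, 0, 0, 0],
                          [0, 1, 1, 0, 0]])

# np.nonzero returns the row and column indices of the non-zero
# entries, in row-major order.
rows, cols = np.nonzero(sparse_matrix)
values = sparse_matrix[rows, cols]

# Stack into a 3 x k array: row indices, column indices, values.
compact_matrix = np.vstack((rows, cols, values))

for line in compact_matrix:
    print(line)
```

Note that this stores the positions of the ones; it does not by itself produce the smaller 216 * 780 matrix.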

CodePudding user response：

This can be done as follows:

import numpy as np
np.random.seed(42)
n, m, a, b = 12, 12, 3, 3  # n, m are the dims of the input matrix; a, b are the dims of each block
# a1: random binary input (about 10% ones); a2: reduced output of shape (n/a, m/b)
a1 = np.random.choice((0, 1), size=(n, m), replace=True, p=(.9, .1))
a2 = np.zeros((n // a, m // b), dtype=int)
for i, x in enumerate(np.linspace(0, n, int(n/a + 1), endpoint=True, dtype=int, axis=0)[:-1]):
    for j, y in enumerate(np.linspace(0, m, int(m/b + 1), endpoint=True, dtype=int, axis=0)[:-1]):
        # set the output element to 1 if the a x b block contains at least one 1
        if a1[x:x + a, y:y + b].sum() > 0:
            a2[i, j] = 1

Generated matrix a1:

array([[0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 0],
       [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
       [0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0],
       [0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
       [0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 0, 0]])

Output matrix a2:

array([[1, 0, 0, 1],
       [1, 1, 1, 1],
       [0, 0, 1, 0],
       [1, 1, 1, 0]])
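
For the full 648 * 2340 case, the double loop could likely be replaced by a reshape, assuming both dimensions are exact multiples of the block size (true for 648 and 2340 with 3 * 3 blocks). This is a sketch rather than a definitive implementation; the helper name `block_any` and the test matrix `big` are made up for illustration.

```python
import numpy as np

def block_any(matrix, block_rows, block_cols):
    """Collapse each block_rows x block_cols tile to 1 if it holds any non-zero value, else 0."""
    matrix = np.asarray(matrix)
    n_rows, n_cols = matrix.shape
    if n_rows % block_rows or n_cols % block_cols:
        raise ValueError("matrix dimensions must be divisible by the block size")
    # Split each axis into (number of tiles, size within tile).
    tiles = matrix.reshape(n_rows // block_rows, block_rows,
                           n_cols // block_cols, block_cols)
    # Reduce over the two "within tile" axes.
    return tiles.any(axis=(1, 3)).astype(int)

# Hypothetical input with the dimensions from the question.
big = np.zeros((648, 2340), dtype=int)
big[5, 7] = 1
big[647, 2339] = 1

small = block_any(big, 3, 3)
print(small.shape)
```

Here the one at `big[5, 7]` lands in `small[1, 2]`, because 5 // 3 = 1 and 7 // 3 = 2. If the dimensions were not divisible by the block size, the input would need padding with zeros first.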