Home > database >  Random position from a boolean array
Random position from a boolean array

Time:07-05

I have a 2D boolean array, and I want to generate one random location where it's 1.

array([[0, 0, 0, 1, 1, 1, 0, 0],
       [0, 0, 0, 1, 1, 1, 1, 1],
       [0, 0, 1, 1, 1, 1, 1, 1],
       [0, 1, 1, 1, 0, 0, 0, 1],
       [0, 1, 1, 1, 0, 0, 1, 1]])

Example output:

(2,2)

CodePudding user response:

Something like this may work:

import numpy as np
l = np.array([[0, 0, 0, 1, 1, 1, 0, 0],
       [0, 0, 0, 1, 1, 1, 1, 1],
       [0, 0, 1, 1, 1, 1, 1, 1],
       [0, 1, 1, 1, 0, 0, 0, 1],
       [0, 1, 1, 1, 0, 0, 1, 1]])
random = np.random.choice(np.sum(l))
np.argwhere(l)[random]

result:

[0, 4]

CodePudding user response:

You can use a while loop until the random.choice() function generates an index that references to a 1 value inside 2D list:

import random
import numpy as np

l = np.array([[0, 0, 0, 1, 1, 1, 0, 0],
       [0, 0, 0, 1, 1, 1, 1, 1],
       [0, 0, 1, 1, 1, 1, 1, 1],
       [0, 1, 1, 1, 0, 0, 0, 1],
       [0, 1, 1, 1, 0, 0, 1, 1]])

i, j = random.choice(range(len(l))), random.choice(range(len(l[0])))
while l[i][j] != 1:
  i, j = random.choice(range(len(l))), random.choice(range(len(l[0])))

print((i, j), l[i][j])

(4, 6) 1

CodePudding user response:

OK first of all let's make the boolean array a boolean array:

bool_arr = np.array([[0, 0, 0, 1, 1, 1, 0, 0],
       [0, 0, 0, 1, 1, 1, 1, 1],
       [0, 0, 1, 1, 1, 1, 1, 1],
       [0, 1, 1, 1, 0, 0, 0, 1],
       [0, 1, 1, 1, 0, 0, 1, 1]])

bool_arr = (bool_arr==1)

Next I'll proceed by creating an indices matrix of the same shape of the boolean array:

indices = np.moveaxis(np.c_[np.meshgrid(np.arange(bool_arr.shape[1]), np.arange(bool_arr.shape[0]))][::-1], 0,-1)

this matrix has the property that:

indices[x, y] = [x, y] #for each valid x, y

in this way I can mask the indices matrix with the boolean array retrieving all the True positions:

indices[bool_arr]

out:

array([[0, 3],
       [0, 4],
       [0, 5],
       [1, 3],
       [1, 4],
       [1, 5],
       [1, 6],
       [1, 7],
       [2, 2],
       [2, 3],
       [2, 4],
       [2, 5],
       [2, 6],
       [2, 7],
       [3, 1],
       [3, 2],
       [3, 3],
       [3, 7],
       [4, 1],
       [4, 2],
       [4, 3],
       [4, 6],
       [4, 7]])

Now I can finally sample from this list of indices:

indices[bool_arr][np.random.randint(bool_arr.sum())]

CodePudding user response:

Here are some statistics I got for 3 methods x dense/sparse arrays. For stability, I would stick with cumSum method.

def rejection_randPos(l):
    r, c = l.shape
    i, j = random.randint(0, r - 1), random.randint(0, c - 1)
    while not l[i, j]:
        i, j = random.randint(0, r - 1), random.randint(0, c - 1)
    return i, j

def cumSum_randPos(l):
    c = l.shape[1]
    target = random.randint(1, l.sum())
    idx = np.argmax(l.cumsum() >= target)
    return idx // c, idx % c

def argWhere_randPos(l):
    target = random.randint(0, l.sum() - 1)
    return np.argwhere(l)[target]


A = np.random.rand(5, 8) < 0.8
%timeit rejection_randPos(A)
%timeit cumSum_randPos(A)
%timeit argWhere_randPos(A)
B = np.random.rand(5, 8) < 0.05
%timeit rejection_randPos(B)
%timeit cumSum_randPos(B)
%timeit argWhere_randPos(B)

=>

1.92 µs ± 28.9 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
9.81 µs ± 182 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
12.4 µs ± 148 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
28 µs ± 311 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
10.1 µs ± 753 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
12.3 µs ± 264 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
  • Related