Is there a way to vectorize this loop?

Time:09-21

I'm trying to simulate the results of two different dice. One die is fair (i.e. the probability of each number is 1/6), but the other isn't.

I have a numpy array with 0's and 1's saying which die is used every time, 0 being the fair one and 1 the other. I'd like to compute another numpy array with the results. In order to do this task, I have used the following code:

def dice_simulator(dices: np.ndarray) -> np.ndarray:
  n = len(dices)
  results = np.zeros(n)
  i = 0
  for dice in np.nditer(dices):
    if dice:
      results[i] = rnd.choice(6, p = [1/12, 1/12, 1/12, 1/4, 1/4, 1/4]) + 1
    else:
      results[i] = rnd.choice(6) + 1
    i += 1
  return results

This takes a lot of time compared to the rest of the program, and I think it is because I'm iterating over a numpy array instead of using vectorized operations. Can anyone help me with that?

CodePudding user response:

The answers already given vectorize by over-generating samples and throwing some of them away, which seems wrong.

Moreover, I will generalize to any number of dice.

First, you need a condlist: a list with one entry per die, where the i-th entry is a boolean array that is True wherever the i-th die should be used:

dices_idxs = np.array([0, 1, 2])
dices_sequence = np.array([0, 1, 2, 2, 1, 1, 0])

condlist = np.equal(*np.broadcast_arrays(dices_sequence[None, :], dices_idxs[:, None]))

print(condlist)

# [[ True False False False False False  True]
#  [False  True False False  True  True False]
#  [False False  True  True False False False]]
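As a side note, the comparison operator broadcasts on its own, so the same mask can probably be built without np.broadcast_arrays. A minimal sketch reusing the arrays above (the name condlist_direct is only for illustration):

```python
import numpy as np

dices_idxs = np.array([0, 1, 2])
dices_sequence = np.array([0, 1, 2, 2, 1, 1, 0])

# Shape (3, 1) compared with shape (1, 7) broadcasts to shape (3, 7):
# row i is True wherever die i is used.
condlist_direct = dices_idxs[:, None] == dices_sequence[None, :]

# The explicit form used in this answer gives the same array.
condlist = np.equal(*np.broadcast_arrays(dices_sequence[None, :], dices_idxs[:, None]))
assert np.array_equal(condlist, condlist_direct)
```

Both forms produce one boolean row per die, so either can be passed to np.select or np.piecewise after wrapping it in list().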

Second, you can generalize the answer given by @Ahmed AEK using np.select:

def dice_simulator_select(dices_sequence, dices_weights):
    faces = np.arange(1, 7)
    num_dices = len(dices_weights)
    dices_idxs = np.arange(num_dices)
    num_throws = len(dices_sequence)

    condlist = list(
        np.equal(*np.broadcast_arrays(dices_sequence[None, :], dices_idxs[:, None]))
    )
    choicelist = [
        RNG.choice(faces, size=num_throws, p=dices_weights[dice_idx])
        for dice_idx in range(num_dices)
    ]
    return np.select(condlist, choicelist)

But this has the issue mentioned at the start: it generates a full-length array of throws for every die and then discards most of the values, which can be problematic where randomness is concerned.

A more correct way is to use np.piecewise:

def dice_simulator_piecewise(dices_sequence, dices_weights):
    faces = np.arange(1, 7)
    num_dices = len(dices_weights)
    dices_idxs = np.arange(num_dices)

    condlist = list(
        np.equal(*np.broadcast_arrays(dices_sequence[None, :], dices_idxs[:, None]))
    )
    # Note: size=len(x) ensures that no more samples than needed are generated.
    # x holds the entries of dices_sequence selected by one condition, so they
    # all share the same die index and x[0] identifies the die.
    funclist = [
        lambda x: RNG.choice(faces, size=len(x), p=dices_weights[int(x[0])])
    ] * num_dices

    return np.piecewise(dices_sequence, condlist, funclist)
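To make it clearer what np.piecewise does here, the same idea can be written as an explicit loop over the dice with boolean-mask assignment. This is only a sketch, not part of the original answer: the function name dice_simulator_masked, the local rng and its seed are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # seed chosen only to make the sketch repeatable


def dice_simulator_masked(dices_sequence, dices_weights):
    """Hypothetical helper: draw exactly one sample per throw using masks."""
    faces = np.arange(1, 7)
    results = np.zeros(len(dices_sequence), dtype=int)
    for dice_idx, weights in enumerate(dices_weights):
        mask = dices_sequence == dice_idx   # throws that use this die
        count = int(mask.sum())
        if count:                           # skip dice that never appear
            results[mask] = rng.choice(faces, size=count, p=weights)
    return results


weights = [None, [1 / 12, 1 / 12, 1 / 12, 1 / 4, 1 / 4, 1 / 4]]  # None means uniform
sequence = np.array([0, 1, 1, 0, 1])
out = dice_simulator_masked(sequence, weights)
```

The loop runs once per die, not once per throw, so it stays cheap as long as the number of dice is small compared to the number of throws.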

You can use the functions as follows. Note that the np.piecewise version, besides being the correct one, is also faster (roughly 20% faster in the case below):

RNG = np.random.default_rng()

dices_weights = [
    None,  # uniform
    [1 / 12, 1 / 12, 1 / 12, 1 / 4, 1 / 4, 1 / 4],
    None,
    [1 / 4, 1 / 4, 1 / 4, 1 / 12, 1 / 12, 1 / 12],
    None,
    [1 / 12, 1 / 12, 1 / 12, 1 / 4, 1 / 4, 1 / 4],
]
num_dices = len(dices_weights)
num_throws = 1_000
dices_sequence = RNG.choice(np.arange(num_dices), size=num_throws)


%timeit dice_simulator_select(dices_sequence, dices_weights)
%timeit dice_simulator_piecewise(dices_sequence, dices_weights)

# 311 µs ± 5.94 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
# 240 µs ± 10.3 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

CodePudding user response:

This is the correct way to do it:

def dice_simulator(dices: np.ndarray) -> np.ndarray:
    return np.where(
        dices,
        rnd.choice(6, dices.shape, p = [1/12, 1/12, 1/12, 1/4, 1/4, 1/4]),
        rnd.choice(6, dices.shape)
    ) + 1

CodePudding user response:

Try this:

def dice_simulator(dices):
    p = [1 / 12, 1 / 12, 1 / 12, 1 / 4, 1 / 4, 1 / 4]
    size = dices.shape
    fair_die = np.random.choice(6, size=size)
    unfair_die = np.random.choice(6, p=p, size=size)
    return (dices == 0) * fair_die + (dices == 1) * unfair_die + 1
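
To see how the mask arithmetic picks values, here is a small worked example with fixed stand-in arrays in place of the random draws (the values are made up for illustration). It also shows that the np.where form from the previous answer selects the same elements:

```python
import numpy as np

dices = np.array([0, 1, 0, 1])
fair_die = np.array([0, 1, 2, 3])    # stand-in for np.random.choice(6, size=...)
unfair_die = np.array([5, 4, 3, 5])  # stand-in for the weighted draws

# A boolean mask multiplied by an array keeps a value where the mask is True
# and contributes 0 elsewhere, so the sum takes each element from one array.
by_mask = (dices == 0) * fair_die + (dices == 1) * unfair_die + 1
by_where = np.where(dices, unfair_die, fair_die) + 1

print(by_mask)   # [1 5 3 6]
print(by_where)  # [1 5 3 6]
```

In both forms a full array is drawn for each die and only one value per position is kept.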