I'm trying to simulate the results of two different dice. One die is fair (i.e. the probability of each number is 1/6), but the other isn't.
I have a numpy array with 0's and 1's saying which die is used every time, 0 being the fair one and 1 the other. I'd like to compute another numpy array with the results. In order to do this task, I have used the following code:
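def dice_simulator(dices: np.ndarray) -> np.ndarray:
    # rnd is assumed to be numpy.random (e.g. import numpy.random as rnd)
    n = len(dices)
    results = np.zeros(n)
    i = 0
    for dice in np.nditer(dices):
        if dice:
            # unfair die: faces 4-6 are three times as likely as faces 1-3
            results[i] = rnd.choice(6, p=[1/12, 1/12, 1/12, 1/4, 1/4, 1/4]) + 1
        else:
            results[i] = rnd.choice(6) + 1
        i += 1
    return results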
This takes a lot of time compared to the rest of the program, and I think it is because I'm iterating over a numpy array instead of using vectorized operations. Can anyone help me with that?
CodePudding user response:
The answers already given vectorize by over-generating samples and then throwing some of them away, which seems wrong to me.
Moreover, I will generalize to any number of dice.
First, you need to be able to build a condlist: a list whose length is the number of dice, where the i-th element is a boolean array that is True wherever the i-th die should be used:
dices_idxs = np.array([0, 1, 2])
dices_sequence = np.array([0, 1, 2, 2, 1, 1, 0])
condlist = np.equal(*np.broadcast_arrays(dices_sequence[None, :], dices_idxs[:, None]))
print(condlist)
# [[ True False False False False False True]
# [False True False False True True False]
# [False False True True False False False]]
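The same boolean table can also be built with a plain broadcasted comparison, which may be easier to read than np.equal combined with np.broadcast_arrays. A minimal sketch reusing the arrays above (the name condlist_alt is only for illustration and does not appear in the answer):

```python
import numpy as np

dices_idxs = np.array([0, 1, 2])
dices_sequence = np.array([0, 1, 2, 2, 1, 1, 0])

# Construction used in the answer
condlist = np.equal(*np.broadcast_arrays(dices_sequence[None, :], dices_idxs[:, None]))

# Comparing a (1, n) row with a (k, 1) column broadcasts to a (k, n) boolean table
condlist_alt = dices_sequence[None, :] == dices_idxs[:, None]

# True when both constructions give the same table
same = np.array_equal(condlist, condlist_alt)
```

Each column of the table contains exactly one True, because every throw uses exactly one die.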
Second, you can generalize the answer given by @Ahmed AEK using np.select:
def dice_simulator_select(dices_sequence, dices_weights):
    faces = np.arange(1, 7)
    num_dices = len(dices_weights)
    dices_idxs = np.arange(num_dices)
    num_throws = len(dices_sequence)
    condlist = list(
        np.equal(*np.broadcast_arrays(dices_sequence[None, :], dices_idxs[:, None]))
    )
    # one full-length sample per die; np.select keeps only the entries it needs
    choicelist = [
        RNG.choice(faces, size=num_throws, p=dices_weights[dice_idx])
        for dice_idx in range(num_dices)
    ]
    return np.select(condlist, choicelist)
However, it has the problem mentioned at the start: it over-generates samples and then discards some of them, which can be problematic where randomness is concerned.
A more correct way is to use np.piecewise:
def dice_simulator_piecewise(dices_sequence, dices_weights):
    faces = np.arange(1, 7)
    num_dices = len(dices_weights)
    dices_idxs = np.arange(num_dices)
    condlist = list(
        np.equal(*np.broadcast_arrays(dices_sequence[None, :], dices_idxs[:, None]))
    )
    # note: size=len(x) ensures that no more samples than needed are generated
    # x holds the die indices selected by one condition, so x[0] identifies the die
    funclist = [
        lambda x: RNG.choice(faces, size=len(x), p=dices_weights[int(x[0])])
    ] * num_dices
    return np.piecewise(dices_sequence, condlist, funclist)
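The idea of drawing only as many samples as each die needs can also be written with boolean-mask assignment. This is a sketch of an alternative, not the method from the answer: the function name dice_simulator_masked and the optional rng argument are hypothetical, and it assumes every index in dices_sequence has an entry in dices_weights.

```python
import numpy as np

def dice_simulator_masked(dices_sequence, dices_weights, rng=None):
    """Draw one face per throw, sampling each die only as often as it is used."""
    # rng is a hypothetical optional argument, added here for reproducibility
    rng = np.random.default_rng() if rng is None else rng
    dices_sequence = np.asarray(dices_sequence)
    faces = np.arange(1, 7)
    # 0 is not a valid face, so it marks throws whose die index had no weights
    results = np.zeros(dices_sequence.shape, dtype=np.int64)
    for dice_idx, weights in enumerate(dices_weights):
        mask = dices_sequence == dice_idx
        # size=mask.sum() draws exactly as many samples as this die needs
        results[mask] = rng.choice(faces, size=int(mask.sum()), p=weights)
    return results

# Die 0 is fair (p=None); die 1 always shows 6 (a deliberately extreme weight vector)
example_sequence = np.array([0, 1, 1, 0, 1])
example_results = dice_simulator_masked(example_sequence, [None, [0, 0, 0, 0, 0, 1]])
```

The Python loop runs once per die rather than once per throw, so the per-throw work stays vectorized.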
You can use the functions as follows, and see that the more correct version using np.piecewise is even faster (about 20% faster in the case below):
RNG = np.random.default_rng()
dices_weights = [
    None,  # uniform
    [1 / 12, 1 / 12, 1 / 12, 1 / 4, 1 / 4, 1 / 4],
    None,
    [1 / 4, 1 / 4, 1 / 4, 1 / 12, 1 / 12, 1 / 12],
    None,
    [1 / 12, 1 / 12, 1 / 12, 1 / 4, 1 / 4, 1 / 4],
]
num_dices = len(dices_weights)
num_throws = 1_000
dices_sequence = RNG.choice(np.arange(num_dices), size=num_throws)
%timeit dice_simulator_select(dices_sequence, dices_weights)
%timeit dice_simulator_piecewise(dices_sequence, dices_weights)
# 311 µs ± 5.94 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
# 240 µs ± 10.3 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
CodePudding user response:
This is the correct way to do it:
def dice_simulator(dices: np.ndarray) -> np.ndarray:
    # draw a full array for each die, then pick per element with np.where
    return np.where(
        dices,
        rnd.choice(6, dices.shape, p=[1/12, 1/12, 1/12, 1/4, 1/4, 1/4]),
        rnd.choice(6, dices.shape)
    ) + 1
CodePudding user response:
Try this:
def dice_simulator(dices):
    p = [1 / 12, 1 / 12, 1 / 12, 1 / 4, 1 / 4, 1 / 4]
    size = dices.shape
    fair_die = np.random.choice(6, size=size)
    unfair_die = np.random.choice(6, p=p, size=size)
    # the boolean masks select the fair or unfair draw for each throw
    return (dices == 0) * fair_die + (dices == 1) * unfair_die + 1
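Whichever vectorized version you pick, it is worth checking that the face frequencies match the intended probabilities. The sketch below does this for the two-dice case from the question. It is an illustration under assumptions: it uses np.where and a seeded Generator (the seed and the sample size are arbitrary choices), rather than the exact code above.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only to make the check repeatable
p_unfair = np.array([1 / 12, 1 / 12, 1 / 12, 1 / 4, 1 / 4, 1 / 4])

def dice_simulator(dices):
    size = dices.shape
    fair_die = rng.choice(6, size=size)
    unfair_die = rng.choice(6, p=p_unfair, size=size)
    # pick the unfair draw where dices == 1, the fair draw elsewhere
    return np.where(dices == 1, unfair_die, fair_die) + 1

dices = rng.integers(0, 2, size=200_000)  # random 0/1 sequence of dice
results = dice_simulator(dices)

# Relative frequency of faces 1..6 among the throws made with each die
fair_freq = np.bincount(results[dices == 0], minlength=7)[1:] / np.sum(dices == 0)
unfair_freq = np.bincount(results[dices == 1], minlength=7)[1:] / np.sum(dices == 1)
```

With this many throws, fair_freq should be close to 1/6 for every face and unfair_freq close to p_unfair.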