I want to generate data using random numbers and then draw random samples with replacement from the generated data. The problem is that np.random.seed(10) only fixes the random numbers used for the generated data; it does not fix the random samples drawn inside the loop. Every time I run the code I get the same generated data but different random samples, and I would like to get the same random samples so that the results are reproducible. The code is the following:
import numpy as np
import random
np.random.seed(10)
data = list(np.random.binomial(size = 215 , n=1, p= 0.3))
sample_mean = []
for i in range(1000):
    sample = random.choices(data, k=215)
    mean = np.mean(sample)
    sample_mean.append(mean)
print(np.mean(sample_mean))
np.mean(sample_mean) should return the same value every time the code is run, but it does not.
I tried calling random.seed(i) inside the loop, but it didn't work.
CodePudding user response:
Your random.choices(data, k=215) comes from Python's built-in random library, which keeps its own seed, separate from the one inside numpy.random, so seeding NumPy isn't enough.
The correct solution here is to use NumPy's np.random.choice, as you are already using NumPy.
import numpy as np
np.random.seed(10)
data = np.random.binomial(size=215, n=1, p=0.3)
sample_mean = []
for i in range(1000):
    sample = np.random.choice(data, size=215)
    mean = np.mean(sample)
    sample_mean.append(mean)
print(np.mean(sample_mean))
PS: calling list on data is not necessary, and will slow your code down.
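As a variation on the same idea, here is a minimal sketch that keeps everything in NumPy but uses a local generator from np.random.default_rng instead of the global np.random.seed. This is a swapped-in technique, not what the answer above uses, and the function name bootstrap_mean and its parameters are made up for illustration. Because default_rng uses a different generator than np.random.seed, the numbers will not match the output of the code above, but they should be identical from run to run.

```python
import numpy as np

def bootstrap_mean(seed=10, n_obs=215, n_resamples=1000, p=0.3):
    """Mean of bootstrap sample means; reproducible for a given seed.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    # Local generator: seeding it does not touch NumPy's global state.
    rng = np.random.default_rng(seed)
    data = rng.binomial(n=1, p=p, size=n_obs)
    # Draw every resample at once: one row per resample, with replacement.
    samples = rng.choice(data, size=(n_resamples, n_obs), replace=True)
    # Mean of each resample, then the mean of those means.
    return samples.mean(axis=1).mean()

first = bootstrap_mean()
second = bootstrap_mean()
print(first, second)  # both calls use the same seed, so the values match
```

Drawing all resamples in one call also avoids the Python-level loop, which is usually faster for arrays of this size.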
CodePudding user response:
Fixing the seed for np.random doesn't fix the seed for random...
So adding a single line to fix both seeds will give you reproducible results:
import numpy as np
import random
np.random.seed(10)
random.seed(10)
data = list(np.random.binomial(size=215, n=1, p=0.3))
sample_mean = []
for i in range(1000):
    sample = random.choices(data, k=215)
    mean = np.mean(sample)
    sample_mean.append(mean)
print(np.mean(sample_mean))
Or, alternatively, you can use np.random.choice instead of random.choices.
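To check the point about the two seeds being separate, a small sketch like the following may help. It compares the state of the standard-library generator before and after seeding NumPy, then confirms that reseeding random itself does make random.choices repeat. The helper name stdlib_draw and the variable names are made up for illustration.

```python
import random
import numpy as np

# Snapshot the standard-library generator, then seed NumPy only.
state_before = random.getstate()
np.random.seed(10)
# If the two libraries shared a seed, this comparison would be True.
numpy_seed_changed_stdlib = random.getstate() != state_before

def stdlib_draw(seed):
    """Hypothetical helper: reseed `random`, then sample with replacement."""
    random.seed(seed)
    return random.choices(range(100), k=5)

# Reseeding `random` with the same value reproduces the same sample.
same_after_reseeding = stdlib_draw(10) == stdlib_draw(10)

print(numpy_seed_changed_stdlib, same_after_reseeding)
```

The first value shows whether np.random.seed altered the state of random, and the second shows whether seeding random directly makes the draws repeat.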