I'm currently trying to run a Monte Carlo simulation. The problem is that it's taking quite a while to do 100,000 runs or more, when I'm told it shouldn't take very long. Here's my code:
import matplotlib.pyplot as plt
import random
import numpy as np
from scipy.stats import norm
from scipy.stats import uniform
import seaborn as sns
import pandas

runs = 10000
def steadystate():
    p = 0.88
    Cout = 4700000000
    LambdaAER = 0.72
    Vol = 44.5
    Depo = 0.42
    Uptime = 0.1
    Effic = 0.38
    Recirc = 4.3
    x = random.randint(86900000, 2230000000000)
    conc = ((p*Cout*LambdaAER) + (x/Vol)) / (LambdaAER + Depo + (Uptime*Effic*Recirc))
    return conc
x = 0
while x < runs:
    # results = steadystate()  (Faster)
    results = np.array([steadystate() for _ in range(1000)])
    print(results)
    x += 1
ax = sns.distplot(results,
                  bins=100,
                  kde=True,
                  color='skyblue',
                  hist_kws={"linewidth": 15, 'alpha': 1})
ax.set(xlabel='Uniform Distribution', ylabel='Frequency')
I'm fairly new to Python, so I'm unsure of where to optimize my code. Any help or suggestions would be much appreciated.
CodePudding user response:
You're not actually benefiting from numpy here, because you produce each value one at a time, doing all the math for that one value, then producing the array from the results. Work with arrays from the get-go, and do all the work on all elements in bulk to derive the benefits of vectorization:
import numpy.random

def steadystate(count):  # Receive desired number of values for bulk generation
    p = 0.88
    Cout = 4700000000
    LambdaAER = 0.72
    Vol = 44.5
    Depo = 0.42
    Uptime = 0.1
    Effic = 0.38
    Recirc = 4.3
    # Make an array of count values all at once (note: numpy.random.randint
    # excludes the high endpoint, unlike random.randint, which includes it)
    x = numpy.random.randint(86900000, 2230000000000, count)
    # Perform all the math in bulk
    conc = ((p*Cout*LambdaAER) + (x/Vol)) / (LambdaAER + Depo + (Uptime*Effic*Recirc))
    return conc
x = 0
while x < runs:
    results = steadystate(1000)  # Just call with the number of desired items
    print(results)
    x += 1
Note that this code matches your original code by replacing results each time, rather than accumulating them. I'm not clear on what you want to do instead, so this is just doing the (probably) wrong thing much faster.
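If you do want to accumulate the samples, here is a minimal sketch (assuming that's the intent): collect each batch and combine them at the end, or skip the loop entirely and generate everything in one call.

import numpy

# Sketch: keep every sample instead of overwriting the previous batch
batches = [steadystate(1000) for _ in range(runs)]
all_results = numpy.concatenate(batches)

# Or, simpler still, generate all the samples in a single vectorized call
all_results = steadystate(runs * 1000)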
CodePudding user response:
About 70% of the time is spent creating the random numbers. The question is whether you need fresh random numbers on every iteration; it may be sufficient to generate the random array just once and reuse it.
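A minimal sketch of that idea (the steadystate_from helper and the split between sampling and math are my own, not from the original code): draw the random values once, up front, and pass the same array through the deterministic math each time.

import numpy

rng = numpy.random.default_rng()
# Generate the random draws a single time, outside any loop
x = rng.integers(86900000, 2230000000000, size=1000)

def steadystate_from(x):  # hypothetical helper: math only, no sampling
    p, Cout, LambdaAER, Vol = 0.88, 4700000000, 0.72, 44.5
    Depo, Uptime, Effic, Recirc = 0.42, 0.1, 0.38, 4.3
    return ((p*Cout*LambdaAER) + (x/Vol)) / (LambdaAER + Depo + (Uptime*Effic*Recirc))

results = steadystate_from(x)  # reuses the same pre-generated draws on every call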
That said, the code is already pretty quick: excluding the drawing part, a single iteration took just 1.2 ms.
%timeit results = np.array([steadystate() for _ in range(1000)])
1.24 ms ± 3.35 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)
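For comparison, the vectorized version from the first answer can be timed the same way (a sketch; the actual numbers will depend on your machine):

%timeit results = steadystate(1000)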