I am using the following code unchanged in form but changed in content:
import numpy as np
import matplotlib.pyplot as plt
import random
from random import seed
from random import randint
import math
from math import *
from random import *
import statistics
from statistics import *
n=1000
T_plot=[0];
X_relm=[0];
class Objs:
    def __init__(self, xIn, yIn, color):
        self.xIn = xIn
        self.yIn = yIn
        self.color = color
    def yfT(self, t):
        return self.yIn*t+self.yIn*t
    def xfT(self, t):
        return self.xIn*t-self.yIn*t
xi=np.random.uniform(0,1,n);
yi=np.random.uniform(0,1,n);
O1 = [Objs(xIn = i, yIn = j, color = choice(["Black", "White"])) for i,j in zip(xi,yi)]
X=sorted(O1,key=lambda x:x.xIn)
dt=1/(2*n)
T=20
iter=40000
Black=[]
White=[]
Xrelm=[]
for i in range(1,iter+1):
    t=i*dt
    for j in range(n-1):
        check=X[j].xfT(t)-X[j+1].xfT(t);
        if check<0:
            X[j],X[j+1]=X[j+1],X[j]
        if check<-10:
            X[j].color,X[j+1].color=X[j+1].color,X[j].color
        if X[j].color=="Black":
            Black.append(X[j].xfT(t))
        else:
            White.append(X[j].xfT(t))
    Xrel=mean(Black)-mean(White)
    Xrelm.append(Xrel)
plot1=plt.figure(1);
plt.plot(T_plot,Xrelm);
plt.xlabel("time")
plt.ylabel("Relative ")
and it keeps running (I left it for 10 hours) without giving any output for some parameters, presumably because the problem is too big. I know that my code is not totally faulty (in the sense that it should give something, even if wrong) because it does give output for fewer time steps and other parameters.
So I am focusing on optimizing my code so that it takes less time to run. This is a routine task for coders, but I am a newbie and I am coding simply because the simulation will help in my field. Any general insights on how to make one's code faster are appreciated.
Besides that, I want to ask whether defining a function beforehand for the inner loop will save any time.
I do not think it should, since I would be doing the same work, but I am not sure. If it doesn't, any insights on how to deal with nested loops more efficiently, along with general advice, are appreciated.
(I have tried to shorten the code as far as I could without leaving out relevant information.)
CodePudding user response:
There are several issues in your code:
- The mean is recomputed from scratch on the growing lists at every iteration. Thus, the total cost of mean(Black)-mean(White) is quadratic in the number of elements.
- The mean function is not efficient. Using a basic sum and division is much faster. In fact, a manual mean is about 25~30 times faster on my machine.
- The CPython interpreter is very slow, so you should avoid loops as much as possible (OOP code does not help either). If this is not possible and your computation is expensive, then consider using natively compiled code. You can use tools like PyPy, Numba or Cython, or possibly rewrite a part in C.
- Note that strings are generally quite slow and there is no reason to use them here. Consider using enumerations instead (i.e. integers).
Here is code fixing the first two points:
dt = 1/(2*n)
T = 20
iter = 40000
Black = []
White = []
Xrelm = []
cur1, cur2 = 0, 0
sum1, sum2 = 0.0, 0.0
for i in range(1,iter+1):
    t = i*dt
    for j in range(n-1):
        check = X[j].xfT(t) - X[j+1].xfT(t)
        if check < 0:
            X[j],X[j+1] = X[j+1],X[j]
        if check < -10:
            X[j].color, X[j+1].color = X[j+1].color, X[j].color
        if X[j].color == "Black":
            Black.append(X[j].xfT(t))
        else:
            White.append(X[j].xfT(t))
    delta1, delta2 = sum(Black[cur1:]), sum(White[cur2:])
    sum1, sum2 = sum1 + delta1, sum2 + delta2
    cur1, cur2 = len(Black), len(White)
    Xrel = sum1/cur1 - sum2/cur2
    Xrelm.append(Xrel)
Consider resetting Black and White to an empty list if you do not use them later.
This is several hundred times faster. It now takes 2 minutes, as opposed to more than 20 hours (estimated) for the initial code.
Note that compiled code should be at least 10 times faster here, so the execution time should be no more than a few tens of seconds.
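The running-sum idea above, together with the suggestions to replace the strings with integers and to not keep the lists around, could be sketched roughly as follows. This is only a minimal illustration, not the original simulation: BLACK, WHITE and RunningMean are hypothetical names introduced here, and the sample data is made up.

```python
from statistics import mean

# Hypothetical integer codes replacing the "Black"/"White" strings.
BLACK, WHITE = 0, 1


class RunningMean:
    """Keeps a sum and a count so the mean never rescans old values."""

    def __init__(self):
        self.total = 0.0
        self.count = 0

    def add(self, value):
        self.total += value
        self.count += 1

    def value(self):
        # The caller must make sure at least one value was added.
        return self.total / self.count


# Made-up sample data: (position, color code) pairs.
samples = [(0.5, BLACK), (1.5, WHITE), (2.5, BLACK), (4.0, WHITE)]

means = {BLACK: RunningMean(), WHITE: RunningMean()}
for position, color in samples:
    means[color].add(position)  # constant work per sample, no list stored

x_rel = means[BLACK].value() - means[WHITE].value()

# Cross-check against the straightforward formulation on stored lists.
expected = mean([0.5, 2.5]) - mean([1.5, 4.0])
assert abs(x_rel - expected) < 1e-12
```

Because only a sum and a count are kept per color, memory use stays constant however many steps are run. The counters here are separate from any list length, so they keep working even if the lists are dropped entirely.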
CodePudding user response:
As mentioned in earlier comments, this one is a bit too broad to answer.
To illustrate, the iteration itself doesn't take very long:
import time
start = time.time()
for i in range(10000):
    for j in range(10000):
        pass
end = time.time()
print(end - start)
On my not-so-great machine that takes ~2s to complete.
So the looping portion is only a tiny fraction of your 10h run time.
The detail of what you're doing in the loop is the key.
Whilst very basic, the timing approach shown in the code above could be applied to your existing code to work out which parts are the least performant. You could then raise a new question with some more specific, actionable detail.
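As a rough sketch of how that timing approach might be applied to one piece of the loop body, the snippet below times statistics.mean against a plain sum/len on the same list. The helper names timed and manual_mean are hypothetical, the data is made up, and the actual timings will depend on the machine, so no particular numbers are claimed here.

```python
import time
from statistics import mean


def timed(func, *args):
    """Return (result, elapsed seconds) for a single call of func."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


def manual_mean(values):
    # Plain sum and division, as suggested in the first answer.
    return sum(values) / len(values)


# Made-up data standing in for one of the growing lists.
data = [i * 0.5 for i in range(200_000)]

slow_result, slow_time = timed(mean, data)
fast_result, fast_time = timed(manual_mean, data)

print(f"statistics.mean: {slow_time:.4f} s")
print(f"sum/len:         {fast_time:.4f} s")
```

Wrapping each suspect step in a call like timed(...) and comparing the elapsed times gives a first indication of where the run time goes. For a fuller picture, the standard library's cProfile module reports time per function without manual instrumentation.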