Python code has a big bottleneck, but I am not experienced enough to see where it is

My code is supposed to model the average energy for alpha decay. It works, but it is very slow.

import numpy as np
from numpy import sin, cos, arccos, pi, arange, fromiter
import matplotlib.pyplot as plt
from random import choices

r_cell, d, r, R, N = 5.5, 15.8, 7.9, 20, arange(1,10000, 50)


def total_decay(N):
    theta = 2*pi*np.random.rand(2,N)
    phi = arccos(2*np.random.rand(2,N)-1)

    x = fromiter((r*sin(phi[0][i])*cos(theta[0][i]) for i in range(N)),float, count=-1)
    dx = fromiter((x[i] + R*sin(phi[1][i])*cos(theta[1][i]) for i in range(N)), float,count=-1)
    y = fromiter((r*sin(phi[0][i])*sin(theta[0][i]) for i in range(N)),float, count=-1)
    dy = fromiter((y[i] + R*sin(phi[1][i])*sin(theta[1][i]) for i in range(N)),float,count=-1)
    z = fromiter((r*cos(phi[0][i]) for i in range(N)),float, count=-1)
    dz = fromiter((z[i] + R*cos(phi[1][i]) for i in range(N)),float, count=-1)

    return x, y, z, dx, dy, dz


def inter(x,y,z,dx,dy,dz, N):
    intersections = 0 

    for i in range(N): #Checks to see if a line between two points intersects with the target cell
        a = (dx[i] - x[i])*(dx[i] - x[i]) + (dy[i] - y[i])*(dy[i] - y[i]) + (dz[i] - z[i])*(dz[i] - z[i])
        b = 2*((dx[i] - x[i])*(x[i]-d) + (dy[i] - y[i])*(y[i]) + (dz[i] - z[i])*(z[i]))
        c = d*d + x[i]*x[i] + y[i]*y[i] + z[i]*z[i] - 2*(d*x[i]) - r_cell*r_cell
        if b*b - 4*a*c >= 0:
            intersections += 1
    return intersections

def hits(N):
    I = []
    for i in range(len(N)):
        decay = total_decay(N[i])
        I.append(inter(decay[0],decay[1],decay[2],decay[3],decay[4],decay[5],N[i]))
    return I

def AE(I,N): 
    p1, p2 = 52.4 / (52.4 + 18.9), 18.9 / (52.4 + 18.9)
    E = [choices([5829.6, 5793.1], weights=(p1,p2),k=1)[0] for _ in range(I)]  # weights, not cum_weights: p1 and p2 are not cumulative
    return sum(E)/N

def list_AE(I,N):
    E = [AE(I[i],N[i]) for i in range(len(N))]
    return E


plt.plot(N, list_AE(hits(N),N))
plt.title('Average energy per dose with respect to number of decays')
plt.xlabel('Number of decays [N]')
plt.ylabel('Average energy [keV]')
plt.show()

Can anyone experienced point out where the bottleneck takes place, explain why it happens and how to optimize it? Thanks in advance.

CodePudding user response：

To find out where most of the time is spent in your code, examine it with a profiler. By wrapping your main code like this:

import cProfile
import pstats
profiler = cProfile.Profile()
profiler.enable()

result = list_AE(hits(N), N)

profiler.disable()
stats = pstats.Stats(profiler).sort_stats('tottime')
stats.print_stats()

You will get the following overview (abbreviated):

         6467670 function calls in 19.982 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      200    4.766    0.024    4.766    0.024 ./alphadecay.py:24(inter)
   995400    2.980    0.000    2.980    0.000 ./alphadecay.py:17(<genexpr>)
   995400    2.925    0.000    2.925    0.000 ./alphadecay.py:15(<genexpr>)
   995400    2.690    0.000    2.690    0.000 ./alphadecay.py:16(<genexpr>)
   995400    2.683    0.000    2.683    0.000 ./alphadecay.py:14(<genexpr>)
   995400    1.674    0.000    1.674    0.000 ./alphadecay.py:19(<genexpr>)
   995400    1.404    0.000    1.404    0.000 ./alphadecay.py:18(<genexpr>)
     1200    0.550    0.000   14.907    0.012 {built-in method numpy.fromiter}

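By internal time, the single largest entry is the inter function (about 4.8 s), since it runs a pure-Python loop over N. The six generator expressions fed to numpy.fromiter in total_decay cost even more taken together: fromiter accounts for about 14.9 s of the 20 s cumulatively. To improve this, you could spread the work across multiple processes using multiprocessing.Pool (note that it uses processes, not threads), although replacing the element-by-element loops with whole-array NumPy operations is likely to help more.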
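
One way to remove the per-element Python loops that dominate the profile above is to let NumPy operate on whole arrays at once. The following is a minimal sketch, not a drop-in replacement: the names total_decay_vec and inter_vec and the seeded np.random.default_rng generator are assumptions introduced here, while the constants and formulas are meant to match the question's code.

```python
import numpy as np

# Geometry constants copied from the question.
r_cell, d, r, R = 5.5, 15.8, 7.9, 20


def total_decay_vec(n, rng):
    """Draw n start points on a sphere of radius r and n end points a distance R away."""
    theta = 2 * np.pi * rng.random((2, n))
    phi = np.arccos(2 * rng.random((2, n)) - 1)

    # Whole-array arithmetic replaces the per-element generator expressions.
    x = r * np.sin(phi[0]) * np.cos(theta[0])
    y = r * np.sin(phi[0]) * np.sin(theta[0])
    z = r * np.cos(phi[0])
    dx = x + R * np.sin(phi[1]) * np.cos(theta[1])
    dy = y + R * np.sin(phi[1]) * np.sin(theta[1])
    dz = z + R * np.cos(phi[1])
    return x, y, z, dx, dy, dz


def inter_vec(x, y, z, dx, dy, dz):
    """Count tracks passing the same discriminant test as inter in the question."""
    ux, uy, uz = dx - x, dy - y, dz - z
    a = ux * ux + uy * uy + uz * uz
    b = 2 * (ux * (x - d) + uy * y + uz * z)
    c = d * d + x * x + y * y + z * z - 2 * d * x - r_cell * r_cell
    return int(np.count_nonzero(b * b - 4 * a * c >= 0))


rng = np.random.default_rng(0)  # seeded only for reproducibility (an assumption)
N = np.arange(1, 10000, 50)
hits_vec = [inter_vec(*total_decay_vec(int(n), rng)) for n in N]
```

The outer loop over the 200 sample sizes stays in Python, but each iteration now does a handful of array operations instead of thousands of interpreted steps. Profiling this version again would show whether anything else is worth optimizing.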

CodePudding user response：

I won't tell you where the bottleneck is, but I can tell you how to find bottlenecks in complex programs. The keyword is profiling. A profiler is an application that will run alongside your code and measure the execution times of each statement. Search online for python profiler.

The poor person's version would be debugging and guesstimating the execution times of statements or using print statements or a library for measuring execution times. Using a profiler is an important skill that's not that difficult to learn, though.
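
As a rough illustration of the manual approach, the sketch below times a single call with time.perf_counter from the standard library. The helper timed and the workload slow_sum are hypothetical names made up for this example; in practice you would pass in one of your own functions, such as total_decay or inter.

```python
import time


def timed(func, *args, **kwargs):
    """Call func once and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    return result, time.perf_counter() - start


def slow_sum(n):
    # Stand-in workload; swap in a function from your own program.
    total = 0
    for i in range(n):
        total += i * i
    return total


value, seconds = timed(slow_sum, 100_000)
print(f"slow_sum took {seconds:.4f} s")
```

A single measurement like this is noisy, so repeat it a few times before drawing conclusions. A profiler gives the same information for every function at once.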

CodePudding user response：

You should avoid appending as much as you can (you used it in hits) and use list comprehensions or pre-built lists instead (as you did in list_AE). I suggest you build a list of the needed length and then fill each cell by its index.
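
To make the three options concrete, here is a small sketch that builds the same list by appending, with a list comprehension, and by filling a preallocated list. The function count_hits is a hypothetical stand-in for the call to inter(...) inside hits, so the numbers it returns are meaningless.

```python
def count_hits(n):
    # Hypothetical stand-in for inter(*total_decay(n), n) from the question.
    return n // 3


N = range(1, 10000, 50)

# 1. Append in a loop, as hits does in the question.
appended = []
for n in N:
    appended.append(count_hits(n))

# 2. List comprehension.
comprehended = [count_hits(n) for n in N]

# 3. Preallocate, then fill each cell by its index.
prefilled = [0] * len(N)
for i, n in enumerate(N):
    prefilled[i] = count_hits(n)
```

All three produce the same list. With only 200 elements here, the difference between them is unlikely to be measurable next to the cost of the per-element loops, so treat this as a readability choice rather than the main fix.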