Speeding up Sympy's solveset calculation for a large array of variables

Time:04-02

I'm trying to parametrize the points in space starting from a specific point, subject to a specific inequality.

I'm doing this with SymPy's solveset, which should return the interval of the parameter t that represents all points between each pair of points in my DataFrame.

Sadly, running solveset over 13 sets of values (i.e. 13 iterations) takes over 20 seconds overall, which is more than 1 second of calculation time per set.

The code:

from sympy import *
from sympy import S
from sympy.solvers.solveset import solveset, solveset_real
import pandas as pd
import time

t=symbols('t',positive=True)
p1x,p1y,p2x,p2y=symbols('p1x p1y p2x p2y')

centerp=[10,10]
radius=5

data={'P1X':[0,1,2,3,1,2,3,1,2,3,1,2,3],'P1Y':[3,2,1,0,1,2,3,1,2,3,1,2,3],'P2X':[3,8,2,4,1,2,3,1,2,3,1,2,3],'P2Y':[3,9,10,7,1,2,3,1,2,3,1,2,3],'result':[0,0,0,0,0,0,0,0,0,0,0,0,0]}
df=pd.DataFrame(data)

parameterized_x=p1x + t*(p2x-p1x)
parameterized_y=p1y + t*(p2y-p1y)

start_whole_process=time.time()

overall_time=0

for index,row in df.iterrows():
    
   parameterized_x.subs([[p1x,row['P1X']],[p2x,row['P2X']]])
   parameterized_y.subs([[p1y,row['P1Y']],[p2y,row['P2Y']]])
   expr=sqrt((parameterized_x-centerp[0])**2 + (parameterized_y-centerp[1])**2)-radius
   
   start=time.time()

   df.at[index,'result']=solveset(expr>=0,t,domain=S.Reals)
   end=time.time()
   
   overall_time=overall_time + end-start
   
end_whole_process=time.time()

I need to know whether there is a way to speed up the calculation, or whether another package can solve this kind of inequality over large quantities of data without making me wait minutes upon minutes.

Answer:

There is one big mistake in your current approach that needs to be fixed first. Inside your for loop you did:

parameterized_x.subs([[p1x,row['P1X']],[p2x,row['P2X']]])
parameterized_y.subs([[p1y,row['P1Y']],[p2y,row['P2Y']]])
expr=sqrt((parameterized_x-centerp[0])**2 + (parameterized_y-centerp[1])**2)-radius

This is wrong: SymPy expressions cannot be modified in place. subs returns a new expression, and here the two results are thrown away. As a consequence, your expr is exactly the same for each row, namely:

# sqrt((p1x + t*(-p1x + p2x) - 10)**2 + (p1y + t*(-p1y + p2y) - 10)**2) - 5

Then, solveset tries to solve the same expression on each row. Because this expression still contains the four unsubstituted symbols p1x, p1y, p2x and p2y in addition to t, solveset takes a long time trying to compute the solution, eventually producing the same answer for each row:

# ConditionSet(t, sqrt((p1x + t*(-p1x + p2x) - 10)**2 + (p1y + t*(-p1y + p2y) - 10)**2) - 5 >= 0, Complexes)

Remember: every operation you apply to a SymPy expression creates a new SymPy expression. So, the above code has to be modified to:

px_expr = parameterized_x.subs([[p1x,row['P1X']],[p2x,row['P2X']]])
py_expr = parameterized_y.subs([[p1y,row['P1Y']],[p2y,row['P2Y']]])
expr=sqrt((px_expr-centerp[0])**2 + (py_expr-centerp[1])**2)-radius

With this change, expr is different for each row, as expected. solveset then computes a different solution for each row, and it is much faster.
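To make the point about immutability concrete, here is a minimal sketch using only the x-coordinate expression from the question. The concrete values 0 and 3 come from the first row of the data; the variable names unchanged_symbols and px_expr are just for illustration.

```python
from sympy import symbols

t = symbols("t", positive=True)
p1x, p2x = symbols("p1x p2x")

parameterized_x = p1x + t * (p2x - p1x)

# Calling subs without keeping the result has no effect on parameterized_x.
parameterized_x.subs({p1x: 0, p2x: 3})
unchanged_symbols = parameterized_x.free_symbols  # still t, p1x and p2x

# Keeping the return value gives the substituted expression.
px_expr = parameterized_x.subs({p1x: 0, p2x: 3})

print(parameterized_x)
print(px_expr)
```

The first print still shows the symbolic parameters, while the second shows an expression in t only.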

Here is your full example:

from sympy import *
from sympy.solvers.solveset import solveset, solveset_real
import pandas as pd
import time

t=symbols('t',positive=True)
p1x,p1y,p2x,p2y=symbols('p1x p1y p2x p2y')

centerp=[10,10]
radius=5

data={'P1X':[0,1,2,3,1,2,3,1,2,3,1,2,3],'P1Y':[3,2,1,0,1,2,3,1,2,3,1,2,3],'P2X':[3,8,2,4,1,2,3,1,2,3,1,2,3],'P2Y':[3,9,10,7,1,2,3,1,2,3,1,2,3],'result':[0,0,0,0,0,0,0,0,0,0,0,0,0]}
df=pd.DataFrame(data)

parameterized_x=p1x + t*(p2x-p1x)
parameterized_y=p1y + t*(p2y-p1y)

start_whole_process=time.time()

overall_time=0

for index,row in df.iterrows():
    px_expr = parameterized_x.subs([[p1x,row['P1X']],[p2x,row['P2X']]])
    py_expr = parameterized_y.subs([[p1y,row['P1Y']],[p2y,row['P2Y']]])
    expr=sqrt((px_expr-centerp[0])**2 + (py_expr-centerp[1])**2)-radius
    df.at[index,'result']=solveset(expr>=0,t,domain=S.Reals)

end_whole_process=time.time()
print("end_whole_process - start_whole_process", end_whole_process - start_whole_process)
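If the data grows well beyond 13 rows, one possible alternative (not part of the approach above, only a sketch) is to skip the symbolic solver. Squaring both sides of the inequality gives a quadratic in t, a*t**2 + b*t + c >= 0, whose roots can be computed for all rows at once with NumPy. The function name inside_interval and its return convention are assumptions made for this illustration. The sketch also treats t as any real number, so the t > 0 restriction from the question would have to be applied to the result afterwards.

```python
import numpy as np
import pandas as pd

def inside_interval(df, center=(10.0, 10.0), radius=5.0):
    """Hypothetical helper: per row, return (t_lo, t_hi), the open interval of t
    where the point lies strictly inside the circle, i.e. where the inequality
    fails. NaN in both arrays means the inequality holds for every real t."""
    fx = df["P1X"].to_numpy(dtype=float) - center[0]
    fy = df["P1Y"].to_numpy(dtype=float) - center[1]
    dx = (df["P2X"] - df["P1X"]).to_numpy(dtype=float)
    dy = (df["P2Y"] - df["P1Y"]).to_numpy(dtype=float)

    # Coefficients of a*t**2 + b*t + c >= 0 (the squared inequality).
    a = dx**2 + dy**2
    b = 2.0 * (fx * dx + fy * dy)
    c = fx**2 + fy**2 - radius**2
    disc = b**2 - 4.0 * a * c

    t_lo = np.full(len(df), np.nan)
    t_hi = np.full(len(df), np.nan)

    # The line crosses the circle: the inequality fails between the two roots.
    crossing = (a > 0) & (disc > 0)
    root = np.sqrt(disc[crossing])
    t_lo[crossing] = (-b[crossing] - root) / (2.0 * a[crossing])
    t_hi[crossing] = (-b[crossing] + root) / (2.0 * a[crossing])

    # Degenerate row (P1 == P2) inside the circle: the inequality never holds.
    stuck_inside = (a == 0) & (c < 0)
    t_lo[stuck_inside] = -np.inf
    t_hi[stuck_inside] = np.inf
    return t_lo, t_hi

# Made-up rows for illustration: a line through the circle, a degenerate row,
# and a line that misses the circle.
demo = pd.DataFrame({"P1X": [0, 1, 0], "P1Y": [10, 1, 3],
                     "P2X": [20, 1, 3], "P2Y": [10, 1, 3]})
t_lo, t_hi = inside_interval(demo)
print(t_lo, t_hi)
```

For each row the solution set of the original inequality is then t <= t_lo or t >= t_hi, or all real t where both values are NaN. This trades the exact symbolic sets that solveset returns for floating-point bounds, which may or may not be acceptable depending on what the result column is used for.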