I'm trying to parametrize the points between two given points and find which of them satisfy a specific inequality.
I'm doing this with SymPy's solveset, which should return an interval of the parameter t that represents the points between those in my dataframe.
Sadly, calling solveset over 13 sets of values (i.e. 13 iterations) leads to an overall execution time of over 20 seconds, which is more than 1 second of calculation time per set.
The code:
from sympy import *
from sympy import S
from sympy.solvers.solveset import solveset, solveset_real
import pandas as pd
import time
t=symbols('t',positive=True)
p1x,p1y,p2x,p2y=symbols('p1x p1y p2x p2y')
centerp=[10,10]
radius=5
data={'P1X':[0,1,2,3,1,2,3,1,2,3,1,2,3],'P1Y':[3,2,1,0,1,2,3,1,2,3,1,2,3],'P2X':[3,8,2,4,1,2,3,1,2,3,1,2,3],'P2Y':[3,9,10,7,1,2,3,1,2,3,1,2,3],'result':[0,0,0,0,0,0,0,0,0,0,0,0,0]}
df=pd.DataFrame(data)
parameterized_x=p1x+t*(p2x-p1x)
parameterized_y=p1y+t*(p2y-p1y)
start_whole_process=time.time()
overall_time=0
for index,row in df.iterrows():
    parameterized_x.subs([[p1x,row['P1X']],[p2x,row['P2X']]])
    parameterized_y.subs([[p1y,row['P1Y']],[p2y,row['P2Y']]])
    expr=sqrt((parameterized_x-centerp[0])**2+(parameterized_y-centerp[1])**2)-radius
    start=time.time()
    df.at[index,'result']=solveset(expr>=0,t,domain=S.Reals)
    end=time.time()
    overall_time=overall_time+end-start
end_whole_process=time.time()
Is there a way to reduce the calculation time, or is there another package that can solve this kind of inequality over large quantities of data without having to wait minutes upon minutes?
CodePudding user response:
There is one big mistake in your current approach that needs to be fixed first. Inside your for loop you did:
parameterized_x.subs([[p1x,row['P1X']],[p2x,row['P2X']]])
parameterized_y.subs([[p1y,row['P1Y']],[p2y,row['P2Y']]])
expr=sqrt((parameterized_x-centerp[0])**2+(parameterized_y-centerp[1])**2)-radius
This is wrong: SymPy expressions cannot be modified in place. subs returns a new expression, and here that return value is discarded. As a result, expr is exactly the same for each row, namely:
# sqrt((p1x + t*(-p1x + p2x) - 10)**2 + (p1y + t*(-p1y + p2y) - 10)**2) - 5
Then solveset tries to solve the same expression on each row. Because this expression still contains the unsubstituted symbols p1x, p1y, p2x and p2y in addition to t, solveset takes a long time trying to compute the solution, eventually producing the same answer for each row:
# ConditionSet(t, sqrt((p1x + t*(-p1x + p2x) - 10)**2 + (p1y + t*(-p1y + p2y) - 10)**2) - 5 >= 0, Complexes)
Remember: every operation you apply to a SymPy expression creates a new SymPy expression. So, the above code has to be modified to:
px_expr = parameterized_x.subs([[p1x,row['P1X']],[p2x,row['P2X']]])
py_expr = parameterized_y.subs([[p1y,row['P1Y']],[p2y,row['P2Y']]])
expr=sqrt((px_expr-centerp[0])**2+(py_expr-centerp[1])**2)-radius
In doing so, expr is different for each row, as expected. solveset then computes a different solution for each row, and it is much faster.
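To see that subs returns a new expression rather than modifying the original, here is a minimal sketch with throwaway symbols x and y (they are illustrative only, not part of your code):

```python
from sympy import symbols

x, y = symbols('x y')
expr = x + 2*y

# The return value is discarded here, so nothing changes: expr still contains x.
expr.subs(x, 1)
print(expr)

# Keeping the return value gives the substituted expression.
substituted = expr.subs(x, 1)
print(substituted)
```

The first print still shows an expression in both x and y; only the second one has x replaced.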
Here is your full example:
from sympy import *
from sympy.solvers.solveset import solveset, solveset_real
import pandas as pd
import time
t=symbols('t',positive=True)
p1x,p1y,p2x,p2y=symbols('p1x p1y p2x p2y')
centerp=[10,10]
radius=5
data={'P1X':[0,1,2,3,1,2,3,1,2,3,1,2,3],'P1Y':[3,2,1,0,1,2,3,1,2,3,1,2,3],'P2X':[3,8,2,4,1,2,3,1,2,3,1,2,3],'P2Y':[3,9,10,7,1,2,3,1,2,3,1,2,3],'result':[0,0,0,0,0,0,0,0,0,0,0,0,0]}
df=pd.DataFrame(data)
parameterized_x=p1x+t*(p2x-p1x)
parameterized_y=p1y+t*(p2y-p1y)
start_whole_process=time.time()
overall_time=0
for index,row in df.iterrows():
    px_expr = parameterized_x.subs([[p1x,row['P1X']],[p2x,row['P2X']]])
    py_expr = parameterized_y.subs([[p1y,row['P1Y']],[p2y,row['P2Y']]])
    expr=sqrt((px_expr-centerp[0])**2+(py_expr-centerp[1])**2)-radius
    df.at[index,'result']=solveset(expr>=0,t,domain=S.Reals)
end_whole_process=time.time()
print("end_whole_process - start_whole_process", end_whole_process - start_whole_process)
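As for much larger datasets, this particular inequality may not need a symbolic solver at all. What follows is only a sketch and not part of the code above. Squaring both sides of sqrt(...) >= radius gives a quadratic inequality a*t**2 + b*t + c >= 0, which can be solved in closed form with plain arithmetic. The function name excluded_t_interval and its return convention are my own assumptions. It works over all real t and ignores the positive=True assumption on t, so clip the result yourself if you only care about t > 0.

```python
import math

def excluded_t_interval(p1, p2, center, radius):
    """Hypothetical helper: return the open interval (t_lo, t_hi) of t for
    which p1 + t*(p2 - p1) lies strictly inside the circle, or None when
    distance >= radius holds for every real t."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]          # direction P2 - P1
    fx, fy = p1[0] - center[0], p1[1] - center[1]  # offset P1 - center
    a = dx*dx + dy*dy
    b = 2*(fx*dx + fy*dy)
    c = fx*fx + fy*fy - radius*radius
    if a == 0:
        # P1 == P2: the point does not move, so the inequality is constant in t.
        return None if c >= 0 else (-math.inf, math.inf)
    disc = b*b - 4*a*c
    if disc <= 0:
        # The line misses or only touches the circle: never strictly inside.
        return None
    root = math.sqrt(disc)
    return ((-b - root)/(2*a), (-b + root)/(2*a))

# First, second and fifth rows of the dataframe from the question.
rows = [((0, 3), (3, 3)), ((1, 2), (8, 9)), ((1, 1), (1, 1))]
results = [excluded_t_interval(p1, p2, (10, 10), 5) for p1, p2 in rows]
print(results)
```

The inequality then holds for every real t outside the returned interval (endpoints included), or for all t when the result is None. The same arithmetic can be applied to whole dataframe columns at once if you need it vectorized.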