I am trying to find the farthest point by coordinates. I have 2 data tables:
data={'X':[1,1,1,3,4],'Y':[1,2,4,3,5]}
data=pd.DataFrame(data)
points={'ID':['1','2','3','4'],'X':[1,2,4,5],'Y':[3,3,4,1]}
points=pd.DataFrame(points)
I would like to determine which point from the "points" table is the farthest one from the coordinates included in the "data" table. The calculation I wish to use is the vector distance as follows.
d = √(x2 −x1)2 (y2 −y1)2
Based on the example, this is my "data": photo in link And this is "points" table: photo in link
I would like to indicate that the ID 4 in the "points" table is the farthest point (red one in the picture) from the coordinates in the "data" table. What I've tried:
points['DISTANCE']=data.apply(lambda x: (np.sqrt(((x['X']-points['X'])**2) ((x['Y']-points['Y'])**2))).max(), axis=1)
Unfortunately the code returns a wrong result, not what I expected. ID 4 should have the most value in DISTANCE column.
ID | X | Y | DISTANCE |
---|---|---|---|
1 | 1 | 3 | 4.242641 |
2 | 2 | 3 | 3.605551 |
3 | 4 | 4 | 3.000000 |
4 | 1 | 5 | 2.828427 |
I am asking for help in solving this problem.
CodePudding user response:
You can use scipy.spatial.distance.cdist - see this answer for more details.
Then you can use np.argmax to find which is farthest.
HTH.
CodePudding user response:
I think you want to do something like this:
data={'X':[1,1,1,3,4],'Y':[1,2,4,3,5]}
data=pd.DataFrame(data)
points={'ID':['1','2','3','4'],'X':[1,2,4,5],'Y':[3,3,4,1]}
points=pd.DataFrame(points)
def distance(x1, x2, y1, y2):
return np.sqrt(
((x2-x1)**2) ((y2-y1)**2)
)
results = []
for n, row in points.iterrows():
id = row.ID
x1 = row.X
y1 = row.Y
for _n, _row in data.iterrows():
x2 = _row.X
y2 = _row.Y
d = distance(x1, x2, y1, y2)
results.append((d, id))
print(f'{id} Point({x1},{y1}) vs Data({x2},{y2}) -> {d}')
Output:
1 Point(1,3) vs Data(1,1) -> 2.0
1 Point(1,3) vs Data(1,2) -> 1.0
1 Point(1,3) vs Data(1,4) -> 1.0
1 Point(1,3) vs Data(3,3) -> 2.0
1 Point(1,3) vs Data(4,5) -> 3.605551275463989
2 Point(2,3) vs Data(1,1) -> 2.23606797749979
2 Point(2,3) vs Data(1,2) -> 1.4142135623730951
2 Point(2,3) vs Data(1,4) -> 1.4142135623730951
2 Point(2,3) vs Data(3,3) -> 1.0
2 Point(2,3) vs Data(4,5) -> 2.8284271247461903
3 Point(4,4) vs Data(1,1) -> 4.242640687119285
3 Point(4,4) vs Data(1,2) -> 3.605551275463989
3 Point(4,4) vs Data(1,4) -> 3.0
3 Point(4,4) vs Data(3,3) -> 1.4142135623730951
3 Point(4,4) vs Data(4,5) -> 1.0
4 Point(5,1) vs Data(1,1) -> 4.0
4 Point(5,1) vs Data(1,2) -> 4.123105625617661
4 Point(5,1) vs Data(1,4) -> 5.0
4 Point(5,1) vs Data(3,3) -> 2.8284271247461903
4 Point(5,1) vs Data(4,5) -> 4.123105625617661
To get the most distant:
max_distance, id_max_distance = sorted(results)[-1]
print(max_distance) #5.0
print(id_max_distance) #4