Home > Enterprise >  Calculations based on values from 2 tables
Calculations based on values from 2 tables

Time:10-20

I am trying to find the farthest point by coordinates. I have 2 data tables:

data={'X':[1,1,1,3,4],'Y':[1,2,4,3,5]}
data=pd.DataFrame(data)

points={'ID':['1','2','3','4'],'X':[1,2,4,5],'Y':[3,3,4,1]}
points=pd.DataFrame(points)

I would like to determine which point from the "points" table is the farthest one from the coordinates included in the "data" table. The calculation I wish to use is the vector distance as follows.

d = √(x2 −x1)2 (y2 −y1)2

Based on the example, this is my "data": photo in link And this is "points" table: photo in link

I would like to indicate that the ID 4 in the "points" table is the farthest point (red one in the picture) from the coordinates in the "data" table. What I've tried:

points['DISTANCE']=data.apply(lambda x: (np.sqrt(((x['X']-points['X'])**2) ((x['Y']-points['Y'])**2))).max(), axis=1)

Unfortunately the code returns a wrong result, not what I expected. ID 4 should have the most value in DISTANCE column.

ID X Y DISTANCE
1 1 3 4.242641
2 2 3 3.605551
3 4 4 3.000000
4 1 5 2.828427

I am asking for help in solving this problem.

CodePudding user response:

You can use scipy.spatial.distance.cdist - see this answer for more details.

Then you can use np.argmax to find which is farthest.

HTH.

CodePudding user response:

I think you want to do something like this:

data={'X':[1,1,1,3,4],'Y':[1,2,4,3,5]}
data=pd.DataFrame(data)

points={'ID':['1','2','3','4'],'X':[1,2,4,5],'Y':[3,3,4,1]}
points=pd.DataFrame(points)

def distance(x1, x2, y1, y2):
    return np.sqrt(
        ((x2-x1)**2)   ((y2-y1)**2)
    )
    
results = []
for n, row in points.iterrows():
    id = row.ID
    x1 = row.X
    y1 = row.Y
    for _n, _row in data.iterrows():
        x2 = _row.X
        y2 = _row.Y
        d = distance(x1, x2, y1, y2)
        results.append((d, id))
        print(f'{id} Point({x1},{y1}) vs Data({x2},{y2}) -> {d}')

Output:

1 Point(1,3) vs Data(1,1) -> 2.0
1 Point(1,3) vs Data(1,2) -> 1.0
1 Point(1,3) vs Data(1,4) -> 1.0
1 Point(1,3) vs Data(3,3) -> 2.0
1 Point(1,3) vs Data(4,5) -> 3.605551275463989
2 Point(2,3) vs Data(1,1) -> 2.23606797749979
2 Point(2,3) vs Data(1,2) -> 1.4142135623730951
2 Point(2,3) vs Data(1,4) -> 1.4142135623730951
2 Point(2,3) vs Data(3,3) -> 1.0
2 Point(2,3) vs Data(4,5) -> 2.8284271247461903
3 Point(4,4) vs Data(1,1) -> 4.242640687119285
3 Point(4,4) vs Data(1,2) -> 3.605551275463989
3 Point(4,4) vs Data(1,4) -> 3.0
3 Point(4,4) vs Data(3,3) -> 1.4142135623730951
3 Point(4,4) vs Data(4,5) -> 1.0
4 Point(5,1) vs Data(1,1) -> 4.0
4 Point(5,1) vs Data(1,2) -> 4.123105625617661
4 Point(5,1) vs Data(1,4) -> 5.0
4 Point(5,1) vs Data(3,3) -> 2.8284271247461903
4 Point(5,1) vs Data(4,5) -> 4.123105625617661

To get the most distant:

max_distance, id_max_distance = sorted(results)[-1]
print(max_distance) #5.0
print(id_max_distance) #4
  • Related