I have a dataframe with several columns and rows. E.g. minimal example to reproduce:
df2 = pd.DataFrame( {'tof': {4875: 553.347572418771, 4876: 554.4911639867169, 4877: 556.2920840265602, 4878: 560.4059228421942, 4879: 560.7254631042018, 4880: 563.528024491697, 4881: 566.9995989172061, 4882: 570.7775523817776, 4883: 572.0685266789887}, 'E': {4875: 21.390025636898983, 4876: 21.301354836365054, 4877: 21.16283071077996, 4878: 20.85142476306085, 4879: 20.82752461774972, 4880: 20.61965532001513, 4881: 20.366451134218167, 4882: 20.096164210524464, 4883: 20.005036194577794}})
tof E
4875 553.347572 21.390026
4876 554.491164 21.301355
4877 556.292084 21.162831
4878 560.405923 20.851425
4879 560.725463 20.827525
4880 563.528024 20.619655
4881 566.999599 20.366451
4882 570.777552 20.096164
4883 572.068527 20.005036
I need to select 2 closest points (rows) using some provided value E_provided.
E.g. E_provided = 20.83
closest point(row) left to some value of E_search = closest_left gives
4879 560.725463 20.827525
closest point(row) right to some value of E_search = closest_right
4878 560.405923 20.851425
I understand that it's simple task, and I need to calculate some difference between df2['E']-E_provided and use additional conditions, but I don't get how to do it :)
I have tried this one:
df2.iloc[(df['E']-E_provided).abs().argsort()[:2]]
But it works incorrect because there are points that are close to the provided value just from one side, e.g. for E_provided == 20.1
it will output 2 nearest points both of them are left to E_provided value:
4882 570.777552 20.096164
4883 572.068527 20.005036
While I need to find:
4881 566.999599 20.366451
4882 570.777552 20.096164
CodePudding user response:
Find values greater than/less than your value:
E_provided = 20.83
mask = df2.E.gt(E_provided)
Use this to get the closet from each direction:
above_id = df2.loc[mask, 'E'].sub(E_provided).abs().idxmin()
below_id = df2.loc[~mask, 'E'].sub(E_provided).abs().idxmin()
Now we can look up those two rows:
out = df2.loc[[above_id, below_id]]
print(out)
Output:
tof E
4878 560.405923 20.851425
4879 560.725463 20.827525
CodePudding user response:
Less elegant, but working solution:
lower_ind = signal[signal['E'] < real_coord]['E'].idxmax()
upper_ind = signal[signal['E'] > real_coord]['E'].idxmin()
indx_list = [lower_ind, upper_ind]
out = df2.loc[indx_list]
print(out)