Home > other >  How to select "nearest left" and "nearesr right" values from a dataframe using c
How to select "nearest left" and "nearesr right" values from a dataframe using c

Time:12-31

I have a dataframe with several columns and rows. E.g. minimal example to reproduce:

df2 = pd.DataFrame( {'tof': {4875: 553.347572418771, 4876: 554.4911639867169, 4877: 556.2920840265602, 4878: 560.4059228421942, 4879: 560.7254631042018, 4880: 563.528024491697, 4881: 566.9995989172061, 4882: 570.7775523817776, 4883: 572.0685266789887}, 'E': {4875: 21.390025636898983, 4876: 21.301354836365054, 4877: 21.16283071077996, 4878: 20.85142476306085, 4879: 20.82752461774972, 4880: 20.61965532001513, 4881: 20.366451134218167, 4882: 20.096164210524464, 4883: 20.005036194577794}})
             tof          E
4875  553.347572  21.390026
4876  554.491164  21.301355
4877  556.292084  21.162831
4878  560.405923  20.851425
4879  560.725463  20.827525
4880  563.528024  20.619655
4881  566.999599  20.366451
4882  570.777552  20.096164
4883  572.068527  20.005036

I need to select 2 closest points (rows) using some provided value E_provided.

E.g. E_provided = 20.83

closest point(row) left to some value of E_search = closest_left gives

4879  560.725463  20.827525

closest point(row) right to some value of E_search = closest_right

4878  560.405923  20.851425

I understand that it's simple task, and I need to calculate some difference between df2['E']-E_provided and use additional conditions, but I don't get how to do it :)

I have tried this one:

df2.iloc[(df['E']-E_provided).abs().argsort()[:2]]

But it works incorrect because there are points that are close to the provided value just from one side, e.g. for E_provided == 20.1 it will output 2 nearest points both of them are left to E_provided value:

4882  570.777552  20.096164
4883  572.068527  20.005036

While I need to find:

4881  566.999599  20.366451
4882  570.777552  20.096164

CodePudding user response:

Find values greater than/less than your value:

E_provided = 20.83
mask = df2.E.gt(E_provided)

Use this to get the closet from each direction:

above_id = df2.loc[mask, 'E'].sub(E_provided).abs().idxmin()
below_id = df2.loc[~mask, 'E'].sub(E_provided).abs().idxmin()

Now we can look up those two rows:

out = df2.loc[[above_id, below_id]]
print(out)

Output:

             tof          E
4878  560.405923  20.851425
4879  560.725463  20.827525

CodePudding user response:

Less elegant, but working solution:

lower_ind = signal[signal['E'] < real_coord]['E'].idxmax()
upper_ind = signal[signal['E'] > real_coord]['E'].idxmin()

indx_list = [lower_ind, upper_ind]

out = df2.loc[indx_list]
print(out)
  • Related