I have a panda's dataframe that is something like this
el1 el2 x el3
n m 6 f
n2 m2 7 f2
....
n10 m10 19.3 f10
n11 m11 21 f11
....
The el1, el2, and el3 do not matter at all. I want to find the row with the X nearest to x=20
so I do
min_index=df['x'].sub(20).abs().idxmin()
which gives me the index where x=19.3 is
So far so good
Now, the real problem
Imagine I have a dataframe where x goes from 6 to 31 and then again from 7 to 45 and then again from 2 to 27
I want a way to find a list of indexes [idx1, idx2...idxn]
(in the above example n would be 3) with the most approximate values to say for example 20.
How can I do this with pandas and python?
EDIT: An example of the df would be
el1 el2 x el3
n m 6 f
n2 m2 7 f2
....
n10 m10 19.3 f10
n11 m11 21 f11
....
n20 m20 31 f20
n21 m21 7 f21
n22 m22 8.1 f22
....
n29 m29 19.8 f29
n30 m30 21 f30
...
n35 m35 45 f35
n36 m36 2 f36
n37 m37 3 f37
....
n45 m45 19.9 f45
n46 m46 22 f46
...
n50 m50 27 f50
The rows I want are the ones where x=19.3 x=19.8 x=x=19.9
CodePudding user response:
Assuming you have consecutive stretches of increasing values and want to find the closest to 20 for each:
group = df['x'].diff().lt(0).cumsum()
out = df.loc[df['x'].sub(20).abs().groupby(group).idxmin()]
example input:
import numpy as np
df = pd.DataFrame({'x': np.r_[np.linspace(6,31,5),
np.linspace(7,45,5),
np.linspace(2,27,5),
],
'el1': '-'
})
output:
x el1
2 18.50 -
6 16.50 -
13 20.75 -