I have a dataframe similar to this:
import pandas as pd
id = [1001, 1002, 1003]
a = [156, 224, 67]
b = [131, 203, 61]
c = [97, 165, 54]
d = [68, 122, 50]
value = [71, 180, 66]
df = pd.DataFrame({'id':id, 'a':a, 'b':b, 'c':c, 'd':d, 'value':value})
id | a | b | c | d | value |
---|---|---|---|---|---|
1001 | 156 | 131 | 97 | 68 | 71 |
1002 | 224 | 203 | 165 | 122 | 180 |
1003 | 67 | 61 | 54 | 50 | 66 |
For each row, I would like to evaluate columns a-d and within them identify the next lowest and next highest values, as compared to value. So in this example, the expected result would look like:
id | a | b | c | d | value | nxt_low | nxt_high |
---|---|---|---|---|---|---|---|
1001 | 156 | 131 | 97 | 68 | 71 | 68 | 97 |
1002 | 224 | 203 | 165 | 122 | 180 | 165 | 203 |
1003 | 67 | 61 | 54 | 50 | 66 | 61 | 67 |
I have tried creating a single column with a numpy array from a-d and trying to do some operations that way, but I'm not applying it correctly and have been unable to get the desired result. Any help is greatly appreciated.
CodePudding user response:
you can get nearest low following code:
df.apply(lambda x: x[x < x[-1]].max(), axis=1)
output:
0 68
1 165
2 61
dtype: int64
get nearest low and high and make result to columns:
df[['nxt_low', 'nxt_high']] = df.apply(lambda x: [x[x < x[-1]].max(), x[x > x[-1]].min()], axis=1, result_type='expand')
df
:
id a b c d value nxt_low nxt_high
0 1001 156 131 97 68 71 68 97
1 1002 224 203 165 122 180 165 203
2 1003 67 61 54 50 66 61 67
If id
is nearest low or high, modify code a bit.
df[['nxt_low', 'nxt_high']] = df.iloc[:, 1:].apply(lambda x: [x[x < x[-1]].max(), x[x > x[-1]].min()], axis=1, result_type='expand')