Home > Net >  Get nearest low and high values across multiple dataframe columns
Get nearest low and high values across multiple dataframe columns

Time:12-01

I have a dataframe similar to this:

import pandas as pd

id = [1001, 1002, 1003]
a = [156, 224, 67]             
b = [131, 203, 61]          
c = [97, 165, 54]        
d = [68, 122, 50]
value = [71, 180, 66]

df = pd.DataFrame({'id':id, 'a':a, 'b':b, 'c':c, 'd':d, 'value':value})
id a b c d value
1001 156 131 97 68 71
1002 224 203 165 122 180
1003 67 61 54 50 66

For each row, I would like to evaluate columns a-d and within them identify the next lowest and next highest values, as compared to value. So in this example, the expected result would look like:

id a b c d value nxt_low nxt_high
1001 156 131 97 68 71 68 97
1002 224 203 165 122 180 165 203
1003 67 61 54 50 66 61 67

I have tried creating a single column with a numpy array from a-d and trying to do some operations that way, but I'm not applying it correctly and have been unable to get the desired result. Any help is greatly appreciated.

CodePudding user response:

you can get nearest low following code:

df.apply(lambda x: x[x < x[-1]].max(), axis=1)

output:

0     68
1    165
2     61
dtype: int64

get nearest low and high and make result to columns:

df[['nxt_low', 'nxt_high']] = df.apply(lambda x: [x[x < x[-1]].max(), x[x > x[-1]].min()], axis=1, result_type='expand')

df:

    id      a   b   c   d   value   nxt_low nxt_high
0   1001    156 131 97  68  71      68      97
1   1002    224 203 165 122 180     165     203
2   1003    67  61  54  50  66      61      67



If id is nearest low or high, modify code a bit.

df[['nxt_low', 'nxt_high']] = df.iloc[:, 1:].apply(lambda x: [x[x < x[-1]].max(), x[x > x[-1]].min()], axis=1, result_type='expand')
  • Related