What is the cleanest way to return the closest matched value to [reference] from [ABCD] columns.
Output is the closest value. e.g. for the first row, absolute delta is [19 40 45 95] so the closest value to return is -21.
df1 = pd.DataFrame(np.random.randint(-100,300,size=(100, 4)), columns=list('ABCD')) # Generate Random Dataframe
df2 = pd.DataFrame(np.random.randint(-100,100,size=(100, 1)), columns=['reference'])
df = pd.concat([df1,df2], axis=1)
df['closest_value'] = "?"
df
CodePudding user response:
You can apply a lambda function on rows and get the closest value from the desired columns based on absolute difference from the reference
column
df['closest_value'] = (df
.apply(
lambda x: x.values[(np.abs(x[[i for i in x.index if i != 'reference']].values
- x['reference'])).argmin()]
, axis=1)
)
OUTPUT:
A B C D reference closest_value
0 -2 227 -88 268 -68 -88
1 185 182 18 279 -59 18
2 140 40 264 98 61 40
3 0 98 -32 81 47 81
4 -6 70 -6 -9 -53 -9
.. ... ... ... ... ... ...
95 -29 -34 141 166 -76 -34
96 14 22 175 205 69 22
97 265 11 -25 284 -88 -25
98 283 31 -91 252 11 31
99 6 -59 84 95 -15 6
[100 rows x 6 columns]
CodePudding user response:
Try this :
idx = df.drop(['reference'], axis=1).sub(df.reference, axis=0).abs().idxmin(1)
df['closest_value'] = df.lookup(df.index, idx)