Home > Mobile >  Replace a column value of one dataframe with a column value of another dataframe if the absolute dif
Replace a column value of one dataframe with a column value of another dataframe if the absolute dif

Time:12-09

I have one data frame, df1 with values as given below.

df1

Var1
105.1290
52.7885
10.9922
22.8443
73.5880
96.2842
91.9474
41.6485
68.4402
89.7366

df2

Var2
84.95632934
97.58280346

Now, I replace the values in df1 with the values in df2 whose absolute difference is low like the ones marked below. diff1 is obtained by subtracting values in df1 from first value in df2 (i.e. 84.95632934). Similarly, diff2 is obtained by subtracting values in df1 from second value in df2 (i.e. 97.58280346). diff1 column shows that the difference is least for 89.7366 value in df1 so it is replaced with 84.95632934. Similarly diff2 shows the absolute difference is least for 96.2842 value in df1 so it is replaced with 97.58280346.

Var1      diff1      diff2
105.1290  20.1727    7.5462
52.7885  -32.1678   -44.7943
10.9922  -73.9641   -86.5906
22.8443  -62.1120   -74.7385
73.5880  -11.3683   -23.9948
96.2842*  11.3279   -1.2986
91.9474   6.9911    -5.6354
41.6485  -43.3078   -55.9343
68.4402  -16.5161   -29.1426
89.7366*  4.7803    -7.8462

The final dataframe as a result should be as shown in table df3. How can I obtain df3?

df3

Var1
105.1290
52.7885
10.9922
22.8443
73.5880
97.5828
91.9474
41.6485
68.4402
84.9563

CodePudding user response:

Use array broadcasting to compute the differences and set the values with .loc and idxmin:

other = df2["Var2"].to_numpy()
differences = pd.DataFrame(df1['Var1'].to_numpy()[:, None] - other).abs()
df1["Var1"].loc[differences.idxmin()] = other

>>> df1
         Var1
0  105.129000
1   52.788500
2   10.992200
3   22.844300
4   73.588000
5   97.582803
6   91.947400
7   41.648500
8   68.440200
9   84.956329
  • Related