Update values in dataframe based on dictionary and condition

I have a dataframe and a dictionary that maps some of the dataframe's columns to values. I want to update those columns so that each cell keeps whichever is higher: its current value or the dictionary value for its column.

>>> df1

    a   b   c   d   e   f
0   4   2   6   2   8   1
1   3   6   7   7   8   5
2   2   1   1   6   8   7
3   1   2   7   3   3   1
4   1   7   2   6   7   6
5   4   8   8   2   2   1

and the dictionary is

compare = {'a':4, 'c':7, 'e':3}

So I want to check the values in columns ['a','c','e'] and replace each one with the dictionary value whenever the dictionary value is higher.

What I have tried is this:

comp = pd.DataFrame(pd.Series(compare).reindex(df1.columns).fillna(0)).T

df1[df1.columns] = df1.apply(lambda x: np.where(x>comp, x, comp)[0] ,axis=1)

Expected output:

>>> df1

    a   b   c   d   e   f
0   4   2   7   2   8   1
1   4   6   7   7   8   5
2   4   1   7   6   8   7
3   4   2   7   3   3   1
4   4   7   7   6   7   6
5   4   8   8   2   3   1

CodePudding user response：

limits = df.columns.map(compare).to_series(index=df.columns)
new    = df.mask(df < limits, limits, axis=1)

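First, obtain a Series whose index is the columns of df (the question's df1) and whose values come from the dictionary; columns that are not in the dictionary get NaN.
Then check where the frame's values are less than these limits: if so, use the limit; otherwise keep the value as is. Comparisons against NaN are always False, so the columns without a limit are left untouched.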

to get

>>> new

   a  b  c  d  e  f
0  4  2  7  2  8  1
1  4  6  7  7  8  5
2  4  1  7  6  8  7
3  4  2  7  3  3  1
4  4  7  7  6  7  6
5  4  8  8  2  3  1
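
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The frame is rebuilt by hand from the table in the question (the constructor call is my own reconstruction, not something the asker posted), and it is named `df` to match the snippet above.

```python
import pandas as pd

# Data copied by hand from the table in the question.
df = pd.DataFrame({
    "a": [4, 3, 2, 1, 1, 4],
    "b": [2, 6, 1, 2, 7, 8],
    "c": [6, 7, 1, 7, 2, 8],
    "d": [2, 7, 6, 3, 6, 2],
    "e": [8, 8, 8, 3, 7, 2],
    "f": [1, 5, 7, 1, 6, 1],
})
compare = {"a": 4, "c": 7, "e": 3}

# One limit per column; columns missing from the dictionary map to NaN.
limits = df.columns.map(compare).to_series(index=df.columns)

# df < limits is aligned on column labels and is False wherever the limit is NaN,
# so only cells in a, c and e that fall below their limit are replaced.
new = df.mask(df < limits, limits, axis=1)

print(new)
```

Note that `mask` returns a new frame and leaves `df` unchanged; assign the result back if an in-place update is what you are after.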

CodePudding user response：

Another possible solution, based on numpy:

cols = list(compare.keys())
df[cols] = np.maximum(df[cols].values, np.array(list(compare.values())))

Output:

   a  b  c  d  e  f
0  4  2  7  2  8  1
1  4  6  7  7  8  5
2  4  1  7  6  8  7
3  4  2  7  3  3  1
4  4  7  7  6  7  6
5  4  8  8  2  3  1
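
A runnable sketch of the numpy version, again with the frame rebuilt by hand from the question's table. The variable name `floor` is only illustrative, and `to_numpy()` is used here in place of `.values`; both give the underlying array.

```python
import numpy as np
import pandas as pd

# Data copied by hand from the table in the question.
df = pd.DataFrame({
    "a": [4, 3, 2, 1, 1, 4],
    "b": [2, 6, 1, 2, 7, 8],
    "c": [6, 7, 1, 7, 2, 8],
    "d": [2, 7, 6, 3, 6, 2],
    "e": [8, 8, 8, 3, 7, 2],
    "f": [1, 5, 7, 1, 6, 1],
})
compare = {"a": 4, "c": 7, "e": 3}

cols = list(compare.keys())
# Lower bounds in the same order as cols (dicts preserve insertion order).
floor = np.array(list(compare.values()))

# Broadcasting: each row of the (6, 3) block is compared element-wise
# with the (3,) array of lower bounds, and the larger value is kept.
df[cols] = np.maximum(df[cols].to_numpy(), floor)

print(df)
```

This only touches the columns named in the dictionary, so there is no need to build NaN placeholders for the others. It does rely on `compare.keys()` and `compare.values()` lining up, which holds for a plain dict.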