Home > Enterprise >  Pandas change column values from dictionary
Pandas change column values from dictionary

Time:10-12

I have the following to tables of boolean values:

df1 = pd.DataFrame(data={'a': [True, False, False], 
                         'b': [False, True, False]},
                         index=pd.Series([1, 2, 3], name='index'))
df2 = pd.DataFrame(data={'w': [True, False, False], 
                         'x': [False, True, False],
                         'y': [True, True, True],
                         'z': [True, False, True]},
                         index=pd.Series([1, 2, 3], name='index'))
index a b
1 True False
2 False True
3 False False
index w x y z
1 True False True True
2 False True True False
3 False False True True

and the following dictionary:

dic = {'a': ['w', 'x'], 'b': ['y', 'z']}

I want to update df2 so that if a column in df1 is True, the column names that correspond to that column in dic are flagged as False. For example, since column a is True for index 1, columns w and w for index 1 should be turned to False.

The resulting table should look like this:

index w x y z
1 False False True True
2 False True False False
3 False False True True

CodePudding user response:

If you only have True/False and only need to change True to False, this simple loop with masking should work for you:

NB. I am calling the dictionary dic as dict the python builtin to construct dictionaries

for k,v in dic.items():
    df2[v] = df2[v].mask(df1[k], False)

output:

           w      x      y      z
index                            
1      False  False   True   True
2      False   True  False  False
3      False  False   True   True

CodePudding user response:

We can reverse the key:value association in the dictionary to rename df2 so that the columns align with the names in df1. Then, use mask, to update values (aligned on columns) and set_axis to restore the column names:

# Avoiding using `dict` as a variable name
d = {'a': ['w', 'x'], 'b': ['y', 'z']}
df2 = (
    df2.rename(columns={v: k for k, lst in d.items() for v in lst})
        .mask(df1, False)
        .set_axis(df2.columns, axis=1)
)

df2:

           w      x      y      z
index                            
1      False  False   True   True
2      False   True  False  False
3      False  False   True   True
  • Related