Replacing value of a column conditioned on values of other columns in the DataFrame

Time:07-29

I have a DataFrame with some columns containing boolean values. I would like to change the values in one of these boolean columns if I get a True value for any of the other boolean columns.

My sample DataFrame:

from pandas import DataFrame

names = {'First_name': ['Jon', 'Bill', 'Maria', 'Emma'],
         'Last_name': ['Bobs', 'Vest', 'Gong', 'Hill'],
         'Roll': ['Yes', 'Present', 'No', 'Absent']}

df = DataFrame(names)

keys = ['Jon', 'Maria', 'Gong', 'Hill', 'Present', 'No']

pattern = r"(?i)" + "|".join(keys)
df['bool1'] = df['First_name'].str.contains(pattern)
df['bool2'] = df['Last_name'].str.contains(pattern)
df['bool3'] = df['Roll'].str.contains(pattern)
df

Output:

  First_name Last_name     Roll  bool1  bool2  bool3
0        Jon      Bobs      Yes   True  False  False
1       Bill      Vest  Present  False  False   True
2      Maria      Gong       No   True   True   True
3       Emma      Hill   Absent  False   True  False

Objective: set bool1 to False in every row where bool2 is True or bool3 is True.

I tried the following code:

df['bool1'] = df.loc[(df['bool2'] == 'True') | (df['bool3'] == 'True'), 'bool1'] = 'False'
df

But it changes every value in the bool1 column to False, instead of just the third row (index 2).

That is, the wrong output is:

    First_name  Last_name   Roll    bool1   bool2   bool3   bool
0   Jon         Bobs        Yes     False   False   False   False
1   Bill        Vest        Present False   False   True    False
2   Maria       Gong        No      False   True    True    False
3   Emma        Hill        Absent  False   True    False   False

I also tried DataFrame.mask, but it did not work as intended. I would be grateful if someone could help me with the code. Thanks in advance.

CodePudding user response:

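You made a double assignment: the statement df['bool1'] = df.loc[...] = 'False' assigns 'False' to the whole bool1 column as well as to the selected rows. Keep only the .loc assignment: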

df.loc[(df['bool2'] == True) | (df['bool3'] == True), 'bool1'] = False                       

Output:

     First_name Last_name     Roll  bool1  bool2  bool3
0        Jon      Bobs      Yes   True  False  False
1       Bill      Vest  Present  False  False   True
2      Maria      Gong       No  False   True   True
3       Emma      Hill   Absent  False   True  False
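
To see why the original attempt overwrote the whole column, here is a minimal sketch that reproduces it on just the three boolean columns. The names string_mask and broken are made up for illustration, and the frame is rebuilt by hand from the question's output rather than from the regex step.

```python
import pandas as pd

# Boolean columns copied from the question's first output table.
df = pd.DataFrame({
    "bool1": [True, False, True, False],
    "bool2": [False, False, True, True],
    "bool3": [False, True, True, False],
})

# Comparing a boolean column with the string 'True' matches no rows.
string_mask = (df["bool2"] == "True") | (df["bool3"] == "True")

# `a = b = value` assigns value to a first, then to b.
# The first target is the whole bool1 column, so every row becomes 'False'.
broken = df.copy()
broken["bool1"] = broken.loc[string_mask, "bool1"] = "False"

print(string_mask.tolist())
print(broken["bool1"].tolist())
```

In this sketch the .loc target selects no rows at all, so the only visible effect comes from the first target, the plain column assignment.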

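Note: do not put quotes around True and False. 'True' is a string, so comparing a boolean column with it never matches, and assigning 'False' stores a string instead of a boolean.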
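
Since bool2 and bool3 already hold booleans, the == True comparison can be dropped as well. The sketch below shows three equivalent ways to write the update, including Series.mask, which the question mentions trying. The variable names (either, via_loc, via_mask, via_logic) are only illustrative, and the frame is again rebuilt by hand from the question's output.

```python
import pandas as pd

# Boolean columns copied from the question's first output table.
df = pd.DataFrame({
    "bool1": [True, False, True, False],
    "bool2": [False, False, True, True],
    "bool3": [False, True, True, False],
})

# Boolean columns can be combined directly into a row mask.
either = df["bool2"] | df["bool3"]

# Option 1: .loc assignment, as in the answer above.
via_loc = df.copy()
via_loc.loc[either, "bool1"] = False

# Option 2: Series.mask replaces values where the condition is True.
via_mask = df.copy()
via_mask["bool1"] = via_mask["bool1"].mask(either, False)

# Option 3: pure boolean logic, bool1 stays True only if neither other column is True.
via_logic = df.copy()
via_logic["bool1"] = via_logic["bool1"] & ~either

print(via_loc["bool1"].tolist())
```

All three leave bool1 True only in row 0 for this sample. Which one to use is mostly a matter of taste; the .loc form modifies the frame in place, while the other two build a new column and assign it back.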
