Home > OS >  Matching conditions in columns
Matching conditions in columns

Time:12-07

I am trying to match conditions so that if text is present in both columns A and B and a 0 is in column C, the code should return 'new' in column C (overwriting the 0). Example dataframe below:

import pandas as pd
df = pd.DataFrame({"A":['something',None,'filled',None], "B":['test','test','test',None], "C":['rt','0','0','0']})

I have tried the following, however it only seems to accept the last condition so that any '0' entries in column C become 'new' regardless of None in columns A or B. (in this example I only expect 'new' to appear on row 2.

import numpy as np

conditions = [(df['A'] is not None) & (df['B'] is not None) & (df['C'] == '0')]

values = ['new']

df['C'] = np.select(conditions, values, default=df["C"])

Appreciate any help!

CodePudding user response:

You will need to use .isna() and filter where it is not nan/none (using ~) as below:

conditions = [~(df['A'].isna()) & ~(df['B'].isna()) & (df['C'] == '0')]

output:

           A     B    C
0  something  test   rt
1       None  test    0
2     filled  test  new
3       None  None    0

CodePudding user response:

Use Series.notna for test None or NaNs:

conditions = [df['A'].notna() & df['B'].notna() & (df['C'] == '0')]

Or:

conditions = [df[['A','B']].notna().all(axis=1) & (df['C'] == '0')]

values = ['new']

df['C'] = np.select(conditions, values, default=df["C"])
print (df)
           A     B    C
0  something  test   rt
1       None  test    0
2     filled  test  new
3       None  None    0

CodePudding user response:

Use

mask = df[['A', 'B']].notna().all(1) & df['C'].eq('0')
df.loc[mask, 'C'] = 'new'
  • Related