I am trying to match conditions so that if text is present in both columns A and B and a 0 is in column C, the code should return 'new' in column C (overwriting the 0). Example dataframe below:
import pandas as pd
df = pd.DataFrame({"A":['something',None,'filled',None], "B":['test','test','test',None], "C":['rt','0','0','0']})
I have tried the following, however it only seems to accept the last condition so that any '0' entries in column C become 'new' regardless of None in columns A or B. (in this example I only expect 'new' to appear on row 2.
import numpy as np
conditions = [(df['A'] is not None) & (df['B'] is not None) & (df['C'] == '0')]
values = ['new']
df['C'] = np.select(conditions, values, default=df["C"])
Appreciate any help!
CodePudding user response:
You will need to use .isna()
and filter where it is not nan/none (using ~
) as below:
conditions = [~(df['A'].isna()) & ~(df['B'].isna()) & (df['C'] == '0')]
output:
A B C
0 something test rt
1 None test 0
2 filled test new
3 None None 0
CodePudding user response:
Use Series.notna
for test None
or NaN
s:
conditions = [df['A'].notna() & df['B'].notna() & (df['C'] == '0')]
Or:
conditions = [df[['A','B']].notna().all(axis=1) & (df['C'] == '0')]
values = ['new']
df['C'] = np.select(conditions, values, default=df["C"])
print (df)
A B C
0 something test rt
1 None test 0
2 filled test new
3 None None 0
CodePudding user response:
Use
mask = df[['A', 'B']].notna().all(1) & df['C'].eq('0')
df.loc[mask, 'C'] = 'new'