I have a data set as below
ID A1 A2
0 A123 1234
1 1234 5568
2 5568 NaN
3 Zabc NaN
4 3456 3456
5 3456 3456
6 NaN NaN
7 NaN NaN
Intention is to go through each column (A1 and A2), identify where both the columns are blank as in row 6 and 7, create a new column and categorise as "Both A1 and A2 are blank"
I used the below code,
df['Z_Tax No Not Mapped'] = np.NaN
df['Z_Tax No Not Mapped'] = np.where((df['A1'] == np.NaN) & (df['A2'] == np.NaN), 1, 0)
However the output captures all the rows as 0 under new column 'Z_Tax No Not Mapped', but the data have instances where both the columns are blank. Not sure where i'm making a mistake to filter such cases.
Note: Columns A1 and A2 are sometimes alphanumeric or just numeric
Idea is to place a category in a separate column as "IDs are not updated" or "IDs are updated", so that by placing a simple filter on "IDs are not updated" we can identify cases that are blank in both columns
Help would be much appreciated.
CodePudding user response:
Use DataFrame.isna
with DataFrame.all
for test if all columns are True
s - missing values:
df['Z_Tax No Not Mapped'] = np.where(df[['A1','A2']].isna().all(axis=1),
'Both A1 and A2 are blank',
'')
CodePudding user response:
df.loc[df.isna().all(axis=1), "Z_Tax No Not Mapped"] = "Both A1 and A2 are blank"