Home > database >  Category Condition on multiple columns
Category Condition on multiple columns

Time:06-01

I have a data set as below

ID  A1      A2
0   A123     1234
1   1234     5568
2   5568     NaN
3   Zabc     NaN
4   3456     3456
5   3456     3456
6   NaN    NaN
7   NaN    NaN

Intention is to go through each column (A1 and A2), identify where both the columns are blank as in row 6 and 7, create a new column and categorise as "Both A1 and A2 are blank"

I used the below code,

df['Z_Tax No Not Mapped'] = np.NaN

df['Z_Tax No Not Mapped'] = np.where((df['A1'] == np.NaN) & (df['A2'] == np.NaN), 1, 0)

However the output captures all the rows as 0 under new column 'Z_Tax No Not Mapped', but the data have instances where both the columns are blank. Not sure where i'm making a mistake to filter such cases.

Note: Columns A1 and A2 are sometimes alphanumeric or just numeric

Idea is to place a category in a separate column as "IDs are not updated" or "IDs are updated", so that by placing a simple filter on "IDs are not updated" we can identify cases that are blank in both columns

Help would be much appreciated.

CodePudding user response:

Use DataFrame.isna with DataFrame.all for test if all columns are Trues - missing values:

df['Z_Tax No Not Mapped'] = np.where(df[['A1','A2']].isna().all(axis=1),
                                     'Both A1 and A2 are blank', 
                                      '')

CodePudding user response:

df.loc[df.isna().all(axis=1), "Z_Tax No Not Mapped"] = "Both A1 and A2 are blank"
  • Related