Create a column that contains a list of nan columns in the row

Time:10-29

Given the following DataFrame:

ID | Y1  | Y2  | Y3  |
---|-----|-----|-----|
0  | NaN | 8   | 4   |
1  | NaN | NaN | 1   |
2  | NaN | NaN | NaN |
3  | 5   | 3   | NaN |

I would like to create an extra column listing the names of all columns that are NaN in that row, like this:

ID | Y1  | Y2  | Y3  | Result           |
---|-----|-----|-----|------------------|
0  | NaN | 8   | 4   | ['Y1']           |
1  | NaN | NaN | 1   | ['Y1','Y2']      |
2  | NaN | NaN | NaN | ['Y1','Y2','Y3'] |
3  | 5   | 3   | NaN | ['Y3']           |

CodePudding user response:

We can use dot to concatenate the names of the NaN columns in each row, then split the resulting string into a list:

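s = df.filter(like='Y')  # keep only the Y1..Y3 columns
# dot() of the boolean NaN mask with 'Y1,', 'Y2,', ... joins the names of the NaN columns
df['new'] = s.isna().dot(s.columns + ',').str[:-1].str.split(',')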
df
Out[81]: 
   ID   Y1   Y2   Y3           new
0   0  NaN  8.0  4.0          [Y1]
1   1  NaN  NaN  1.0      [Y1, Y2]
2   2  NaN  NaN  NaN  [Y1, Y2, Y3]
3   3  5.0  3.0  NaN          [Y3]
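
One thing to watch with dot then split: for a row that has no NaN at all, the joined string is empty, and splitting an empty string gives [''] rather than an empty list. If a true [] is needed, a possible alternative is to index the column names with the boolean mask row by row. This is only a sketch, not the method from the answer above; the extra row with ID 4 is a hypothetical addition to show the no-NaN case, and the column name Result is taken from the question.

```python
import numpy as np
import pandas as pd

# Sample frame from the question, plus a hypothetical row (ID 4) with no NaN
df = pd.DataFrame({
    "ID": [0, 1, 2, 3, 4],
    "Y1": [np.nan, np.nan, np.nan, 5, 1],
    "Y2": [8, np.nan, np.nan, 3, 2],
    "Y3": [4, 1, np.nan, np.nan, 3],
})

s = df.filter(like="Y")        # the Y1..Y3 columns only
mask = s.isna().to_numpy()     # 2-D boolean array, True where the value is NaN
cols = s.columns.to_numpy()    # array of the column names

# Boolean-index the column names with each row of the mask;
# a row with no NaN selects nothing and becomes an empty list
df["Result"] = [cols[row].tolist() for row in mask]

print(df)
```

The list comprehension loops over rows in Python, so it trades some speed for a result that needs no string clean-up afterwards.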