I have the following pandas dataframe:
A B C
0 1.0 NaN 2.0
1 NaN 1.0 4.0
2 7.0 1.0 2.0
I know I can get, for each row, the name of the first non-blank column with this script:
df['first'] = df.dropna(how='all').notna().idxmax(axis=1).astype('string')
but how can I get the name of the second non blank column? This is the expected output:
A B C first second
0 1.0 NaN 2.0 A C
1 NaN 1.0 4.0 B C
2 7.0 1.0 2.0 A B
Thanks
CodePudding user response:
You can drop the NaNs with apply:
df[['first', 'second']] = df.apply(lambda x: pd.Series(x.dropna().index), axis=1)
Output:
A B C first second
0 1.0 NaN 2.0 A C
1 NaN 1.0 4.0 B C
2 7.0 1.0 NaN A B