I want to generate a column 'd' that contains, for each row, the names of the columns whose value is equal to 1.
df=pd.DataFrame({'a':[1,0,1],'b':[1,1,np.nan],'c':[1,1,1]})
Expected output:
df=pd.DataFrame({'a':[1,0,1],'b':[1,1,np.nan],'c':[1,1,1],'d':['a;b;c','b;c','a;c']})
CodePudding user response:
You can try DataFrame.apply on rows:
df['d'] = df.apply(lambda row: ';'.join(row.index[row.eq(1)]), axis=1)
print(df)
   a    b  c      d
0  1  1.0  1  a;b;c
1  0  1.0  1    b;c
2  1  NaN  1    a;c
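For reference, here is the same approach as a self-contained sketch that can be pasted and run as-is. The helper name `ones_columns` is only illustrative and not part of pandas. The NaN in column 'b' is skipped because a NaN never compares equal to 1.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 0, 1], 'b': [1, 1, np.nan], 'c': [1, 1, 1]})

def ones_columns(row):
    # Hypothetical helper: row.eq(1) is a boolean Series indexed by column name.
    # NaN compares as False, so missing values are left out.
    return ';'.join(row.index[row.eq(1)])

# axis=1 passes each row to the helper as a Series
df['d'] = df.apply(ones_columns, axis=1)
print(df['d'].tolist())  # ['a;b;c', 'b;c', 'a;c']
```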
Or mask a DataFrame built from the column headers:
df['d'] = (pd.DataFrame([df.columns]*len(df), columns=df.columns)[df.eq(1)]
           .apply(lambda row: ';'.join(row.dropna()), axis=1))
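To see what the masking step does, it may help to split it into intermediate results. This is only a sketch: the names `headers`, `masked` and `d` are illustrative, and passing `index=df.index` is an addition of mine so the row labels still line up if `df` does not use the default 0..n-1 index.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [1, 0, 1], 'b': [1, 1, np.nan], 'c': [1, 1, 1]})

# One copy of the column labels per row, with the same shape and labels as df
headers = pd.DataFrame([df.columns.tolist()] * len(df),
                       index=df.index, columns=df.columns)

# Keep a label only where df equals 1; every other cell becomes NaN
masked = headers[df.eq(1)]

# Drop the NaN cells in each row and join the remaining labels
d = masked.apply(lambda row: ';'.join(row.dropna()), axis=1)
print(d.tolist())  # ['a;b;c', 'b;c', 'a;c']
```

Both versions still go through a row-wise apply, so on a large frame neither is likely to be much faster than the other.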