I have a DataFrame with many dummy variables. Instead of keeping a separate column for each dummy, I want a single column in which each row holds a list of the names of the dummy variables that equal 1 in that row.
Input:
index a b c
0 1 1 1
1 0 0 1
Desired output:
index dummies
0 ['a','b','c']
1 ['c']
CodePudding user response:
You can build the list row by row with apply:
dummies = df.apply(lambda row: [col for col in df.columns if row[col] == 1], axis=1)
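For anyone who wants to try this directly, here is a minimal self-contained sketch on the sample data from the question. The DataFrame construction is my assumption about how the data is stored (integer 0/1 columns named a, b, c).

```python
import pandas as pd

# Sample data from the question (assumed to be integer 0/1 columns).
df = pd.DataFrame({"a": [1, 0], "b": [1, 0], "c": [1, 1]})

# For each row, collect the column labels whose value equals 1.
dummies = df.apply(
    lambda row: [col for col in df.columns if row[col] == 1],
    axis=1,
)

print(dummies)
```

This runs a Python-level function once per row, so it is easy to read but may be slow on large frames.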
CodePudding user response:
You can stack and use groupby:
df.where(df.eq(1)).stack().dropna().reset_index(level=1).groupby(level=0)['level_1'].agg(list)
Output:
0 [a, b, c]
1 [c]
Name: level_1, dtype: object
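One thing to watch with the stack approach: a row that has no 1 at all produces no stacked entries, so it disappears from the result. Below is a hedged sketch of the same chain, step by step, with a third all-zero row added (my addition, not in the question) to show this. The explicit dropna() is there so the result does not depend on whether stack() drops NaN by itself, which as far as I know differs between pandas versions. The final reindex is one possible way to bring the missing rows back.

```python
import pandas as pd

# Sample data plus one hypothetical all-zero row (index 2).
df = pd.DataFrame({"a": [1, 0, 0], "b": [1, 0, 0], "c": [1, 1, 0]})

dummies = (
    df.where(df.eq(1))        # keep the 1s, everything else becomes NaN
    .stack()                  # long format indexed by (row, column)
    .dropna()                 # remove NaN cells explicitly
    .reset_index(level=1)     # column labels move into a column named "level_1"
    .groupby(level=0)["level_1"]
    .agg(list)                # one list of column labels per original row
    .reindex(df.index)        # rows without any 1 come back as NaN
)

print(dummies)
```

If you would rather have an empty list than NaN for those rows, you need to fill them in a separate step after the reindex.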