I have the following dataframe:
id | Statement_1 | Statement_2 | Statement_3 |
---|---|---|---|
xyz | 0.0 | 1.0 | 0.0 |
abc | 1.0 | 0.0 | 0.0 |
efg | 0.0 | 0.0 | 1.0 |
I want to create a new column based on a condition: if a row has a 1.0, the new column should contain the name of the column that holds it.
More precisely, I would like to get this dataframe in return:
id | Statement_1 | Statement_2 | Statement_3 | Condition |
---|---|---|---|---|
xyz | 0.0 | 1.0 | 0.0 | Statement_2 |
abc | 1.0 | 0.0 | 0.0 | Statement_1 |
efg | 0.0 | 0.0 | 1.0 | Statement_3 |
Thanks to everyone who can help me with this!
CodePudding user response:
Use idxmax if you are sure to have at least one 1.0 per row:
df['Condition'] = df.filter(like='Statement').idxmax(axis=1)
print(df)
# Output
id Statement_1 Statement_2 Statement_3 Condition
0 xyz 0.0 1.0 0.0 Statement_2
1 abc 1.0 0.0 0.0 Statement_1
2 efg 0.0 0.0 1.0 Statement_3
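Note that idxmax still returns a label for a row that contains no 1.0 at all: every value ties at 0.0, so the first column wins. A minimal sketch of how that could be guarded against, assuming NaN is an acceptable marker for such rows (the extra `hij` row and the `naive` / `has_one` names are made up for illustration):

```python
import pandas as pd

# Hypothetical sample data: the last row ("hij") has no 1.0 at all.
df = pd.DataFrame({
    "id": ["xyz", "abc", "efg", "hij"],
    "Statement_1": [0.0, 1.0, 0.0, 0.0],
    "Statement_2": [1.0, 0.0, 0.0, 0.0],
    "Statement_3": [0.0, 0.0, 1.0, 0.0],
})

statements = df.filter(like="Statement")

# idxmax alone labels the all-zero row with the first column, "Statement_1".
naive = statements.idxmax(axis=1)

# Keep the label only where the row really contains a 1; otherwise leave NaN.
has_one = statements.eq(1).any(axis=1)
df["Condition"] = naive.where(has_one)
```

With this data, `naive` ends in `Statement_1` for the `hij` row, while `df['Condition']` holds NaN there.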
A more robust way, for cases where a row may contain zero, one, or several 1s:
df['Condition'] = df.filter(like='Statement').melt(ignore_index=False) \
.query('value == 1').groupby(level=0)['variable'] \
.apply(', '.join)
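To see how this behaves on the edge cases, here is the same chain run on made-up data where one row has two 1.0 values and another has none (the data and the `labels` name are assumptions for illustration only):

```python
import pandas as pd

# Made-up data: row 1 has two 1.0 values, row 2 has none.
df = pd.DataFrame({
    "id": ["xyz", "abc", "efg"],
    "Statement_1": [0.0, 1.0, 0.0],
    "Statement_2": [1.0, 0.0, 0.0],
    "Statement_3": [0.0, 1.0, 0.0],
})

labels = (
    df.filter(like="Statement")
      .melt(ignore_index=False)   # long format, original row index kept
      .query("value == 1")        # keep only the cells equal to 1
      .groupby(level=0)["variable"]
      .apply(", ".join)           # one comma-separated string per row
)

# Assignment aligns on the index, so rows absent from `labels` become NaN.
df["Condition"] = labels
```

Here row 1 ends up as `Statement_1, Statement_3` and row 2 as NaN, because it never survives the `query` filter.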
CodePudding user response:
If multiple 1 values are possible, use matrix multiplication with
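The rest of this answer is cut off, so the following is only a guess at the idea: a minimal sketch of the matrix-multiplication approach, assuming it means `DataFrame.dot` applied to a boolean mask and the column names. The sample data, the `", "` separator and the trailing `rstrip` are assumptions, not taken from the answer.

```python
import pandas as pd

# Made-up data: row 1 has two 1.0 values, row 2 has none.
df = pd.DataFrame({
    "id": ["xyz", "abc", "efg"],
    "Statement_1": [0.0, 1.0, 0.0],
    "Statement_2": [1.0, 0.0, 0.0],
    "Statement_3": [0.0, 1.0, 0.0],
})

statements = df.filter(like="Statement")

# True * "name, " gives "name, " and False * "name, " gives "", so the dot
# product concatenates the names of the columns that hold a 1 in each row.
joined = statements.eq(1).dot(statements.columns + ", ")

# Drop the trailing separator left by the concatenation.
df["Condition"] = joined.str.rstrip(", ")
```

Unlike the melt-based version, a row without any 1 gets an empty string here rather than NaN.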