I have this df:
import pandas as pd

data = {'book': [True, False, False, False, False],
        'apple': [False, False, True, False, False],
        'cat': [False, False, False, False, True],
        'pigeon': [False, True, False, False, False],
        'shirt': [False, False, False, True, False]}
df = pd.DataFrame(data)
I want to create a new column, df['category'], whose value in each row depends on the name of the column that is True in that row.
The column names map to categories as follows:
book - stationery, apple - fruit, cat - animal, pigeon - bird, shirt - clothes
No two columns are True in the same row.
Expected output:
>>> df
    book  apple    cat  pigeon  shirt    category
0   True  False  False   False  False  stationery
1  False  False  False    True  False        bird
2  False   True  False   False  False       fruit
3  False  False  False   False   True     clothes
4  False  False   True   False  False      animal
CodePudding user response:
Simple: use idxmax along axis=1 to get the name of the column that holds True in each row, then map that name to the corresponding category:
d = {'book': 'stationery', 'pigeon': 'bird',
'apple': 'fruit', 'shirt': 'clothes', 'cat': 'animal'}
df['category'] = df.idxmax(1).map(d)
    book  apple    cat  pigeon  shirt    category
0   True  False  False   False  False  stationery
1  False  False  False    True  False        bird
2  False   True  False   False  False       fruit
3  False  False  False   False   True     clothes
4  False  False   True   False  False      animal
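One thing to watch for: idxmax returns the first column label when a row has no True at all, so such a row would silently be labelled 'stationery'. The question says every row has exactly one True, so this does not arise here. As a minimal sketch for data where it might, the version below masks those rows first so they end up as NaN. The names flag_cols and has_true are my own and not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    'book':   [True, False, False, False, False],
    'apple':  [False, False, True, False, False],
    'cat':    [False, False, False, False, True],
    'pigeon': [False, True, False, False, False],
    'shirt':  [False, False, False, True, False],
})

d = {'book': 'stationery', 'pigeon': 'bird',
     'apple': 'fruit', 'shirt': 'clothes', 'cat': 'animal'}

# Restrict to the boolean flag columns so the code can be re-run
# after 'category' has already been added (hypothetical helper name).
flag_cols = list(d)

# True for rows that contain at least one True flag.
has_true = df[flag_cols].any(axis=1)

# idxmax gives the first column holding the row maximum (True);
# where() blanks out rows with no True before mapping to a category.
df['category'] = df[flag_cols].idxmax(axis=1).where(has_true).map(d)

print(df)
```

Selecting df[flag_cols] instead of the whole frame also keeps idxmax away from the string column once 'category' exists.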