I want to convert all boolean columns in my pandas dataframe into 0 and 1 by using pd.get_dummies. However, the boolean values stay the same after the get_dummies function.
For example:
import pandas as pd

tmp = pd.DataFrame([
    ['green', True],
    ['red', False],
    ['blue', True]])
tmp.columns = ['color', 'class']
pd.get_dummies(tmp)
# I have also tried pd.get_dummies(tmp, dtype=int), but got the same output
I got:
   class  color_blue  color_green  color_red
0   True           0            1          0
1  False           0            0          1
2   True           1            0          0
but I need:
   color_blue  color_green  color_red  class_True  class_False
0           0            1          0           1            0
1           0            0          1           0            1
2           1            0          0           1            0
Update: My dataframe also includes numeric data, so converting all columns in the dataframe to strings may not be the best solution.
CodePudding user response:
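get_dummies only encodes object, string, and category columns by default, so boolean columns are passed through unchanged. To encode them, convert them to strings first (here all columns are converted):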
print (pd.get_dummies(tmp.astype(str)))
   color_blue  color_green  color_red  class_False  class_True
0           0            1          0            0           1
1           0            0          1            1           0
2           1            0          0            0           1
Or convert only the boolean column:
print (pd.get_dummies(tmp.astype({'class':'str'})))
   color_blue  color_green  color_red  class_False  class_True
0           0            1          0            0           1
1           0            0          1            1           0
2           1            0          0            0           1
If you have several boolean columns, you can build the dictionary from the boolean columns only, so numeric columns are left untouched:
d = dict.fromkeys(tmp.select_dtypes('bool').columns, 'str')
print (pd.get_dummies(tmp.astype(d)))
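To illustrate the numeric-data case from the update, here is a minimal sketch of the same dictionary approach on a frame with an extra numeric column. The `size` column and its values are hypothetical, added only for illustration. Passing dtype=int is an assumption about what you want: pandas 2.0 and later return True/False dummy columns by default, so it is needed there to get 0 and 1.

```python
import pandas as pd

# Same data as in the question, plus a hypothetical numeric column "size"
tmp = pd.DataFrame({
    "color": ["green", "red", "blue"],
    "class": [True, False, True],
    "size": [1.5, 2.0, 3.5],
})

# Cast only the boolean columns to strings so get_dummies will encode them
d = dict.fromkeys(tmp.select_dtypes("bool").columns, "str")

# dtype=int forces 0/1 dummy columns regardless of the pandas default
result = pd.get_dummies(tmp.astype(d), dtype=int)
print(result)
```

The numeric column keeps its float values, and only `color` and `class` are expanded into dummy columns.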