Is it possible to one-hot encode a pandas DataFrame on its numerical values? It seems get_dummies()
only works for string data.
For example, I'm hoping to do this:
>>> df = pd.DataFrame({'A': [0, 1, 0, 2], 'B': [0, 0, 1, 0]})
>>> df
   A  B
0  0  0
1  1  0
2  0  1
3  2  0
>>> df_oh = SomeFunction(df) # Does SomeFunction() exist?
>>> df_oh
   A_0  A_1  A_2  B_0  B_1
0    1    0    0    1    0
1    0    1    0    1    0
2    1    0    0    0    1
3    0    0    1    1    0
Thanks!
Answer:
You can use pd.get_dummies() after converting df to string dtype with DataFrame.astype(), as follows:
pd.get_dummies(df.astype(str))
Result:
   A_0  A_1  A_2  B_0  B_1
0    1    0    0    1    0
1    0    1    0    1    0
2    1    0    0    0    1
3    0    0    1    1    0
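
If you would rather skip the string conversion, a possible alternative is to name the columns explicitly: get_dummies() leaves numeric columns alone by default, but it should encode any column listed in its columns argument regardless of dtype. This is only a sketch built on the example frame from the question. The dtype=int argument is my own addition, on the assumption that you want 1/0 output, since newer pandas versions return booleans by default.

```python
import pandas as pd

# Example frame from the question.
df = pd.DataFrame({'A': [0, 1, 0, 2], 'B': [0, 0, 1, 0]})

# Listing the columns explicitly asks get_dummies() to encode them
# even though their dtype is numeric. dtype=int is an assumption here:
# it requests integer 1/0 indicators instead of booleans.
df_oh = pd.get_dummies(df, columns=['A', 'B'], dtype=int)
print(df_oh)
```

With this approach the original integer dtype of df is untouched, and you can encode only a subset of the columns by shortening the list.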