One-Hot Encode with Numerical Data in Pandas

Time:09-24

Is it possible to one-hot encode a pandas DataFrame by its numerical values? It seems pd.get_dummies() only encodes string data by default.

For example, I'm hoping to do this:

>>> import pandas as pd
>>> df = pd.DataFrame({'A': [0, 1, 0, 2], 'B': [0, 0, 1, 0]})
>>> df
   A  B
0  0  0
1  1  0
2  0  1
3  2  0

>>> df_oh = SomeFunction(df)  # Does SomeFunction() exist?
>>> df_oh
   A_0 A_1 A_2 B_0 B_1
0  1   0   0   1   0
1  0   1   0   1   0
2  1   0   0   0   1
3  0   0   1   1   0

Thanks!

CodePudding user response:

You can use pd.get_dummies() after converting df to string dtypes with DataFrame.astype(), as follows:

pd.get_dummies(df.astype(str))

Result:

   A_0  A_1  A_2  B_0  B_1
0    1    0    0    1    0
1    0    1    0    1    0
2    1    0    0    0    1
3    0    0    1    1    0
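
As a complement to the one-liner above, here is a minimal sketch that runs the same approach end to end and compares it with a possible alternative: passing the column names through the `columns` parameter of `pd.get_dummies()`, which should encode those columns regardless of their dtype. The variable names `oh_str` and `oh_cols` are made up for illustration. `dtype=int` is added as an assumption about the desired output, since recent pandas versions return boolean dummies by default and would otherwise print True/False instead of 1/0.

```python
import pandas as pd

df = pd.DataFrame({'A': [0, 1, 0, 2], 'B': [0, 0, 1, 0]})

# Approach from the answer: cast every column to str so that
# get_dummies() treats all of them as categorical.
# dtype=int keeps the dummies as 0/1 integers on pandas versions
# whose default dummy dtype is bool.
oh_str = pd.get_dummies(df.astype(str), dtype=int)

# Alternative sketch: leave the data numeric and name the columns
# to encode explicitly via the `columns` parameter.
oh_cols = pd.get_dummies(df, columns=['A', 'B'], dtype=int)

print(oh_str)
print(oh_cols)
```

One thing to watch with the string cast: categories are then ordered as strings, so with multi-digit values a column for '10' would sort before the one for '2'. Naming the columns explicitly keeps the numeric ordering.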