Create new columns from categorical variables

Time:10-30

ID  column_factors  column1  column2
0   fact1           d        w
1   fact1, fact2    a        x
2   fact3           b        y
3   fact1,fact4     c        z

I have a table in a pandas DataFrame. I would like to remove the column "column_factors" and create new columns called "fact1", "fact2", "fact3", and "fact4", filled with dummy (0/1) values as shown below. Thanks in advance.

ID  fact1  fact2  fact3  fact4  column1  column2
0   1      0      0      0      d        w
1   1      1      0      0      a        x
2   0      0      1      0      b        y
3   1      0      0      1      c        z

CodePudding user response:

Use Series.str.get_dummies, which splits each string on a separator and returns one 0/1 indicator column per distinct value:

https://pandas.pydata.org/docs/reference/api/pandas.Series.str.get_dummies.html#pandas.Series.str.get_dummies

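# Normalize ", " to "," first; otherwise "fact1, fact2" yields a separate " fact2" column
factors = df['column_factors'].str.replace(r'\s*,\s*', ',', regex=True)
dummy_cols = factors.str.get_dummies(sep=',')
df = df.join(dummy_cols).drop(columns='column_factors')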
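
For reference, here is a self-contained sketch of the same approach, using sample data reconstructed from the question. Two steps are assumptions added for illustration and are not part of the original answer: stripping whitespace around the commas (the sample mixes "fact1, fact2" and "fact1,fact4"), and using pd.concat to put the indicator columns between ID and column1, as in the desired output. The name `result` is only illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question; note the inconsistent
# spacing after the comma ("fact1, fact2" vs "fact1,fact4").
df = pd.DataFrame({
    "ID": [0, 1, 2, 3],
    "column_factors": ["fact1", "fact1, fact2", "fact3", "fact1,fact4"],
    "column1": ["d", "a", "b", "c"],
    "column2": ["w", "x", "y", "z"],
})

# Strip whitespace around the commas so " fact2" and "fact2" are not
# treated as two different categories.
factors = df["column_factors"].str.replace(r"\s*,\s*", ",", regex=True)

# One 0/1 indicator column per distinct factor.
dummy_cols = factors.str.get_dummies(sep=",")

# Rebuild the frame in the column order shown in the question:
# ID, the indicator columns, then the remaining columns.
result = pd.concat([df[["ID"]], dummy_cols, df[["column1", "column2"]]], axis=1)
print(result)
```

If the column order does not matter, the join/drop version above is shorter; join simply appends the indicator columns at the end of the frame.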