Home > Enterprise >  Pandas: pivot comma delimited column into multiple columns
Pandas: pivot comma delimited column into multiple columns

Time:03-24

I have the following Pandas DataFrame:

import pandas as pd
import numpy as np

df = pd.DataFrame({'id': [1, 2, 3, 4], 'type': ['a,b,c,d', 'b,d', 'c,e', np.nan]})

enter image description here

I need to split the type column based on the commma delimiter and pivot the values into multiple columns to get this

enter image description here

I looked at Pandas documentation for pivot() and also searched stackoverflow. I did not find anything that seems to achieve (directly or indirectly) what I need to do here. Any suggestions?

CodePudding user response:

You could use str.get_dummies to get the dummy variables; then join back to df:

out = df[['id']].join(df['type'].str.get_dummies(sep=',').add_prefix('type_').replace(0, float('nan')))

Output:

   id  type_a  type_b  type_c  type_d  type_e
0   1     1.0     1.0     1.0     1.0     NaN
1   2     NaN     1.0     NaN     1.0     NaN
2   3     NaN     NaN     1.0     NaN     1.0
3   4     NaN     NaN     NaN     NaN     NaN
  • Related