I have 5 columns containing comma-separated string values, and I want to split each of them into the required columns. For example, I have two columns with the string entries below.
col1     col2
a,b,c,d  e,f,g,h,i
c,d      h,i
My required cols are:
cola colb colc cold cole colf colg colh coli
a    b    c    d    e    f    g    h    i
          c    d                   h    i
I can split a single column and have written code for that, but I don't know how to produce the layout above. Secondly, I want to split multiple columns in a single line of code.
df1 = df2['col1'].str.split(',', expand=True)  # This only splits one column, col1.
Thank you for your help.
CodePudding user response:
You can use str.get_dummies for each column and concatenate the results with pandas.concat:
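import pandas as pd

out = (pd
   # one 0/1 indicator column per value; the dict keys become the top column level
   .concat({k: df[k].str.get_dummies(sep=',') for k in df.columns}, axis=1)
   # multiply each indicator by its value label (1*'a' -> 'a', 0*'a' -> ''),
   # then flatten the two-level column labels into 'col1_a', 'col1_b', ...
   .pipe(lambda d: d.mul(d.columns.get_level_values(1))
                    .set_axis(map('_'.join, d.columns), axis=1)
        )
)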
output:
  col1_a col1_b col1_c col1_d col2_e col2_f col2_g col2_h col2_i
0      a      b      c      d      e      f      g      h      i
1                    c      d                           h      i
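
For anyone who wants to run this end to end, here is a minimal self-contained sketch using the two sample columns from the question. It is a variation on the code above rather than a copy of it: the indicator-times-label multiplication is swapped for numpy.where, on the assumption that this is easier to read and less dependent on how a given pandas version multiplies integers by strings. The names dummies, labels, mask and out_named are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question (only two of the five columns are shown there).
df = pd.DataFrame({
    "col1": ["a,b,c,d", "c,d"],
    "col2": ["e,f,g,h,i", "h,i"],
})

# Step 1: one 0/1 indicator column per distinct value, for every source column.
# Passing a dict to pd.concat gives two-level column labels: (source column, value).
dummies = pd.concat(
    {col: df[col].str.get_dummies(sep=",") for col in df.columns}, axis=1
)

# Step 2: put the value itself where the indicator is 1, an empty string elsewhere.
# (Assumption: numpy.where is used here instead of the d.mul(...) trick.)
labels = dummies.columns.get_level_values(1).to_numpy()
mask = dummies.to_numpy().astype(bool)
out = pd.DataFrame(
    np.where(mask, labels, ""),
    index=df.index,
    columns=["_".join(pair) for pair in dummies.columns],
)

# Optional: rename to the cola, colb, ... names requested in the question.
out_named = out.set_axis(["col" + value for value in labels], axis=1)

print(out)
print(out_named)
```

Because the dict comprehension loops over df.columns, the same code should handle all five columns without changes. Note that the cola-style names only stay unique as long as no value appears in more than one source column.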