col a |
---|
[test1,test2,test3] |
[test3,test4,test5] |
Output:
col a | a_test1 | a_test2 | a_test3 | a_test4 | a_test5 |
---|---|---|---|---|---|
[test1,test2,test3] | 1 | 1 | 1 | 0 | 0 |
[test3,test4,test5] | 0 | 0 | 1 | 1 | 1 |
CodePudding user response:
If "col a" contains lists of strings, you can do:
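import pandas as pd

# One row per list element; reset_index() exposes the original row label as "index".
tmp = df.explode("col a").reset_index()
# Count each value per original row, prefix the new columns, and attach them to df.
x = df.merge(
    pd.crosstab(tmp["index"], tmp["col a"]).add_prefix("a_"),
    left_index=True,
    right_index=True,
)
print(x)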
Prints:
col a a_test1 a_test2 a_test3 a_test4 a_test5
0 [test1, test2, test3] 1 1 1 0 0
1 [test3, test4, test5] 0 0 1 1 1
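For a version that runs on its own, here is a minimal sketch of the same explode-and-crosstab idea. The sample frame is an assumption reconstructed from the question's tables, with each cell taken to be a Python list of strings; the names `dummies` and `result` are only illustrative.

```python
import pandas as pd

# Assumed sample data: each cell of "col a" is a list of strings.
df = pd.DataFrame(
    {"col a": [["test1", "test2", "test3"], ["test3", "test4", "test5"]]}
)

# One row per list element; reset_index() keeps the original row label
# in a column named "index".
tmp = df.explode("col a").reset_index()

# Count how often each value occurs per original row, then prefix the columns.
dummies = pd.crosstab(tmp["index"], tmp["col a"]).add_prefix("a_")

# Align the counts with the original rows by index.
result = df.merge(dummies, left_index=True, right_index=True)
print(result)
```

Note that crosstab counts occurrences, so a value repeated inside one list would show up as 2 rather than 1. If strict 0/1 indicators are needed, something like `dummies.clip(upper=1)` should handle that case.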
CodePudding user response:
Assuming the cells in "col a" are actually lists of strings, you can try something like this:
new_df = pd.concat([df, pd.concat([df[col].str.join(';').str.get_dummies(';').add_prefix(col.split(' ')[1] + '_') for col in df.columns], axis=1)], axis=1)
Output:
>>> new_df
col a a_test1 a_test2 a_test3 a_test4 a_test5
0 [test1, test2, test3] 1 1 1 0 0
1 [test3, test4, test5] 0 0 1 1 1
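The one-liner above is easier to follow when unrolled for a single column. This is a sketch under the same assumption (cells are lists of strings), with the sample frame reconstructed from the question and the prefix "a_" written out by hand; it also assumes ";" never appears inside a value, since it is used as the temporary delimiter.

```python
import pandas as pd

# Assumed sample data: each cell of "col a" is a list of strings.
df = pd.DataFrame(
    {"col a": [["test1", "test2", "test3"], ["test3", "test4", "test5"]]}
)

# Join each list into one ";"-delimited string, then let str.get_dummies
# split it back into one 0/1 indicator column per distinct value.
dummies = df["col a"].str.join(";").str.get_dummies(sep=";").add_prefix("a_")

# Place the indicator columns next to the original column.
new_df = pd.concat([df, dummies], axis=1)
print(new_df)
```

Unlike a count-based approach, str.get_dummies yields indicators, so a value repeated within one list still produces 1.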