Convert column value to column name in pandas

Input:

col a
[test1,test2,test3]
[test3,test4,test5]

Desired output:

col a	a_test1	a_test2	a_test3	a_test4	a_test5
[test1,test2,test3]	1	1	1	0	0
[test3,test4,test5]	0	0	1	1	1

CodePudding user response:

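If col a contains lists of strings, you can explode the lists into one row per element, cross-tabulate the original row index against the values, and merge the result back onto df: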

tmp = df.explode("col a").reset_index()
x = df.merge(
    pd.crosstab(tmp["index"], tmp["col a"]).add_prefix("a_"),
    left_index=True,
    right_index=True,
)
print(x)

Prints:

                   col a  a_test1  a_test2  a_test3  a_test4  a_test5
0  [test1, test2, test3]        1        1        1        0        0
1  [test3, test4, test5]        0        0        1        1        1
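
One caveat with the crosstab approach: pd.crosstab counts occurrences, so a value repeated inside one list would give 2 rather than 1. The following is a minimal, self-contained sketch of one way to handle that, assuming list-valued cells. The sample frame (including the repeated test3) and the names counts, flags and result are illustrative and not part of the question.

```python
import pandas as pd

# Sample data modeled on the question. The repeated "test3" in the second
# row is an added assumption, used only to show counts versus 0/1 flags.
df = pd.DataFrame(
    {"col a": [["test1", "test2", "test3"], ["test3", "test4", "test5", "test3"]]}
)

# One row per list element; the original row label ends up in "index".
tmp = df.explode("col a").reset_index()

# Count how often each value occurs per original row.
counts = pd.crosstab(tmp["index"], tmp["col a"]).add_prefix("a_")

# Cap the counts at 1 so every column is a plain indicator.
flags = counts.clip(upper=1)

# Attach the indicator columns to the original frame by row label.
result = df.join(flags)
print(result)
```

Rows whose list is empty do not appear in the crosstab, so after the join their indicator columns would be NaN; calling fillna(0) on the new columns is one possible follow-up.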

CodePudding user response:

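Assuming the cells in col a are actually lists of strings, you can join each list into one delimited string and let str.get_dummies split it back into 0/1 columns. The prefix is taken from the last word of the column name (a for col a):

dummies = [df[col].str.join(';').str.get_dummies(';').add_prefix(col.split()[-1] + '_') for col in df.columns]
new_df = pd.concat([df, *dummies], axis=1)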

Output:

>>> new_df
                   col a  a_test1  a_test2  a_test3  a_test4  a_test5
0  [test1, test2, test3]        1        1        1        0        0
1  [test3, test4, test5]        0        0        1        1        1
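
Both answers assume the cells hold real Python lists. If they are instead plain strings such as [test1,test2,test3], which is one way to read the question's sample, a sketch along these lines may work. The sample frame and the names stripped, dummies and as_lists are assumptions made for illustration.

```python
import pandas as pd

# Hypothetical input: each cell is a string, not a list.
df = pd.DataFrame({"col a": ["[test1,test2,test3]", "[test3,test4,test5]"]})

# Drop the surrounding brackets so only the comma-separated values remain.
stripped = df["col a"].str.strip("[]")

# Split on commas and build one 0/1 column per distinct value.
# Note: spaces after the commas would end up inside the column names.
dummies = stripped.str.get_dummies(sep=",").add_prefix("a_")

new_df = pd.concat([df, dummies], axis=1)
print(new_df)

# If real lists are needed (for example for explode), split the strings.
as_lists = stripped.str.split(",")
```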