I have a dataset with the following setup:
id string_to_parse
1 "a","b"
2 "a,b","c"
3 "c"
I need to get it into this form:
id string_to_parse
1 a
1 b
2 a,b
2 c
3 c
I tried this:
exploded_ = df['string_to_parse'].map(lambda x:x\
.replace('"','')\
.split(",")).explode()
Besides being very slow, it also fails to keep "a,b" together: it splits on the comma inside the quotes as well.
CodePudding user response:
Use Series.str.strip with Series.str.split, and finally DataFrame.explode:
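# Strip the outer quotes, then split on the '","' delimiter between items
df['string_to_parse'] = df['string_to_parse'].str.strip('"').str.split('","')
# Expand each list into one row per item
df = df.explode('string_to_parse')
print(df)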
id string_to_parse
0 1 a
0 1 b
1 2 a,b
1 2 c
2 3 c
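
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same approach. The sample frame is rebuilt from the question on the assumption that the double quotes are literally part of each string value; the names parts and result are only illustrative. It also passes ignore_index=True (available in pandas 1.1 and later) so the repeated index labels seen in the output above are renumbered.

```python
import pandas as pd

# Sample data rebuilt from the question (assumption: the double quotes
# are part of the stored string values).
df = pd.DataFrame({
    "id": [1, 2, 3],
    "string_to_parse": ['"a","b"', '"a,b","c"', '"c"'],
})

# Strip the outer quotes, then split on the full '","' delimiter, so a
# comma inside a quoted item (as in "a,b") is left untouched.
parts = df["string_to_parse"].str.strip('"').str.split('","')

# One row per list item; ignore_index=True renumbers the rows 0..n-1.
result = df.assign(string_to_parse=parts).explode(
    "string_to_parse", ignore_index=True
)
print(result)
```

Splitting on the three-character delimiter '","' rather than on a bare comma is what keeps a,b in one piece. This relies on every item being wrapped in double quotes; if the data can contain escaped quotes or unquoted items, a real CSV parser would probably be safer.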