I have a DataFrame like the one below (original data):
index  string
0      a,b,c,d,e,f
1      a,b,c,d,e,f
2      a,(I,j,k),c,d,e,f
I want it to look like this (target data):
index  col1  col2     col3  col4  col5  col6
0      a     b        c     d     e     f
1      a     b        c     d     e     f
2      a     (I,j,k)  c     d     e     f
CodePudding user response:
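You can split on commas that are not inside parentheses: the negative lookahead (?![^()]*\)) rejects any comma that is followed by a closing parenthesis with no other parenthesis in between. Then turn the result into a DataFrame and assign it to new columns of df: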
df[['col {}'.format(i) for i in range(1, 7)]] = df['string'].str.split(r",\s*(?![^()]*\))", expand=True)
Output:
   index             string col 1    col 2 col 3 col 4 col 5 col 6
0      0        a,b,c,d,e,f     a        b     c     d     e     f
1      1        a,b,c,d,e,f     a        b     c     d     e     f
2      2  a,(I,j,k),c,d,e,f     a  (I,j,k)     c     d     e     f
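To see what the pattern does on its own, here is a minimal sketch using only the standard re module. The helper name split_outside_parens is made up for illustration and is not part of pandas or of the answer above.

```python
import re

# Comma, optional whitespace, then a negative lookahead that rejects the match
# when a ")" follows with no "(" or ")" in between (i.e. the comma is inside a group).
PATTERN = re.compile(r",\s*(?![^()]*\))")

def split_outside_parens(text):
    """Hypothetical helper: split on commas not enclosed in one level of parentheses."""
    return PATTERN.split(text)

parts_plain = split_outside_parens("a,b,c,d,e,f")
parts_group = split_outside_parens("a,(I,j,k),c,d,e,f")
print(parts_plain)  # ['a', 'b', 'c', 'd', 'e', 'f']
print(parts_group)  # ['a', '(I,j,k)', 'c', 'd', 'e', 'f']
```

Note that the lookahead only looks as far as the next parenthesis, so it should be treated as a one-level solution: nested parentheses such as a,(b,(c,d),e),f will probably need a real parser instead.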
CodePudding user response:
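Try this, which uses expand=True to get a DataFrame directly and then names the columns (note that it overwrites df, so the original string column is dropped):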
df = df['string'].str.split(r",\s*(?![^()]*\))", expand=True)
df.columns = ['col1','col2','col3','col4','col5','col6']
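For anyone who wants to run this end to end, below is a self-contained sketch that rebuilds the sample data from the question and keeps the original column as well. The variable names parts and result, and the use of df.join, are my own choices for illustration rather than something either answer prescribes.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"string": ["a,b,c,d,e,f", "a,b,c,d,e,f", "a,(I,j,k),c,d,e,f"]})

# Split on commas that are not followed by ")" before any other parenthesis.
pattern = r",\s*(?![^()]*\))"
parts = df["string"].str.split(pattern, expand=True)

# Name the columns col1..colN based on how many pieces the split produced.
parts.columns = ["col{}".format(i) for i in range(1, parts.shape[1] + 1)]

# Keep the original column next to the new ones instead of overwriting df.
result = df.join(parts)
print(result)
```

If the rows do not all split into the same number of pieces, expand=True pads the shorter rows with missing values, so it is worth checking parts.shape before hard-coding six column names.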