Convert single column with commas to multiple columns [duplicate]

Time:10-09

I have a dataframe:

import numpy as np
import pandas as pd

df = pd.DataFrame(np.array([['bob, sam, manny'], ['bob (a description, of some sort), marry, rob']]), columns=['target'])

                                          target
0                                bob, sam, manny
1  bob (a description, of some sort), marry, rob

I want to convert column target to multiple columns using the comma as the separator. I want it to look like this:

                                          target                                  a      b      c
0                                bob, sam, manny                                bob    sam  manny
1  bob (a description, of some sort), marry, rob  bob (a description, of some sort)  marry    rob

So far, I was able to do this:

df[["a", "b", "c", "d"]] = df["target"].str.split(pat=",", expand=True)

                                          target                   a               b       c     d
0                                bob, sam, manny                 bob             sam   manny  None
1  bob (a description, of some sort), marry, rob  bob (a description   of some sort)   marry   rob

But this treats the comma inside the parentheses as a separator. How do I ignore commas inside parentheses?

CodePudding user response:

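You can split on a regex that matches a comma (plus any whitespace after it) only when the comma is outside parentheses. The negative lookahead (?![^()]*\)) rejects a comma whenever the next parenthesis after it is a closing one, which means the comma sits inside a pair: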

df = pd.DataFrame(np.array([['bob, sam, manny'], ['bob (a description, of some sort), marry, rob']]), columns=['target'])
df[["a", "b", "c"]] = df["target"].str.split(r'\,\s*(?![^()]*\))', expand=True)

Output:

                                          target                                  a      b      c
0                                bob, sam, manny                                bob    sam  manny
1  bob (a description, of some sort), marry, rob  bob (a description, of some sort)  marry    rob
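
To see what the pattern does without pandas in the way, here is a small sketch using only the standard re module. It also shows a limitation: the lookahead only checks the next parenthesis, so it can split in the wrong place when parentheses are nested. The helper split_outside_parens is a hypothetical name for illustration, not a pandas or re function; it swaps the regex for a plain depth counter and is only one possible fallback.

```python
import re

# Same idea as the answer's pattern: a comma, optional whitespace, and a
# negative lookahead that fails if the next parenthesis is a closing one.
PATTERN = re.compile(r",\s*(?![^()]*\))")

flat = "bob (a description, of some sort), marry, rob"
print(PATTERN.split(flat))
# ['bob (a description, of some sort)', 'marry', 'rob']

# With nested parentheses the lookahead is fooled: after the first comma
# the next parenthesis is "(", so the comma is treated as a separator.
nested = "a (b, (c), d), e"
print(PATTERN.split(nested))
# ['a (b', '(c), d)', 'e']


def split_outside_parens(text):
    """Hypothetical helper: split on commas at parenthesis depth 0."""
    parts, current, depth = [], [], 0
    for ch in text:
        if ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)  # tolerate a stray ")"
        if ch == "," and depth == 0:
            parts.append("".join(current).strip())
            current = []
        else:
            current.append(ch)
    parts.append("".join(current).strip())
    return parts


print(split_outside_parens(nested))
# ['a (b, (c), d)', 'e']
```

If your data never nests parentheses, the regex in the answer is enough. Otherwise something like df["target"].map(split_outside_parens) should give one list per row, which you can then expand into columns.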