Home > Blockchain >  Splitting a column into two in dataframe
Splitting a column into two in dataframe

Time:10-01

It's solution is definitely out there but I couldn't find it. So posting it here.
I've a dataframe which is like

  object_Id object_detail
0     obj00       red mug
1     obj01      red bowl
2     obj02     green mug
3     obj03     white candle holder

I want to split object_details into two column name, object_color based on a list that contains color name

COLOR = ['red', 'green', 'blue', 'white']
print(df)
# want to perform some operation so that It'll get output 
  object_Id object_detail           object_color          name
0     obj00       red mug              red                 mug
1     obj01      red bowl              red                bowl
2     obj02     green mug              green               mug
3     obj03     white candle holder    white     candle holder

This is me first time using dataframe so not sure how to achieve it using pandas. I can achieve it by converting it into list and then apply filter. But I think easier ways is out there that I might miss. Thanks

CodePudding user response:

Use Series.str.extract with joined values of list by | for regex OR and then all another values in new column splitted by space:

pat = "|".join(COLOR)
df[['object_color','name']] = df['object_detail'].str.extract(f'({pat})\s (.*)',expand=True)
print (df)
  object_Id        object_detail object_color           name
0     obj00    Barbie Pink frock  Barbie Pink          frock
1     obj01             red bowl          red           bowl
2     obj02            green mug        green            mug
3     obj03  white candle holder        white  candle holder
  • Related