How to extract specific string data from a pandas dataframe

Time:02-11

I have a dataframe and need to extract the package info (ML, KG, PZA, LT, UN, etc.) from the Description column. I'm pretty new to pandas. This is the dataframe right now:

SKU Description
1 TRIDENT 6S SANDIA 9GR
2 CANAST RABBIT F1 A 1UN
3 HAND SOAP VITAMIN E 442 ML.

I need to extract 9GR, 1UN, 442 ML, etc. and put them into another column. Any ideas? I'd really appreciate the help. Greetings

CodePudding user response:

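You can use str.extract with a regex built from the list of units (one or more digits, optional whitespace, then one of the units):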

pkg = ['ML', 'KG', 'PZA', 'LT', 'UN', 'GR']

df['package'] = df['Description'].str.extract(fr"\b(\d+\s*(?:{'|'.join(pkg)}))\b")
print(df)

# Output
   SKU                  Description package
0    1        TRIDENT 6S SANDIA 9GR     9GR
1    2       CANAST RABBIT F1 A 1UN     1UN
2    3  HAND SOAP VITAMIN E 442 ML.  442 ML
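
If you also want the number and the unit in separate columns, here is a minimal self-contained sketch using named groups. The column names amount and unit are my own choice, not something from the question, and the sample frame is just rebuilt from the three rows shown above. Rows with no match would end up as NaN.

```python
import re
import pandas as pd

# Sample data rebuilt from the rows in the question
df = pd.DataFrame({
    "SKU": [1, 2, 3],
    "Description": [
        "TRIDENT 6S SANDIA 9GR",
        "CANAST RABBIT F1 A 1UN",
        "HAND SOAP VITAMIN E 442 ML.",
    ],
})

pkg = ["ML", "KG", "PZA", "LT", "UN", "GR"]
# Escape each unit so it is matched literally, then join into an alternation
units = "|".join(re.escape(u) for u in pkg)

# Named groups become the column names of the frame returned by str.extract
pattern = rf"\b(?P<amount>\d+)\s*(?P<unit>{units})\b"
parts = df["Description"].str.extract(pattern)

# "amount" and "unit" are hypothetical column names chosen for this example
df["amount"] = pd.to_numeric(parts["amount"])
df["unit"] = parts["unit"]
print(df)
```

Keeping the unit separate makes it easier to group or filter by package type later, and you can add more units to pkg without touching the pattern.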