I have a dataframe and need to extract the package info (ML, KG, PZA, LT, UN, etc.) from the Description column. I'm pretty new to pandas. This is the dataframe right now:
SKU | Description |
---|---|
1 | TRIDENT 6S SANDIA 9GR |
2 | CANAST RABBIT F1 A 1UN |
3 | HAND SOAP VITAMIN E 442 ML. |
I need to extract 9GR, 1UN, 442 ML, etc. and put the result into another column. Any ideas? I would really appreciate the help. Greetings
CodePudding user response:
You can use this regex:
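# Units to look for; extend this list as needed
pkg = ['ML', 'KG', 'PZA', 'LT', 'UN', 'GR']
# One or more digits (\d+), optional whitespace, then one of the units
df['package'] = df['Description'].str.extract(fr"\b(\d+\s*(?:{'|'.join(pkg)}))\b")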
print(df)
# Output
   SKU                  Description  package
0    1        TRIDENT 6S SANDIA 9GR      9GR
1    2       CANAST RABBIT F1 A 1UN      1UN
2    3  HAND SOAP VITAMIN E 442 ML.   442 ML
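If the quantity and the unit are needed separately, the same idea can be sketched with named groups. This is only an illustration built on the sample rows from the question: the column names `qty` and `unit`, the decimal support, and the case-insensitive matching are assumptions, not something the original data requires.

```python
import re
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "SKU": [1, 2, 3],
    "Description": [
        "TRIDENT 6S SANDIA 9GR",
        "CANAST RABBIT F1 A 1UN",
        "HAND SOAP VITAMIN E 442 ML.",
    ],
})

units = ["ML", "KG", "PZA", "LT", "UN", "GR"]
# Escape each unit so special characters, if any are added later, stay literal
unit_alt = "|".join(re.escape(u) for u in units)

# qty: digits with an optional decimal part (assumption); unit: one of the list
pattern = rf"\b(?P<qty>\d+(?:\.\d+)?)\s*(?P<unit>{unit_alt})\b"

# Named groups become column names in the extracted frame
parts = df["Description"].str.extract(pattern, flags=re.IGNORECASE)

df["qty"] = pd.to_numeric(parts["qty"])   # hypothetical column name
df["unit"] = parts["unit"].str.upper()    # hypothetical column name
print(df)
```

Rows where no unit is found get NaN in both new columns, so they are easy to spot with `df[df["unit"].isna()]`.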