Home > Back-end >  Update column value if another column has a certain substring
Update column value if another column has a certain substring

Time:11-23

I'm facing a lot of trouble on what seems to be a simple matter:

I have a column with some beverages names, but they are poluted with "12oz" and "Boxes". I want to get only the name of the beverages. Unfortunally, they are not typed in the same particular form, so i cant just [0:5] them. I know all the beverages names on the column, if that helps

Ex: Column name: WHISKY BALLANTINES12YO 12X1000 ML RESTAGE

Column Created based on the previous one: BALLANTINES

Thanks in advance,

EDIT Some other examples:

CHIVAS REGAL 12 ANOS 12X1L should be CHIVAS

VODKA ABSOLUT 12X1000ML should be ABSOLUT

CodePudding user response:

Just use replace statements and set regex to true, and replace with an empty string, like this:

df.replace('12oz', '', regex=True)

This is assuming you know what text you will have to replace.

CodePudding user response:

If you have the list of all the beverages, you can use pandas.Series.extract :

import re
​
list_of_bvr= ["ballantines", "chivas", "absolut"]

df["Col1"]= df["Col1"].str.extract(f"({'|'.join(list_of_bvr)})", flags=re.I, expand=False)

# Output :

print(df)

          Col1
0  BALLANTINES
1       CHIVAS
2      ABSOLUT
  • Related