Find and replace numeric values preceding a substring [closed]

Time:09-21

I have a dataframe that looks as follows:

df['col1'].values

array(['cat 113kd29', 'do56goat24kdasd', 'pig145kd'])

I need to create a new column df['vals'] with the following values:

cat 29
do56goatasd
pig

That is, I first need to find the substring kd and then remove it together with the numeric value immediately preceding it. I am not sure how to go about this.

Each string can contain several numeric values, so I need to match only the ones directly before kd. Note the strings 'cat 113kd29' and 'do56goat24kdasd'.

I tried the following but it didn't work:

df['col1'].str.replace(r'(\d+)kd', '')

CodePudding user response:

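Your call to str.replace is correct, but it returns a new Series instead of modifying the column in place, so you need to assign the result on the left-hand side of an assignment:

df["col1"] = df["col1"].str.replace(r'\d+kd', '', regex=True)

Note that str.replace replaces every match by default, so there is no need for a global flag. On pandas 2.0 and later, pass regex=True so that the pattern is treated as a regular expression rather than a literal string.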
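To check this against the data in the question, here is a minimal sketch that writes the result to the df['vals'] column the question asks for instead of overwriting col1. The DataFrame is rebuilt from the three sample strings.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"col1": ["cat 113kd29", "do56goat24kdasd", "pig145kd"]})

# \d+ matches one or more digits; the literal "kd" must follow immediately.
# regex=True is passed explicitly because the default differs between pandas versions.
df["vals"] = df["col1"].str.replace(r"\d+kd", "", regex=True)

print(df["vals"].tolist())
# ['cat 29', 'do56goatasd', 'pig']
```

Digits that are not directly followed by kd, such as the 56 in 'do56goat24kdasd' or the trailing 29 in 'cat 113kd29', are left untouched.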

CodePudding user response:

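Another way is to match the digits preceding kd together with kd itself and replace the match with nothing. Note that the \Z anchor restricts the match to the end of the string, so this handles 'pig145kd' but leaves 'cat 113kd29' and 'do56goat24kdasd' unchanged. Drop \Z to remove the match anywhere in the string.

df["col1"] = df.col1.str.replace(r'\d+kd\Z', '', regex=True)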
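The effect of the \Z anchor can be seen without pandas. The sketch below uses the standard-library re module on the three sample strings from the question, on the assumption that pandas applies the same regex semantics when regex=True. The variable names are illustrative only.

```python
import re

# Sample strings copied from the question.
samples = ["cat 113kd29", "do56goat24kdasd", "pig145kd"]

# With \Z the digits+kd run must sit at the very end of the string.
anchored = [re.sub(r"\d+kd\Z", "", s) for s in samples]

# Without the anchor the run is removed wherever it occurs.
unanchored = [re.sub(r"\d+kd", "", s) for s in samples]

print(anchored)    # ['cat 113kd29', 'do56goat24kdasd', 'pig']
print(unanchored)  # ['cat 29', 'do56goatasd', 'pig']
```

Only the unanchored pattern reproduces the values requested in the question.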