Find and replace numeric values preceding a substring [closed]

I have a dataframe that looks like the following:

df['col1'].values

array(['cat 113kd29', 'do56goat24kdasd', 'pig145kd'])

I need to create a new column df['vals'] with the following values:

cat 29
do56goatasd
pig

That is, I first need to find the substring kd and then remove it together with the numeric value immediately preceding it. I am not sure how to go about this.

Each string can contain several numeric values, so I need to match only the ones directly before kd. Note the strings 'cat 113kd29' and 'do56goat24kdasd'.

I tried the following but it didn't work:

df['col1'].str.replace(r'(\d+)kd', '')

CodePudding user response：

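Your call to str.replace is correct, but it returns a new Series instead of modifying the column in place, so you need to assign the result back to a column:

df["col1"] = df["col1"].str.replace(r'\d+kd', '', regex=True)

Note that str.replace replaces every match by default, so no extra flag is needed for a global replacement. Passing regex=True makes sure the pattern is treated as a regular expression; in pandas 2.0 and later the default is regex=False.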
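As a minimal end-to-end sketch of the approach above, assuming pandas is installed and that the result should go into a new column vals as the question asks (the DataFrame here is rebuilt from the three sample strings for illustration only):

```python
import pandas as pd

# Sample data rebuilt from the question (illustrative only).
df = pd.DataFrame({"col1": ["cat 113kd29", "do56goat24kdasd", "pig145kd"]})

# \d+ matches one or more digits; the literal "kd" must follow immediately.
# regex=True is passed explicitly so the pattern is not treated as a literal string.
df["vals"] = df["col1"].str.replace(r"\d+kd", "", regex=True)

print(df["vals"].tolist())
# ['cat 29', 'do56goatasd', 'pig']
```

Digits that are not directly followed by kd, such as the 56 in 'do56goat24kdasd' or the trailing 29 in 'cat 113kd29', are left untouched.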

CodePudding user response：

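Another way is to match the digits preceding kd together with kd itself and replace the match with nothing. Note that the trailing \Z anchors the match to the end of the string, so this variant only strips a run such as 145kd in 'pig145kd'; drop the \Z if kd can also appear in the middle of a string.

df["col1"] = df.col1.str.replace(r'\d+kd\Z', '', regex=True)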
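To see what the \Z anchor changes, here is a small sketch that uses Python's standard re module instead of pandas; the patterns behave the same way in this case. The variable names are only for illustration.

```python
import re

samples = ["cat 113kd29", "do56goat24kdasd", "pig145kd"]

# With \Z the digits+kd run must sit at the very end of the string.
anchored = [re.sub(r"\d+kd\Z", "", s) for s in samples]

# Without \Z the run is removed wherever it occurs.
unanchored = [re.sub(r"\d+kd", "", s) for s in samples]

print(anchored)    # ['cat 113kd29', 'do56goat24kdasd', 'pig']
print(unanchored)  # ['cat 29', 'do56goatasd', 'pig']
```

Only the unanchored pattern produces the output requested in the question for all three strings.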