I have a dataframe that looks like this:
df['col1'].values
array(['cat 113kd29', 'do56goat24kdasd', 'pig145kd'])
I need to create a new column df['vals'] with the following values:
cat 29
do56goatasd
pig
That is, I first need to look for the substring kd and then find the numeric value immediately preceding it, and remove both. I am not sure how to go about this.
There can be multiple numeric values in each string, so I need to match only the digits directly before kd. Please note the string 'cat 113kd29', where the trailing 29 is kept, and 'do56goat24kdasd', where the 56 is kept.
I tried the following, but it didn't work:
df['col1'].str.replace(r'(\d+)kd', '')
CodePudding user response:
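Your pattern is on the right track, but str.replace returns a new Series rather than modifying the column in place, so you need to assign the result to a column (df["vals"] here, as in the question; assign to df["col1"] instead to overwrite the original). Pass regex=True as well, since pandas 2.0 and later treat the pattern as a literal string by default:
df["vals"] = df["col1"].str.replace(r'\d+kd', '', regex=True)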
Note that str.replace replaces every match by default, so there is no need to use any sort of global flag.
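As a minimal sketch of the approach above, assuming the three sample strings from the question and a reasonably recent pandas, the whole thing can be run end to end like this. The column name vals comes from the question; everything else is just scaffolding to make the example self-contained.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"col1": ["cat 113kd29", "do56goat24kdasd", "pig145kd"]})

# str.replace returns a new Series and leaves df untouched,
# so the result has to be assigned to a column.
# regex=True is passed explicitly because the default differs between pandas versions.
df["vals"] = df["col1"].str.replace(r"\d+kd", "", regex=True)

print(df["vals"].tolist())
# ['cat 29', 'do56goatasd', 'pig']
```

Only the digits that sit directly in front of kd are removed, so the 56 in 'do56goat24kdasd' and the trailing 29 in 'cat 113kd29' survive.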
CodePudding user response:
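Another way is to match the digits preceding kd together with kd itself and replace the match with nothing. Do not anchor the pattern to the end of the string (for example with \Z), because kd can appear in the middle, as in 'cat 113kd29':
df["col1"] = df.col1.str.replace(r'\d+kd', '', regex=True)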
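To see how an end-of-string anchor such as \Z changes the result, here is a small sketch using Python's built-in re module on the sample strings from the question. The variable names are only illustrative; the same patterns behave the same way when passed to str.replace with regex=True.

```python
import re

# Sample strings copied from the question.
samples = ["cat 113kd29", "do56goat24kdasd", "pig145kd"]

# Matches digits followed by kd anywhere in the string.
unanchored = re.compile(r"\d+kd")
# Matches digits followed by kd only at the very end of the string.
anchored = re.compile(r"\d+kd\Z")

unanchored_result = [unanchored.sub("", s) for s in samples]
anchored_result = [anchored.sub("", s) for s in samples]

print(unanchored_result)  # ['cat 29', 'do56goatasd', 'pig']
print(anchored_result)    # ['cat 113kd29', 'do56goat24kdasd', 'pig']
```

The anchored pattern only changes 'pig145kd', since that is the only sample where kd ends the string.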