How to replace an exact string with another using replace() of pandas.DataFrame?

I'd like to replace every '0-4' with '00-04' in the 'tumor-size' column of my DataFrame. The column currently contains the following values.

print(df['tumor-size'].unique())
["'15-19'" "'35-39'" "'30-34'" "'25-29'" "'40-44'" "'10-14'" "'0-4'" "'20-24'" "'45-49'" "'50-54'" "'5-9'"]

What I tried first, which changed nothing, is the following.

df['tumor-size'] = df['tumor-size'].replace('0-4', '00-04')

Next, I tried the following. In this case, every '0-4' was replaced with '00-04', but every '40-44' was also replaced with '400-044', since '40-44' contains '0-4'.

df['tumor-size'] = df['tumor-size'].str.replace('0-4', '00-04')

I read other Q&As and realized that I need a regex. I then tried the following, assuming the elements always start with '0-4', but again nothing changed.

df['tumor-size'] = df['tumor-size'].str.replace(r'^0-4', '00-04', regex=True)

What I want to do is quite simple, but I have no idea how to achieve it. Could someone please help me? Thank you.

Note: I reload all the data into df from the CSV file on every single try.

CodePudding user response：

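The values in your column include the single quotes themselves (see the output of unique()), so the pattern has to include them too. Also pass regex=True, otherwise replace() treats the pattern as a literal value. Try:

df['tumor-size'] = df['tumor-size'].replace(r"^'0-4'$", "'00-04'", regex=True)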
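To see why the earlier attempts matched nothing, here is a minimal sketch on three values copied from the question's unique() output. The names unchanged and fixed are only for illustration.

```python
import pandas as pd

# Sample values copied from the question; note the literal single quotes
# inside each string.
df = pd.DataFrame({'tumor-size': ["'40-44'", "'0-4'", "'10-14'"]})

# Without regex=True, Series.replace only matches whole cell values.
# No cell equals 0-4 (without quotes), so nothing changes.
unchanged = df['tumor-size'].replace('0-4', '00-04')

# Anchoring with ^ and $ and including the quotes matches only that cell.
fixed = df['tumor-size'].replace(r"^'0-4'$", "'00-04'", regex=True)

print(unchanged.tolist())
print(fixed.tolist())
```

The first list should come back identical to the input, while in the second only the "'0-4'" entry is changed and "'40-44'" is left alone.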

CodePudding user response：

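You can anchor the end of the pattern with $ in addition to ^ at the start, so that only the exact string matches. Note that this sample uses values without the embedded single quotes:

import pandas as pd

df = pd.DataFrame(data={'tumor-size': ['15-19', '35-39', '30-34', '25-29',
                                       '40-44', '10-14', '0-4', '20-24',
                                       '45-49', '50-54', '5-9']})
# ^ and $ anchor the pattern to the whole string, so '40-44' is not touched
df['tumor-size'] = df['tumor-size'].str.replace(r'^0-4$', '00-04', regex=True)
print(df)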

Output:

   tumor-size
0       15-19
1       35-39
2       30-34
3       25-29
4       40-44
5       10-14
6       00-04
7       20-24
8       45-49
9       50-54
10        5-9
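If the embedded single quotes are not needed downstream (an assumption, since the question does not say), another option is to strip them first and then rely on whole-value matching, which needs no regex at all. A rough sketch, with cleaned as an illustrative name:

```python
import pandas as pd

# Sample values in the same quoted form as the question's data.
df = pd.DataFrame({'tumor-size': ["'40-44'", "'0-4'", "'5-9'", "'10-14'"]})

# Remove the surrounding single quotes from every value.
cleaned = df['tumor-size'].str.strip("'")

# With a dict and no regex, Series.replace matches whole values only,
# so '40-44' is not affected by the '0-4' key.
cleaned = cleaned.replace({'0-4': '00-04'})

print(cleaned.tolist())
```

The dict form also makes it easy to add further exact mappings later without worrying about substring matches.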