I have a "School Name" column in a pandas DataFrame that contains both "Pri" and "Primary" (e.g. "XYZ Pri School", "DWQ Primary School").
I would like to replace "Pri" with "Primary" to standardise the school names, so that the column only contains the full word "Primary" (e.g. "XYZ Primary School", "DWQ Primary School").
I tried df["School Name"].str.replace("Pri", "Primary"), but it also extends names that already use the full word (i.e. "Primary" becomes "Primarymary"), which I do not want.
Greatly appreciate your advice.
Thank you!
CodePudding user response:
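Use word boundaries (\b before and after the word) so that only the standalone word "Pri" is matched, not the "Pri" at the start of "Primary":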
print (df)
School Name
0 XYZ Pri School
1 DWQ Primary School
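# regex=True is required in pandas 2.0+, where str.replace defaults to literal matching
df["School Name"] = df["School Name"].str.replace(r"\bPri\b", "Primary", regex=True)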
print (df)
School Name
0 XYZ Primary School
1 DWQ Primary School
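
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The sample DataFrame is rebuilt from the two rows shown in the question, and the variable names naive and fixed are only illustrative. It compares the plain substring replacement with the word-boundary version, passing regex explicitly in both calls so the behaviour does not depend on the pandas version's default.

```python
import pandas as pd

# Sample data rebuilt from the two rows shown in the question.
df = pd.DataFrame({"School Name": ["XYZ Pri School", "DWQ Primary School"]})

# Literal substring replacement: also hits the "Pri" inside "Primary".
naive = df["School Name"].str.replace("Pri", "Primary", regex=False)

# \b matches only at a word boundary, so "Pri" must be a whole word.
fixed = df["School Name"].str.replace(r"\bPri\b", "Primary", regex=True)

print(naive.tolist())  # ['XYZ Primary School', 'DWQ Primarymary School']
print(fixed.tolist())  # ['XYZ Primary School', 'DWQ Primary School']
```

The pattern is written as a raw string (r"...") so that Python passes \b to the regex engine as a word boundary rather than interpreting it as a backspace character.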