I have a DataFrame of provinces like this:
import pandas as pd

province = {'province': ['Prov. Jawa Barat', 'JAWA BARAT', 'Prop. Jawa Barat', 'Prov. Sumarta Selatan', 'SUMARTA SELATAN', 'Prop. Sumatra Selatan'],
            'city': ['Bandung', 'Bogor', 'Cimahi', 'Palembang', 'Solo', 'Cilacap']}
df_prov = pd.DataFrame(province)
However, the province names are not written consistently. How can I change the names that carry the prefix 'Prov.' or 'Prop.' to JAWA BARAT and SUMARTA SELATAN?
Sorry, I don't speak English very well. Thanks.
CodePudding user response:
You may try simply removing the prefix and uppercasing what remains:
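# regex=True is needed because str.replace treats the pattern as a literal string by default in pandas 2.0+
df_prov["province"] = df_prov["province"].str.replace(r'^Pro[pv]\. ', '', regex=True).str.upper()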
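For completeness, here is a minimal, self-contained sketch of the same idea applied to the sample data from the question. The helper name `normalize_province` and the `spelling_fixes` mapping are my own additions for illustration, not part of the original answer. The mapping assumes you want 'SUMATRA SELATAN' folded into 'SUMARTA SELATAN', as the question suggests.

```python
import pandas as pd

df_prov = pd.DataFrame({
    "province": ["Prov. Jawa Barat", "JAWA BARAT", "Prop. Jawa Barat",
                 "Prov. Sumarta Selatan", "SUMARTA SELATAN", "Prop. Sumatra Selatan"],
    "city": ["Bandung", "Bogor", "Cimahi", "Palembang", "Solo", "Cilacap"],
})


def normalize_province(names: pd.Series) -> pd.Series:
    """Strip a leading 'Prov.' / 'Prop.' prefix and uppercase the rest."""
    return (
        names.str.replace(r"^Pro[pv]\.\s*", "", regex=True)  # drop the prefix and any spaces after it
             .str.strip()                                    # remove stray surrounding whitespace
             .str.upper()
    )


df_prov["province"] = normalize_province(df_prov["province"])

# Hypothetical extra step: removing the prefix does not fix spelling variants,
# so map them explicitly if they should be treated as the same province.
spelling_fixes = {"SUMATRA SELATAN": "SUMARTA SELATAN"}
df_prov["province"] = df_prov["province"].replace(spelling_fixes)

print(df_prov["province"].unique())
```

Note that the prefix removal alone would leave 'SUMATRA SELATAN' and 'SUMARTA SELATAN' as two distinct values. Whether to merge them, and which spelling to keep, depends on your data.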