I have a dataframe:
import pandas as pd
city = pd.DataFrame({'id': [1, 2, 3, 4],
                     'city': ['NRTH CAROLINA', 'NEW WST AMSTERDAM', 'EAST TOKYO', 'LONDON STH']})
How can I change NRTH to NORTH, WST to WEST, and STH to SOUTH, so that the output looks like this?
id city
1 NORTH CAROLINA
2 NEW WEST AMSTERDAM
3 EAST TOKYO
4 LONDON SOUTH
CodePudding user response:
Hello, Arthur! I have defined mapping_dict, where you can add any other words that you want to change.
To apply it, I wrote a separate function, mapping_words, that replaces each word of a city name found in the dictionary.
import pandas as pd

city = pd.DataFrame({'id': [1, 2, 3, 4],
                     'city': ['NRTH CAROLINA', 'NEW WST AMSTERDAM', 'EAST TOKYO', 'LONDON STH']})

# Abbreviation -> full word; add more entries as needed
mapping_dict = {'NRTH': 'NORTH', 'WST': 'WEST', 'STH': 'SOUTH'}

def mapping_words(city_name):
    updated_name = ""
    for word in city_name.split():
        if word in mapping_dict:
            # Replace a known abbreviation
            updated_name += mapping_dict[word] + " "
        else:
            # Keep any other word unchanged
            updated_name += word + " "
    return updated_name.strip()

city['city'] = city['city'].apply(mapping_words)
I hope this may help you. Thanks!
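For reference, the same word-by-word idea can be written more compactly with dict.get and str.join. This is only a minimal sketch, not the code from the answer above; the function name expand_abbreviations is hypothetical, and plain Python lists stand in for the dataframe column.

```python
# Hypothetical compact variant of the word-by-word mapping above.
mapping_dict = {'NRTH': 'NORTH', 'WST': 'WEST', 'STH': 'SOUTH'}

def expand_abbreviations(city_name, mapping=mapping_dict):
    # Look up each whitespace-separated word; keep it unchanged if it is not a key.
    return " ".join(mapping.get(word, word) for word in city_name.split())

names = ['NRTH CAROLINA', 'NEW WST AMSTERDAM', 'EAST TOKYO', 'LONDON STH']
expanded = [expand_abbreviations(name) for name in names]
print(expanded)
```

With the dataframe from the question, the function could be passed directly to city['city'].apply(expand_abbreviations). Because it matches whole words only, a name such as WSTERN is left alone. Note that split() followed by join also collapses repeated spaces into one.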
CodePudding user response:
Let's define a replacement dictionary first, then use Series.replace(regex=True).
Each key is wrapped in \b word boundaries, so it is replaced only where it appears as a whole word.
import re

d = {
    'NRTH': 'NORTH',
    'WST': 'WEST',
    'STH': 'SOUTH'
}

# Escape each key and wrap it in \b so only whole words are replaced
city['city'] = city['city'].replace({rf"\b{re.escape(k)}\b": v for k, v in d.items()}, regex=True)
print(city)
   id                city
0   1      NORTH CAROLINA
1   2  NEW WEST AMSTERDAM
2   3          EAST TOKYO
3   4        LONDON SOUTH
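To see what the word boundaries buy you, here is a small sketch using only the standard re module. It is a variation on the approach above, not the same code: it combines all keys into one compiled pattern and looks up the replacement in a callable. The helper name expand is hypothetical.

```python
import re

d = {'NRTH': 'NORTH', 'WST': 'WEST', 'STH': 'SOUTH'}

# One alternation of all escaped keys, wrapped in word boundaries.
pattern = re.compile(r"\b(?:" + "|".join(re.escape(k) for k in d) + r")\b")

def expand(text):
    # The callable receives the match object and looks up whichever key matched.
    return pattern.sub(lambda m: d[m.group(0)], text)

print(expand('NEW WST AMSTERDAM'))  # whole-word match, gets replaced
print(expand('WSTERN STHPORT'))     # keys are only prefixes here, left unchanged
```

Without the \b anchors, WST inside WSTERN would also be rewritten. As far as I know, a compiled pattern and a callable like these can also be passed to Series.str.replace with regex=True, but check the pandas documentation for your version.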