I have a dataframe with a column in it containing state names. The names are a mix of US states abbreviations and other countries state name. And, I only want to change the us state names and leave others as it is.
Data
Country State
1 United States MI
2 United States PA
3 New Zealand Auckland
4 France Île-de-France
5 United States FL
I have tried this code:
states = {
'AK': 'Alaska',
'AL': 'Alabama',
'AR': 'Arkansas',
'AS': 'American Samoa',
'AZ': 'Arizona',
'CA': 'California',
'CO': 'Colorado',
'CT': 'Connecticut',
'DC': 'District of Columbia',
'DE': 'Delaware',
.
.
.
}
states = {state: abbrev for state, abbrev in states.items()}
Data['State_full'] = Data['State'].map(states)
It replaces the US states as it should but it also replaces the other countries state name with None. What am i missing? Thanks in advance.
CodePudding user response:
You can use Series.replace
. It can take a dict, where the keys of the dict are values to find, and the values of the dict are the replacements. If a value isn't found in the dict, it will be left as-is.
df['State'] = df['State'].replace(states)
Output:
>>> df
Country State
1 United States Minnesota
2 United States Pennsylvania
3 New Zealand Auckland
4 France Île-de-France
5 United States Florida