How to replace Pandas column values based on dict in Python?

Time:06-21

I have the following Pandas DF:

ID Country
----------
01 "it"
02 "es"
03 "de"
04 "ch"
05 "in"
06 "ca"

where I want to replace the 2-letter country codes with the appropriate continent name, like this:

ID Country
----------
01 "europe"
02 "europe"
03 "europe"
04 "asia"
05 "asia"
06 "america"

I have built a dict whose keys are continent names and whose values are lists of the country codes belonging to the respective continents:

> country_dict

{'europe': ['it', 'es', 'de', 'gb'],
 'asia': ['in', 'ch', 'ru'],
 'america': ['us', 'ca']}

The best I could do so far:

for continent in country_dict.keys():
   # replace returns a new Series, so the result has to be assigned back
   df['Country'] = df['Country'].replace(country_dict[continent], continent)

but this seems inelegant. Is there a better way?

CodePudding user response:

Your dict is backwards. replace needs a mapping from each country code to its continent, so invert the dict first:

>>> import pandas as pd
>>> df = pd.DataFrame(['it', 'es'], columns=['Country'])
>>> df
  Country
0      it
1      es
>>> country_dict = {'europe': ['it', 'es', 'de', 'gb'],
...                 'asia': ['in', 'ch', 'ru'],
...                 'america': ['us', 'ca']}
>>> country_dict = {v: k for k, vs in country_dict.items() for v in vs}
>>> country_dict
{'it': 'europe', 'es': 'europe', 'de': 'europe', 'gb': 'europe', 'in': 'asia', 'ch': 'asia', 'ru': 'asia', 'us': 'america', 'ca': 'america'}
>>> df.replace(country_dict)
  Country
0  europe
1  europe
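
For reference, here is a minimal sketch that applies the same inverted-dict idea to the six-row frame from the question, limiting the replacement to the Country column. The names code_to_continent and result are illustrative and do not appear in the original post; the IDs are assumed to be strings so the leading zero survives.

```python
import pandas as pd

# Data from the question; IDs kept as strings (an assumption) to preserve "01".
df = pd.DataFrame({
    "ID": ["01", "02", "03", "04", "05", "06"],
    "Country": ["it", "es", "de", "ch", "in", "ca"],
})

country_dict = {
    "europe": ["it", "es", "de", "gb"],
    "asia": ["in", "ch", "ru"],
    "america": ["us", "ca"],
}

# Invert continent -> [codes] into code -> continent.
code_to_continent = {
    code: continent
    for continent, codes in country_dict.items()
    for code in codes
}

# Series.replace returns a new Series; assign() builds a new frame with it
# and leaves the original df untouched.
result = df.assign(Country=df["Country"].replace(code_to_continent))
print(result)
```

Replacing on the single column rather than on the whole frame avoids touching other columns that happen to contain the same strings.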

CodePudding user response:

The trick is to invert country_dict so that each country code maps to its continent, and then use pandas.Series.map:

>>> dct = {v:k  for k,val in country_dict.items() for v in val}
>>> df['Country'] = df['Country'].map(dct)
>>> df
   Country
0   europe
1   europe
2   europe
3     asia
4     asia
5  america
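
One difference between the two answers shows up when a code is missing from the dict: map yields NaN for it, while replace leaves the value as it was. The short sketch below illustrates this. The code "xx" and the "unknown" fallback label are made up for the example.

```python
import pandas as pd

# "xx" is a hypothetical code that is not present in the mapping.
codes = pd.Series(["it", "in", "xx"], name="Country")
code_to_continent = {"it": "europe", "in": "asia"}

# map: values without a matching key become NaN.
mapped = codes.map(code_to_continent)

# replace: values without a matching key are left unchanged.
replaced = codes.replace(code_to_continent)

# map plus fillna: give unmapped codes an explicit fallback label.
with_default = codes.map(code_to_continent).fillna("unknown")

print(mapped.tolist())
print(replaced.tolist())
print(with_default.tolist())
```

If every code is guaranteed to be in the dict, the two approaches give the same result. Otherwise, pick the one whose handling of unmapped values suits the data.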