I have an array of countries. I would like to run this array through a function and append the output of the function as a column to a dataframe.
I used the apply method but keep getting a KeyError. I am not sure what I am doing wrong.
Code
import matplotlib.pyplot as plt
import pandas as pd
import pycountry_convert as pc
data = pd.read_csv('/content/2019.csv', index_col=0)
data.loc[71, 'Country or region'] = 'Trinidad and Tobago'
country_region = data['Country or region']
for country in country_region:
    country_code = pc.country_name_to_country_alpha2(country)
    data['Continent'].apply(pc.country_alpha2_to_continent_code(country_code))
Here is a screenshot of my error, for more details.
Answer:
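Yes, you can use a function. You were almost there. apply expects the function itself, not the result of calling it: pandas calls the function once for each value in the column, so when the function takes a single argument you pass only its name. If you need to pass additional arguments, you can combine apply with a lambda. Also note that your loop reads data['Continent'] before that column has ever been assigned, which is a likely source of the KeyError; assign the result of apply to the new column instead.
No need for a lambda in your case. This should solve it: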
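import pandas as pd
import pycountry_convert as pc
data = pd.read_csv('/content/2019.csv', index_col=0)
data.loc[71, 'Country or region'] = 'Trinidad and Tobago'
# Pass each function by name; apply calls it once per value in the column.
data['country_code'] = data['Country or region'].apply(pc.country_name_to_country_alpha2)
# Assigning the result creates the new 'Continent' column.
data['Continent'] = data['country_code'].apply(pc.country_alpha2_to_continent_code)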
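To make the difference between passing a function and passing extra arguments concrete, here is a minimal sketch. It does not use pycountry_convert: the two lookup dictionaries and the functions name_to_alpha2 and alpha2_to_continent are hypothetical stand-ins invented for illustration, so that the example runs on its own with only pandas installed.

```python
import pandas as pd

# Hypothetical stand-ins for the pycountry_convert lookups (illustration only).
ALPHA2_BY_NAME = {"France": "FR", "Japan": "JP", "Trinidad and Tobago": "TT"}
CONTINENT_BY_ALPHA2 = {"FR": "EU", "JP": "AS", "TT": "NA"}


def name_to_alpha2(name):
    """One-argument function: raises KeyError for an unknown name."""
    return ALPHA2_BY_NAME[name]


def alpha2_to_continent(code, default=None):
    """Two-argument function: returns `default` instead of raising."""
    return CONTINENT_BY_ALPHA2.get(code, default)


data = pd.DataFrame(
    {"Country or region": ["France", "Japan", "Trinidad and Tobago"]}
)

# Single argument: pass the function itself (no parentheses).
data["country_code"] = data["Country or region"].apply(name_to_alpha2)

# Extra argument: wrap the call in a lambda ...
data["Continent"] = data["country_code"].apply(
    lambda code: alpha2_to_continent(code, "Unknown")
)

# ... or give it to apply as a keyword argument, which apply forwards.
data["Continent_kw"] = data["country_code"].apply(
    alpha2_to_continent, default="Unknown"
)

print(data)
```

The lambda and keyword forms produce the same column here. A fallback value like the hypothetical default above can be handy if some names in your data are not recognised by the lookup, since a lookup that raises would otherwise stop apply at the first bad row.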
For future reference, you can copy the code into your question instead of attaching a screenshot.