I downloaded and imported a CSV file from an economics research database. It contains three columns: "Country-Code", "Time", and "Indicator". There are basically two types of indicators: (1) an amount in local currency and (2) the EUR exchange rate. How can I create a new column "EUR_amount" in Python that divides the amount by the rate when the country code and the month are the same for both items, i.e. EUR = amount / rate where country and time match?
Any help highly appreciated! (Please keep in mind that I am quite a noob with python and this is my first question on stackoverflow ever.) Thanks a lot in advance.
Edit: Adding this code after receiving feedback from mozway (thanks for that):
import pandas as pd
df = pd.DataFrame({'country_code':['EU','UK','US','EU','UK','US','EU','UK','US','EU','UK','US','EU','UK','US','EU','UK','US'],
'date':['2019-03','2019-03','2019-03','2019-04','2019-04','2019-04','2019-05','2019-05','2019-05','2019-03','2019-03','2019-03','2019-04','2019-04','2019-04','2019-05','2019-05','2019-05'],
'item':['exposure','exposure','exposure','exposure','exposure','exposure','exposure','exposure','exposure','FX-rate','FX-rate','FX-rate','FX-rate','FX-rate','FX-rate','FX-rate','FX-rate','FX-rate'],
'value':[15000,9000,13000,16500,8750,17000,17000,7999,25000,1.00,1.25,0.90,1,1.23,0.93,1.00,1.24,0.95]})
print(df)
So, to restate my question: how can I divide the item exposure by the item FX-rate for rows where country_code AND date match?
Answer:
You can first split the DataFrame into two parts, one with the exposure rows and one with the FX-rate rows:
fx = df[df["item"]=="FX-rate"]
exp = df[df["item"]!="FX-rate"]
After that, you can merge the two parts on country_code and date:
merged_df = pd.merge(fx,exp,on=["country_code","date"],how='outer')
See https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.merge.html for other arguments and examples.
The above will result in
country_code | date | item_x | value_x | item_y | value_y |
---|---|---|---|---|---|
EU | 2019-03 | FX-rate | 1.00 | exposure | 15000.0 |
UK | 2019-03 | FX-rate | 1.25 | exposure | 9000.0 |
US | 2019-03 | FX-rate | 0.90 | exposure | 13000.0 |
EU | 2019-04 | FX-rate | 1.00 | exposure | 16500.0 |
UK | 2019-04 | FX-rate | 1.23 | exposure | 8750.0 |
US | 2019-04 | FX-rate | 0.93 | exposure | 17000.0 |
EU | 2019-05 | FX-rate | 1.00 | exposure | 17000.0 |
UK | 2019-05 | FX-rate | 1.24 | exposure | 7999.0 |
US | 2019-05 | FX-rate | 0.95 | exposure | 25000.0 |
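As a side note, the `_x` and `_y` suffixes in the table are pandas' defaults for overlapping column names. If they are hard to read, the same merge can be given explicit suffixes. The snippet below is only a sketch on a shortened copy of the sample data; the suffix names `_exposure` and `_fx`, the inner join, and the `validate` check are my own choices rather than part of the code above, which keeps the default names.

```python
import pandas as pd

# Shortened copy of the sample data from the question (two countries, one month).
df = pd.DataFrame({
    "country_code": ["EU", "UK", "EU", "UK"],
    "date": ["2019-03", "2019-03", "2019-03", "2019-03"],
    "item": ["exposure", "exposure", "FX-rate", "FX-rate"],
    "value": [15000, 9000, 1.00, 1.25],
})

fx = df[df["item"] == "FX-rate"]
exp = df[df["item"] == "exposure"]

# Assumption: descriptive suffixes instead of the default _x/_y, and
# validate="one_to_one" so pandas raises an error if a country/date pair
# appears more than once on either side.
merged = pd.merge(
    exp, fx,
    on=["country_code", "date"],
    how="inner",
    suffixes=("_exposure", "_fx"),
    validate="one_to_one",
)
print(merged.columns.tolist())
```

With these suffixes the two value columns are called `value_exposure` and `value_fx`, so it is harder to mix up numerator and denominator later.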
After that, it is just a matter of dividing the exposure column by the FX-rate column:
merged_df["Convert"] = merged_df["value_y"]/merged_df["value_x"]
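If you would rather avoid the split-and-merge step, a possible alternative is to pivot the table so that each item becomes its own column, and then divide. This is a sketch on a shortened copy of the sample data, not a drop-in replacement: it assumes every country/date pair has exactly one exposure row and one FX-rate row (otherwise `pivot` raises an error), and passing a list as `index` needs a reasonably recent pandas version. The column name `EUR_amount` is taken from the question.

```python
import pandas as pd

# Shortened copy of the sample data from the question.
df = pd.DataFrame({
    "country_code": ["EU", "UK", "EU", "UK", "EU", "UK", "EU", "UK"],
    "date": ["2019-03", "2019-03", "2019-04", "2019-04",
             "2019-03", "2019-03", "2019-04", "2019-04"],
    "item": ["exposure"] * 4 + ["FX-rate"] * 4,
    "value": [15000, 9000, 16500, 8750, 1.00, 1.25, 1.00, 1.23],
})

# One row per (country_code, date), one column per item.
wide = df.pivot(index=["country_code", "date"], columns="item", values="value")

# Rows are already aligned on country and month, so a plain division is enough.
wide["EUR_amount"] = wide["exposure"] / wide["FX-rate"]

result = wide.reset_index()
print(result)
```

The result has one row per country and month with the columns `exposure`, `FX-rate` and `EUR_amount` side by side.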