I have a dataframe containing hard (USD) and soft currency (non-USD) financial data, and a dataframe containing exchange rate time series. I want to convert everything in USD.
EDIT: removed the DATES field for more clarity. Also, notice that 43646, etc. are dates.
TimeSerieLCY = pd.DataFrame({'ISIN':['ISIN1', 'ISIN2', 'ISIN3', 'ISIN4', 'ISIN5', 'ISIN6'],
'43646':[105.00, 95.600, 105.05, 106.60, 104.99, 98.465],
'43676':[103.68, 98.090, 106.60, 106.81, 103.96, 98.339],
'43707':[104.03, 101.84, 107.66, 107.22, 102.41, 96.323],
'FX':['USD', 'CNY', 'MYR', 'MYR', 'USD', 'ZAR']})
FXRatesSeries = pd.DataFrame({'Curncy':['CNY', 'MYR', 'ZAR'],
'43646':[6.905, 4.190, 19.62],
'43676':[6.867, 4.132, 19.15],
'43707':[6.884, 4.127, 19.15]})
This gives:
>>> ISIN '43646' '43676' '43707' FX
0 ISIN1 105.00 103.68 104.03 USD
1 ISIN2 95.600 98.090 101.84 CNY
2 ISIN3 105.05 106.60 107.66 MYR
3 ISIN4 106.60 106.81 107.22 MYR
4 ISIN5 104.99 103.96 102.41 USD
5 ISIN6 98.465 98.339 96.323 ZAR
>>> Curncy '43646' '43676' '43707'
0 CNY 6.905 6.867 6.884
1 MYR 4.190 4.132 4.127
2 ZAR 19.62 19.15 19.15
I tried the following:
First, create an empty DataFrame.
Transform the FXRatesSeries into a dictionnary {'CNY':[6.905, 6.867, 6.884], ...}
.
Then, within a recursion, for each row of TimeSerieLCY:
- Get the FX to apply.
- If USD, add the row to the empty DataFrame.
- Else, create a new series that does not contain the ISIN, DATES, and FX fields.
- Multiply with the corresponding FXRates.
- Add the ISIN, DATES, and FX fields back, and add the row to the empty DataFrame.
I am sure there is something more straightforward, no ? But my issue so far has been the presence of the additional fields that are not the time series... I don't know if you may have a suggestion, I would be very grateful.
I tried to look to use .mul or .apply, as indicated in many threads, but each time I'm stuck with those annoying columns.
CodePudding user response:
I think this solves your problem
# First, rename the columns of FXRatesSeries dataframes.
FXRatesSeries.columns = ['FX', 'DATES', '43646_rate', '43676_rate', '43707_rate']
# The column "FX" and "DATES" will be the keys to merge the two dataframes (left join to have for each column the rate associated depending on the currency)
df = pd.merge(TimeSerieLCY, FXRatesSeries, how = 'left', on = ['FX', 'DATES'])
# If the FX is USD, then rates are equal to 1
for rate_col in [col for col in df.columns if 'rate' in col]:
df[rate_col] = df.apply(lambda x: 1 if x['FX'] == 'USD' else x[rate_col], axis=1)
# Define the function for the conversion
def convert_to_USD(x):
to_USD_43646 = x['43646'] * x['43646_rate']
to_USD_43676 = x['43676'] * x['43676_rate']
to_USD_43707 = x['43707'] * x['43707_rate']
return to_USD_43646, to_USD_43676, to_USD_43707
# Then apply the conversion function to the dataframe
df['43646'], df['43676'], df['43707'] = zip(*df.apply(lambda x: convert_to_USD(x), axis = 1))
# The output is the dataframe without exchange rate columns or FX column
df = df[[j for j in df.columns if 'rate' not in j and 'FX' not in j]]