Home > database >  Multiply part of a dataframe by a series conditionally
Multiply part of a dataframe by a series conditionally

Time:11-05

I have a dataframe containing hard (USD) and soft currency (non-USD) financial data, and a dataframe containing exchange rate time series. I want to convert everything in USD.

EDIT: removed the DATES field for more clarity. Also, notice that 43646, etc. are dates.

TimeSerieLCY = pd.DataFrame({'ISIN':['ISIN1', 'ISIN2', 'ISIN3', 'ISIN4', 'ISIN5', 'ISIN6'],
                             '43646':[105.00, 95.600, 105.05, 106.60, 104.99, 98.465],
                             '43676':[103.68, 98.090, 106.60, 106.81, 103.96, 98.339],
                             '43707':[104.03, 101.84, 107.66, 107.22, 102.41, 96.323],
                             'FX':['USD', 'CNY', 'MYR', 'MYR', 'USD', 'ZAR']})
FXRatesSeries = pd.DataFrame({'Curncy':['CNY', 'MYR', 'ZAR'],
                              '43646':[6.905, 4.190, 19.62],
                              '43676':[6.867, 4.132, 19.15],
                              '43707':[6.884, 4.127, 19.15]})

This gives:

>>>   ISIN   '43646'  '43676'  '43707'  FX
   0  ISIN1  105.00   103.68   104.03   USD
   1  ISIN2  95.600   98.090   101.84   CNY
   2  ISIN3  105.05   106.60   107.66   MYR
   3  ISIN4  106.60   106.81   107.22   MYR
   4  ISIN5  104.99   103.96   102.41   USD
   5  ISIN6  98.465   98.339   96.323   ZAR

>>>   Curncy  '43646'  '43676'  '43707'
   0  CNY     6.905    6.867    6.884
   1  MYR     4.190    4.132    4.127
   2  ZAR     19.62    19.15    19.15

I tried the following:

First, create an empty DataFrame.

Transform the FXRatesSeries into a dictionnary {'CNY':[6.905, 6.867, 6.884], ...}.

Then, within a recursion, for each row of TimeSerieLCY:

  1. Get the FX to apply.
  2. If USD, add the row to the empty DataFrame.
  3. Else, create a new series that does not contain the ISIN, DATES, and FX fields.
  4. Multiply with the corresponding FXRates.
  5. Add the ISIN, DATES, and FX fields back, and add the row to the empty DataFrame.

I am sure there is something more straightforward, no ? But my issue so far has been the presence of the additional fields that are not the time series... I don't know if you may have a suggestion, I would be very grateful.

I tried to look to use .mul or .apply, as indicated in many threads, but each time I'm stuck with those annoying columns.

CodePudding user response:

I think this solves your problem

# First, rename the columns of FXRatesSeries dataframes. 
FXRatesSeries.columns = ['FX', 'DATES', '43646_rate', '43676_rate', '43707_rate']
# The column "FX" and "DATES" will be the keys to merge the two dataframes (left join to have for each column the rate associated depending on the currency)
df = pd.merge(TimeSerieLCY, FXRatesSeries, how = 'left', on = ['FX', 'DATES'])
# If the FX is USD, then rates are equal to 1
for rate_col in [col for col in df.columns if 'rate' in col]:
    df[rate_col] = df.apply(lambda x: 1 if x['FX'] == 'USD' else x[rate_col], axis=1)
# Define the function for the conversion
def convert_to_USD(x): 
    to_USD_43646 = x['43646'] * x['43646_rate']
    to_USD_43676 = x['43676'] * x['43676_rate']
    to_USD_43707 = x['43707'] * x['43707_rate']
    return to_USD_43646, to_USD_43676, to_USD_43707
# Then apply the conversion function to the dataframe
df['43646'], df['43676'], df['43707'] = zip(*df.apply(lambda x: convert_to_USD(x), axis = 1))
# The output is the dataframe without exchange rate columns or FX column
df = df[[j for j in df.columns if 'rate' not in j and 'FX' not in j]]
  • Related