Home > Mobile >  perform calculation on column of a dataframe using data from another dataframe
perform calculation on column of a dataframe using data from another dataframe

Time:02-08

I have two dataframes, one called order_values & the other fx_rates. They look like below.

What I need to do is convert the value_loc value in the order_values dataframe into EUR's using the fx_rates dataframe.

In reality I won't know how many different currency there are & its likely to be more than shown below.

My current solution I believe is far from optimal. I loop through each different currency (i.e. USD, JPY etc) and merge that currency column (EURUSD, EURJPY) to the order_values dataframe & perform the calculation to convert the value_loc into value_eur & then drop the currency column from the order_values dataframe.

  order_values                             
  order_num   value_loc  currency   date
  1           2,345      USD        2-12-2021
  2           104        EUR        2-12-2021
  3           20,000     JPY        2-15-2021
  4           550        USD        3-06-2021
  

  fx_rates
  date       pair    rate
  2-12-2021  EURUSD  1.5
  2-12-2021  EURJPY  5
  2-12-2021  EUREUR  1
  ...
  3-06-2021  EURUSD  1.56
  3-06-2021  EURJPY  5.6
  3-06-2021  EUREUR  1

CodePudding user response:

You can merge by currency with dates and then multiple by new Series, which has same number of values like order_num, because used left join.

For currency helper column is stripped first 3 letters of column pair and column value_loc is converted to numeric.

df = fx_rates.assign(currency = fx_rates['pair'].str[3:])
df1 = order_values.assign(value_loc = order_values['value_loc'].str.replace(',','').astype(float))

s = df1.merge(df, how='left', on=['date','currency'])['rate']
df1['value_loc'] *= s
print (df1)
   order_num  value_loc currency       date
0          1     3517.5      USD  2-12-2021
1          2      104.0      EUR  2-12-2021
2          3        NaN      JPY  2-15-2021 <- no match create NaN
3          4      858.0      USD  3-06-2021

Details:

print (s)
0    1.50
1    1.00
2     NaN
3    1.56
Name: rate, dtype: float64

CodePudding user response:

FYI, a one-line version of the solution of @jezrael:

order_values['value_loc'] = (
    order_values['value_loc'].str.replace(',', '').astype(float)
    * order_values[['date', 'currency']].merge(fx_rates.assign(currency=fx_rates['pair'].str[3:]), how='left')['rate']
)
print(order_values)

# Output
   order_num  value_loc currency       date
0          1     3517.5      USD  2-12-2021
1          2      104.0      EUR  2-12-2021
2          3        NaN      JPY  2-15-2021
3          4      858.0      USD  3-06-2021

CodePudding user response:

I tried to reproduce the data you presented in the question.

you can find my solution below

import pandas as pd
fx_rates = pd.DataFrame({'date':['2-12-2021','2-12-2021','2-12-2021','3-06-2021','3-06-2021','3-06-2021'],
                         'pair':['EURUSD','EURJPY','EUREUR','EURUSD','EURJPY','EUREUR'],'rate':[1.5,5,1,1.56,5.6,1]})
order_values=pd.DataFrame({'order_num':[1,2,3,4],'value_loc':[2345,104,2000,550],  'currency':['USD','EUR','JPY','USD'], 
                           'date':['2-12-2021','2-12-2021','2-15-2021','3-06-2021']})

fx_rates['pair_split']=fx_rates['pair'].str.split('EUR',expand=True)[1].replace('','EUR')
df=pd.merge(order_values,fx_rates,right_on=['date','pair_split'],left_on=['date','currency'],how='inner')
  •  Tags:  
  • Related