I have a DataFrame with 2 columns, total_open_amount and invoice_currency.
invoice_currency has:
USD 45011
CAD 3828
Name: invoice_currency, dtype: int64
I want to convert all the CAD amounts in the total_open_amount column to USD, based on invoice_currency, at an exchange rate of 1 CAD = 0.7 USD, and store the result in a separate column.
My code:
df_data['converted_usd'] = df_data['total_open_amount'].where(df_data['invoice_currency']=='CAD')
df_data['converted_usd']= df_data['converted_usd'].apply(lambda x: x*0.7)
df_data['converted_usd']
output:
0 NaN
1 NaN
2 NaN
3 2309.79
4 NaN
...
49995 NaN
49996 NaN
49997 NaN
49998 NaN
49999 NaN
Name: converted_usd, Length: 48839, dtype: float64
I was able to fill the new column with the converted CAD values, but how do I fill the remaining rows with the USD values?
CodePudding user response:
We can use Series.mask or Series.where.
Series.mask replaces the rows where the condition is True. Here the condition is that 'invoice_currency' is CAD, and with the other parameter we tell it to fill those rows with the df_data['total_open_amount'] series multiplied by 0.7. The USD rows keep their original value.
Series.where replaces the rows that do not meet the condition (with NaN by default). So we first multiply the series by 0.7 and keep only the rows where the condition is met, that is, the rows with CAD currency, and we use the other parameter to give the remaining (USD) rows their original value.
Note that Series.mask and Series.where are the opposite of each other.
df_data['converted_usd'] = df_data['total_open_amount']\
.mask(df_data['invoice_currency'] == 'CAD',
other=df_data['total_open_amount'].mul(0.7))
Or:
df_data['converted_usd'] = df_data['total_open_amount'].mul(0.7)\
.where(df_data['invoice_currency'] == 'CAD',
df_data['total_open_amount'])
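To make the "opposite of each other" point concrete, here is a minimal sketch on a toy Series. The names s and cond and the values are made up for illustration and are not from the question's data.

```python
import pandas as pd

# Hypothetical toy data, only to illustrate the semantics.
s = pd.Series([10.0, 20.0, 30.0])
cond = pd.Series([True, False, True])

# mask: replace the values where cond is True, keep the rest.
masked = s.mask(cond, other=0.0)

# where: keep the values where cond is True, replace the rest.
kept = s.where(cond, other=0.0)

print(masked.tolist())  # [0.0, 20.0, 0.0]
print(kept.tolist())    # [10.0, 0.0, 30.0]
```

In other words, s.mask(cond, x) gives the same result as s.where(~cond, x).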
NumPy version (needs import numpy as np):
df_data['converted_usd'] = \
np.where(df_data['invoice_currency'] == 'CAD',
df_data['total_open_amount'].mul(0.7),
df_data['total_open_amount'])
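As a quick sanity check, the three variants can be run side by side on a small frame. This is only a sketch: the sample rows and the names CAD_TO_USD, is_cad, and amount are assumptions for illustration, while the column names and the 0.7 rate come from the question.

```python
import numpy as np
import pandas as pd

CAD_TO_USD = 0.7  # rate stated in the question

# Hypothetical sample rows, not the asker's real data.
df_data = pd.DataFrame({
    "total_open_amount": [100.0, 250.0, 80.0, 40.0],
    "invoice_currency": ["USD", "CAD", "USD", "CAD"],
})

is_cad = df_data["invoice_currency"] == "CAD"
amount = df_data["total_open_amount"]

# mask: overwrite the CAD rows with the converted amount.
via_mask = amount.mask(is_cad, other=amount.mul(CAD_TO_USD))

# where: keep the converted amount on CAD rows, original elsewhere.
via_where = amount.mul(CAD_TO_USD).where(is_cad, amount)

# np.where returns a plain ndarray, so wrap it to compare as a Series.
via_numpy = pd.Series(
    np.where(is_cad, amount.mul(CAD_TO_USD), amount),
    index=df_data.index,
)

df_data["converted_usd"] = via_mask
print(df_data)
print(np.allclose(via_mask, via_where), np.allclose(via_mask, via_numpy))
```

All three should agree, with USD rows unchanged and no NaN left in converted_usd.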