I have the following code with an apply, which accesses another df (df_fiat) to get some data and multiply it:
def _apply_values(row):
    if row['symbol'] in df_fiat.index:
        row['usdValue'] = row['tokenBalanceFloat'] * df_fiat.loc[row['symbol'], 'USD']
        row['ethValue'] = row['tokenBalanceFloat'] * df_fiat.loc[row['symbol'], 'ETH']
        row['eurValue'] = row['tokenBalanceFloat'] * df_fiat.loc[row['symbol'], 'EUR']
    else:
        row['usdValue'] = np.nan
        row['ethValue'] = np.nan
        row['eurValue'] = np.nan
    return row

df = df.apply(_apply_values, axis='columns')
I think this could be done without the apply (and would then probably be more efficient and readable), but I don't know how.
My idea:
- Assign all the columns to NaN:
df[['usdValue', 'ethValue', 'eurValue']] = np.nan
- Use boolean masking to change only desired rows:
mask = df['symbol'].isin(df_fiat.index)
- Now multiply the two things:
df[masked][['usdValue', 'ethValue', 'eurValue']] = df[masked]['tokenBalanceFloat'] * df_fiat.loc[df[masked]['symbol'], [['USD', 'ETH', 'EUR']
But it doesn't work. How can I translate it to working code?
CodePudding user response:
You can use merge; try something like this:
# merge df with df_fiat, matching df['symbol'] against the index of df_fiat
merged_df = df.merge(df_fiat[['USD', 'ETH', 'EUR']], how='left', left_on='symbol', right_index=True)
# compute the values; rows without a match have NaN rates, so their products are NaN as well
merged_df['usdValue'] = merged_df['tokenBalanceFloat'] * merged_df['USD']
merged_df['ethValue'] = merged_df['tokenBalanceFloat'] * merged_df['ETH']
merged_df['eurValue'] = merged_df['tokenBalanceFloat'] * merged_df['EUR']
# drop the helper rate columns
df = merged_df.drop(columns=['USD', 'ETH', 'EUR'])
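To make the merge idea easy to try, here is a minimal self-contained sketch on made-up data. The sample values and the names rate_cols, merged and result are only illustrative assumptions; the sketch also assumes that the symbols are the index of df_fiat, as in the question. It relies on the fact that multiplying by NaN gives NaN, so unmatched symbols need no special handling.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the frames from the question.
df = pd.DataFrame({
    "symbol": ["ETH", "DAI", "XYZ"],
    "tokenBalanceFloat": [2.0, 10.0, 5.0],
})
df_fiat = pd.DataFrame(
    {"USD": [2000.0, 1.0], "ETH": [1.0, 0.0005], "EUR": [1800.0, 0.9]},
    index=["ETH", "DAI"],  # symbols are the index, as in the question
)

rate_cols = ["USD", "ETH", "EUR"]

# Left merge: every row of df is kept; symbols missing from df_fiat get NaN rates.
merged = df.merge(df_fiat[rate_cols], how="left", left_on="symbol", right_index=True)

# NaN propagates through multiplication, so no explicit NaN check is needed.
for rate_col in rate_cols:
    merged[rate_col.lower() + "Value"] = merged["tokenBalanceFloat"] * merged[rate_col]

# Remove the helper rate columns again.
result = merged.drop(columns=rate_cols)
print(result)
```

The row for XYZ ends up with NaN in all three value columns, because it has no entry in df_fiat.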
CodePudding user response:
It ended up being possible using df.loc on the left side of the assignment, and using reset_index and to_numpy() to ignore the indexes.
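cols = ['USD', 'ETH', 'EUR']
df[cols] = np.nan
mask = df['symbol'].isin(df_fiat.index)
# look up the rates for the matching rows and scale them by the balances, row by row
rates = df_fiat.loc[df.loc[mask, 'symbol'], cols].reset_index(drop=True)
balances = df.loc[mask, 'tokenBalanceFloat'].reset_index(drop=True)
# to_numpy() makes the assignment positional instead of index-aligned
df.loc[mask, cols] = rates.mul(balances, axis=0).to_numpy()
df = df.rename(columns={
    'USD': 'usdValue',
    'ETH': 'ethValue',
    'EUR': 'eurValue'
})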
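A possibly shorter variant, sketched here on made-up data, skips the mask entirely: reindex looks up one row of rates per symbol and produces an all-NaN row for symbols that are missing from df_fiat. This assumes the index of df_fiat has no duplicate symbols; the sample values and the names rate_cols, value_cols, rates and values are illustrative, not from the original code.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for the frames from the question.
df = pd.DataFrame({
    "symbol": ["ETH", "DAI", "XYZ", "ETH"],
    "tokenBalanceFloat": [2.0, 10.0, 5.0, 0.5],
})
df_fiat = pd.DataFrame(
    {"USD": [2000.0, 1.0], "ETH": [1.0, 0.0005], "EUR": [1800.0, 0.9]},
    index=["ETH", "DAI"],  # must be unique for reindex to work
)

rate_cols = ["USD", "ETH", "EUR"]
value_cols = ["usdValue", "ethValue", "eurValue"]

# One row of rates per row of df; unknown symbols become all-NaN rows.
rates = df_fiat[rate_cols].reindex(df["symbol"]).to_numpy()

# The (n, 1) balance column broadcasts across the three rate columns.
values = pd.DataFrame(
    rates * df[["tokenBalanceFloat"]].to_numpy(),
    index=df.index,
    columns=value_cols,
)
df = df.join(values)
print(df)
```

Converting to NumPy before multiplying sidesteps index alignment in the same way as to_numpy() in the code above, and building values on df.index keeps the rows lined up when they are joined back.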