Pandas: Change apply for complex indexing

Time:03-11

I have the following code, which uses apply to look up exchange rates in another DataFrame (df_fiat, indexed by symbol) and multiply them by each row's balance.

def _apply_values(row):
    if row['symbol'] in df_fiat.index:
        row['usdValue'] = row['tokenBalanceFloat'] * df_fiat.loc[row['symbol'], 'USD']
        row['ethValue'] = row['tokenBalanceFloat'] * df_fiat.loc[row['symbol'], 'ETH']
        row['eurValue'] = row['tokenBalanceFloat'] * df_fiat.loc[row['symbol'], 'EUR']
    else:
        row['usdValue'] = np.nan
        row['ethValue'] = np.nan
        row['eurValue'] = np.nan
    
    return row

df = df.apply(_apply_values, axis='columns')

I think this could be done without the apply (and thus, maybe more efficient and readable), but I don't know how:

My idea:

  1. Assign all the columns to NaN: df[['usdValue', 'ethValue', 'eurValue']] = np.nan
  2. Use boolean masking to select only the desired rows: mask = df['symbol'].isin(df_fiat.index)
  3. Now multiply the two things: df[mask][['usdValue', 'ethValue', 'eurValue']] = df[mask]['tokenBalanceFloat'] * df_fiat.loc[df[mask]['symbol'], ['USD', 'ETH', 'EUR']]

But it doesn't work. How can I translate it to working code?
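
For reference, step 3 has two separate problems. First, df[mask][...] = ... is chained indexing: df[mask] returns a temporary copy, so the assignment never reaches df. Second, the right-hand side is labelled by symbol while df has its own index, so pandas cannot align the two. A minimal sketch of the first problem, using made-up data (the values below are hypothetical, not from the question):

```python
import warnings

import numpy as np
import pandas as pd

# Hypothetical two-row frame, for illustration only
df = pd.DataFrame({"symbol": ["ETH", "XYZ"], "tokenBalanceFloat": [2.0, 5.0]})
df["usdValue"] = np.nan
mask = df["symbol"].isin(["ETH"])

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence pandas' chained-assignment warning
    df[mask]["usdValue"] = 4000.0    # writes to a temporary copy, not to df

# df is unchanged after the chained assignment
chained_changed = bool(df["usdValue"].notna().any())

# A single .loc call on the left-hand side writes into df itself
df.loc[mask, "usdValue"] = 4000.0
```

After the chained assignment df still holds only NaN in usdValue; the single df.loc[mask, ...] call is what actually modifies it.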

CodePudding user response:

You can left-merge df with df_fiat on the symbol and multiply the columns directly; no transpose is needed. Try something like this:

# merge df with df_fiat (the symbols are in df_fiat's index)

merged_df = df.merge(df_fiat[['USD', 'ETH', 'EUR']], how='left', left_on='symbol', right_index=True)

# Calculation: rows without a match have NaN rates, and NaN propagates through the product
merged_df['usdValue'] = merged_df['tokenBalanceFloat'] * merged_df['USD']
merged_df['ethValue'] = merged_df['tokenBalanceFloat'] * merged_df['ETH']
merged_df['eurValue'] = merged_df['tokenBalanceFloat'] * merged_df['EUR']

# drop the helper rate columns
df = merged_df.drop(columns=['USD', 'ETH', 'EUR'])
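
To make the merge idea concrete, here is a minimal runnable sketch. It assumes df_fiat is indexed by symbol (as the question's df_fiat.index check suggests) and that df has no columns already named USD, ETH or EUR. The sample data and the names rate_cols, merged and result are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    "symbol": ["ETH", "DAI", "XYZ"],   # "XYZ" has no entry in df_fiat
    "tokenBalanceFloat": [2.0, 10.0, 5.0],
})
df_fiat = pd.DataFrame(
    {"USD": [2000.0, 1.0], "ETH": [1.0, 0.0005], "EUR": [1800.0, 0.9]},
    index=["ETH", "DAI"],
)

rate_cols = ["USD", "ETH", "EUR"]

# Left merge: every row of df is kept; unmatched symbols get NaN rates
merged = df.merge(df_fiat[rate_cols], how="left",
                  left_on="symbol", right_index=True)

# Plain column arithmetic; NaN rates yield NaN values without any branching
for rate in rate_cols:
    merged[f"{rate.lower()}Value"] = merged["tokenBalanceFloat"] * merged[rate]

# Remove the helper rate columns again
result = merged.drop(columns=rate_cols)
```

Because the join is a left join, the unknown symbol simply ends up with NaN in all three value columns, which matches the else branch of the original apply.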

CodePudding user response:

It ended up being possible by using df.loc on the left side of the assignment, and by using reset_index and to_numpy() so that the values are matched by position rather than by index label.

cols = ['USD', 'ETH', 'EUR']
df[cols] = np.nan
mask = df['symbol'].isin(df_fiat.index)
# one row of rates per matching row of df, in the same order
rates = df_fiat.loc[df.loc[mask, 'symbol'], cols].reset_index(drop=True)
balances = df.loc[mask, 'tokenBalanceFloat'].reset_index(drop=True)
df.loc[mask, cols] = rates.mul(balances, axis=0).to_numpy()

df = df.rename(columns={
    'USD': 'usdValue',
    'ETH': 'ethValue',
    'EUR': 'eurValue'
})
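
As a possible alternative to the mask and rename steps, reindex can do the lookup in one go. This is only a sketch, not the approach used above. It assumes df_fiat has a unique index of symbols, and the sample data and the names rate_cols, value_cols, rates and balances are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, for illustration only
df = pd.DataFrame({
    "symbol": ["ETH", "DAI", "XYZ", "ETH"],   # "XYZ" is not in df_fiat
    "tokenBalanceFloat": [2.0, 10.0, 5.0, 1.0],
})
df_fiat = pd.DataFrame(
    {"USD": [2000.0, 1.0], "ETH": [1.0, 0.0005], "EUR": [1800.0, 0.9]},
    index=["ETH", "DAI"],
)

rate_cols = ["USD", "ETH", "EUR"]
value_cols = ["usdValue", "ethValue", "eurValue"]

# reindex returns one row of rates per row of df; unknown symbols become NaN
rates = df_fiat[rate_cols].reindex(df["symbol"]).to_numpy()

# Column vector of balances, broadcast across the three rate columns
balances = df["tokenBalanceFloat"].to_numpy()[:, None]

# Attach the products under their final names, aligned to df's own index
values = pd.DataFrame(rates * balances, columns=value_cols, index=df.index)
df = df.join(values)
```

Since reindex fills missing symbols with NaN, no mask is needed, and building the result with its final column names avoids the rename.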