Home > OS >  Faster way to operate columns with if conditions
Faster way to operate columns with if conditions

Time:12-08

I need to operate a column with an IF as shown in my code. It takes quite a time to compute, is there a faster, cleaner way to do this?

For reference, the column "coin" have pairs like "ETH_ARS", "DAI_USD" and so on, that´s why I split it.

for i in range(merged.shape[0]):
    x = merged["coin"].iloc[i]
    if x.split("_")[1] == "ARS":
        merged["total"].iloc[i] = (
            merged["price"].iloc[i]
            * merged["amount"].iloc[i]
            / merged["valueUSD"].iloc[i]
        )
    else:
        merged["total"].iloc[i] = merged["price"].iloc[i] * merged["amount"].iloc[i]

CodePudding user response:

You can vectorize your code. The trick here is to set valueUSD=1 when coin column ends with USD. After that the operation is the same for all rows: total = price * amount / valueUSD.

Setup a MRE:

data = {'coin': ['ETH_ARS', 'DAI_USD'],
        'price': [10, 12],
        'amount': [3, 4],
        'valueUSD': [2, 7]}
df = pd.DataFrame(data)
print(df)

# Output:
      coin  price  amount  valueUSD
0  ETH_ARS     10       3         2
1  DAI_USD     12       4         7  # <- should be set to 1 for division
valueUSD = df['valueUSD'].mask(df['coin'].str.split('_').str[1].eq('USD'), other=1)
df['total'] = df['price'] * df['amount'] / valueUSD
print(df)

# Output:
      coin  price  amount  valueUSD  total
0  ETH_ARS     10       3         2   15.0  # = 10 * 3 / 2
1  DAI_USD     12       4         7   48.0  # = 10 * 3 / 1    (7 -> 1)

To do that, use mask and replace NaN by 1 instead of the valueUSD:

>>> valueUSD
0    2
1    1  # 7 -> 1
Name: valueUSD, dtype: int64
  • Related