I need to operate a column with an IF as shown in my code. It takes quite a time to compute, is there a faster, cleaner way to do this?
For reference, the column "coin" have pairs like "ETH_ARS", "DAI_USD" and so on, that´s why I split it.
for i in range(merged.shape[0]):
x = merged["coin"].iloc[i]
if x.split("_")[1] == "ARS":
merged["total"].iloc[i] = (
merged["price"].iloc[i]
* merged["amount"].iloc[i]
/ merged["valueUSD"].iloc[i]
)
else:
merged["total"].iloc[i] = merged["price"].iloc[i] * merged["amount"].iloc[i]
CodePudding user response:
You can vectorize your code. The trick here is to set valueUSD=1
when coin
column ends with USD
. After that the operation is the same for all rows: total = price * amount / valueUSD
.
Setup a MRE:
data = {'coin': ['ETH_ARS', 'DAI_USD'],
'price': [10, 12],
'amount': [3, 4],
'valueUSD': [2, 7]}
df = pd.DataFrame(data)
print(df)
# Output:
coin price amount valueUSD
0 ETH_ARS 10 3 2
1 DAI_USD 12 4 7 # <- should be set to 1 for division
valueUSD = df['valueUSD'].mask(df['coin'].str.split('_').str[1].eq('USD'), other=1)
df['total'] = df['price'] * df['amount'] / valueUSD
print(df)
# Output:
coin price amount valueUSD total
0 ETH_ARS 10 3 2 15.0 # = 10 * 3 / 2
1 DAI_USD 12 4 7 48.0 # = 10 * 3 / 1 (7 -> 1)
To do that, use mask
and replace NaN
by 1 instead of the valueUSD
:
>>> valueUSD
0 2
1 1 # 7 -> 1
Name: valueUSD, dtype: int64