I have a dataframe as follows:
df =
   col_1  val_1
0    4.0   0.89
1    4.0   0.56
2   49.0   0.70
3   49.0   1.23
4   49.0   0.80
5   52.0   0.50
6   52.0   0.20
I want to calculate the sum of the column val_1 with a penalising factor that depends on changes in the values of col_1.
For example: whenever the value in col_1 changes, the val_1 value of the last row before the change is reduced by a penalising factor of 0.4. Here 0.56 is penalised because col_1 changes from 4.0 to 49.0, and 0.8 is penalised because col_1 changes from 49.0 to 52.0:
sum = 0.89 + (0.56 - 0.4) + 0.7 + 1.23 + (0.8 - 0.4) + 0.5 + 0.2
sum = 4.08
Is there a way to do this?
CodePudding user response:
Use np.where to assign a new column, and use .shift() to compare each row with the next one to detect changes.
import numpy as np
# Subtract 0.4 where the next row's col_1 differs; ffill() keeps the last row unpenalised.
df['val_1_adj'] = np.where(df['col_1'].ne(df['col_1'].shift(-1).ffill()),
                           df['val_1'].sub(0.4),
                           df['val_1'])
print(df)
   col_1  val_1  val_1_adj
0    4.0   0.89       0.89
1    4.0   0.56       0.16
2   49.0   0.70       0.70
3   49.0   1.23       1.23
4   49.0   0.80       0.40
5   52.0   0.50       0.50
6   52.0   0.20       0.20
df['val_1_adj'].sum()
4.08
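To see which rows get penalised, it may help to build the mask step by step. The following is only a sketch that recreates the question's dataframe by hand; the names next_col and is_last_of_run are made up for illustration, and it uses an explicit notna() check instead of ffill() to leave the last row alone.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "col_1": [4.0, 4.0, 49.0, 49.0, 49.0, 52.0, 52.0],
    "val_1": [0.89, 0.56, 0.7, 1.23, 0.8, 0.5, 0.2],
})

# col_1 of the following row; the last row has no successor, so it gets NaN.
next_col = df["col_1"].shift(-1)

# True where a following row exists and its col_1 differs from this row's.
is_last_of_run = next_col.notna() & df["col_1"].ne(next_col)
print(is_last_of_run.tolist())
# [False, True, False, False, True, False, False]

# Keep val_1 where the mask is False, otherwise use val_1 - 0.4, then sum.
total = df["val_1"].where(~is_last_of_run, df["val_1"] - 0.4).sum()
print(round(total, 2))
```

The rounding is only there because floating-point addition may not land exactly on 4.08.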
CodePudding user response:
Slight variation on @UmarH's answer
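import numpy as np
# diff(-1) is the current row minus the next one: 0 means no change, NaN marks the last row.
df['penalties'] = np.where(~df.col_1.diff(-1).isin([0, np.nan]), 0.4, 0)
my_sum = (df['val_1'] - df['penalties']).sum()
print(my_sum)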
Output:
4.08
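If only the final total is needed, another option is to skip the extra column: the penalised sum equals the plain sum minus 0.4 times the number of changes. This is a rough sketch, not taken from either answer; penalised_sum and its parameters are hypothetical names, and it assumes col_1 has no missing values (NaN would be counted as a change).

```python
import pandas as pd

def penalised_sum(df, key="col_1", value="val_1", penalty=0.4):
    """Sum `value`, subtracting `penalty` once per change in `key`."""
    # Compare each row with the previous one; drop the first row, which has no predecessor.
    changes = df[key].ne(df[key].shift()).iloc[1:].sum()
    return df[value].sum() - penalty * changes

# Sample data copied from the question.
df = pd.DataFrame({
    "col_1": [4.0, 4.0, 49.0, 49.0, 49.0, 52.0, 52.0],
    "val_1": [0.89, 0.56, 0.7, 1.23, 0.8, 0.5, 0.2],
})

result = penalised_sum(df)
print(round(result, 2))  # approximately 4.08
```

Counting changes against the previous row gives the same total as penalising the row before each change, since every change is counted exactly once either way.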