Calculation on percentage of a percentage

Time:03-24

I am trying to calculate a percentage of a percentage. The columns per, per1, and per1-per are all percentages.

date      per    per1   per1-per
1/1/2021   0      0        0
2/1/2021   0      0        0
3/1/2021   2      3        1
4/1/2021   0      1        1
5/1/2021   5      1       -4
6/1/2021   0      0        0

My desired output would be:

date      per    per1   per1-per  per_of_per
1/1/2021   0      0        0          0
2/1/2021   0      0        0          0
3/1/2021   2      3        1          1 
4/1/2021   0      1        1          1.01
5/1/2021   5      1       -4          0.97
6/1/2021   0      0        0          0.97

Whenever the difference between per1 and per is non-zero, I want to capture its cumulative impact in the column per_of_per, in a generic manner.

p.s.

  • Until the first non-zero value, if per1-per = 0 then per_of_per = 0
  • For the first non-zero value, per_of_per = per1-per
  • Afterward, if per1-per = 0 then per_of_per = most recent per_of_per value; otherwise per_of_per = most recent per_of_per + ((per1-per)/100)
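
For reference, the three rules above can also be written as a plain row-by-row loop. This is only a minimal sketch: the helper name per_of_per_loop is made up for illustration, and it assumes the data sits in a pandas DataFrame df with a per1-per column as in the table above.

```python
import pandas as pd

def per_of_per_loop(diff):
    """Hypothetical helper: apply the three rules row by row to a per1-per Series."""
    out = []
    current = 0.0      # most recent per_of_per value (0 before the first non-zero row)
    started = False    # becomes True at the first non-zero difference
    for d in diff:
        if d != 0:
            if not started:
                current = float(d)   # first non-zero value: per_of_per = per1-per
                started = True
            else:
                current += d / 100   # later non-zero value: add (per1-per)/100
        # when d == 0, the most recent value is kept unchanged
        out.append(current)
    return pd.Series(out, index=diff.index)

# Sample data from the question (only the column the rules depend on)
df = pd.DataFrame({
    "date": ["1/1/2021", "2/1/2021", "3/1/2021", "4/1/2021", "5/1/2021", "6/1/2021"],
    "per1-per": [0, 0, 1, 1, -4, 0],
})
df["per_of_per"] = per_of_per_loop(df["per1-per"])
print(df)
```

A loop like this is slow on large frames, but it is handy as a reference to check a vectorized solution against.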

CodePudding user response:

Here's another way. First create a boolean Series msk that tracks whether per1-per is 0 or not (after the first occurrence of non-zero, flag the rest as non-zero). Then use it to assign 1 to the first non-zero value and divide the remaining by 100; finally, find cumulative sum:

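# True from the first non-zero difference onward
msk = df['per1-per'].ne(0).cummax()
# 1 at the first non-zero row, (per1-per)/100 on every later row, then accumulate
df['per_of_per'] = (msk.where(lambda x: x.index==x.idxmax(), False) +
                    df['per1-per'] * msk.shift(fill_value=False) / 100).cumsum()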

Output:

       date  per  per1  per1-per  per_of_per
0  1/1/2021    0     0         0        0.00
1  2/1/2021    0     0         0        0.00
2  3/1/2021    2     3         1        1.00
3  4/1/2021    0     1         1        1.01
4  5/1/2021    5     1        -4        0.97
5  6/1/2021    0     0         0        0.97
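
To see why this works, it can help to look at the intermediate pieces separately. The sketch below is an equivalent breakdown rather than the exact expression from the answer: the names first and later are only for illustration, and it assumes the sample per1-per column from the question.

```python
import pandas as pd

df = pd.DataFrame({"per1-per": [0, 0, 1, 1, -4, 0]})

# True from the first non-zero difference onward
msk = df["per1-per"].ne(0).cummax()
# The same mask moved down one row: True only strictly after the first non-zero row
after_first = msk.shift(fill_value=False)

# 1 at the first non-zero row, 0 everywhere else
first = (msk & ~after_first).astype(int)
# (per1-per)/100 on the rows after the first non-zero row, 0 elsewhere
later = df["per1-per"].where(after_first, 0) / 100

df["per_of_per"] = (first + later).cumsum()
print(df)
```

Note that the first non-zero row contributes the constant 1 (the boolean True), which coincides with per1-per in the sample data because that value happens to be 1.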

CodePudding user response:

If I understand correctly, this should work for you:

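import numpy as np
g = (df['per1-per'].shift(1).eq(0) | df['per1-per'].eq(0)).cumsum()
# first value of each group + running sum of the later differences / 100
df['per_of_per'] = df.groupby(g)['per1-per'].transform('first') + df['per1-per'].mul(df['per1-per'].shift(1).ne(0).astype(int)).div(100).groupby(g).cumsum()
# carry the most recent value forward over zero rows; leading rows stay 0
df['per_of_per'] = df['per_of_per'].replace(0, np.nan).ffill().fillna(0)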

Output:

>>> df
       date  per  per1  per1-per  per_of_per
0  1/1/2021    0     0         0        0.00
1  2/1/2021    0     0         0        0.00
2  3/1/2021    2     3         1        1.00
3  4/1/2021    0     1         1        1.01
4  5/1/2021    5     1        -4        0.97
5  6/1/2021    0     0         0        0.97
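
As a quick check on the sample data, the grouping key g and the first value of each group can be inspected directly. This is a small sketch assuming the per1-per column from the question; the variable name first_in_group is only for illustration.

```python
import pandas as pd

df = pd.DataFrame({"per1-per": [0, 0, 1, 1, -4, 0]})

# A new group starts at every zero row and at the row right after a zero
g = (df["per1-per"].shift(1).eq(0) | df["per1-per"].eq(0)).cumsum()
# First per1-per value within each group, broadcast back to every row
first_in_group = df.groupby(g)["per1-per"].transform("first")

print(g.tolist())               # [1, 2, 3, 3, 3, 4]
print(first_in_group.tolist())  # [0, 0, 1, 1, 1, 0]
```

Rows 2 to 4 share one group, so the running sum of (per1-per)/100 accumulates only inside that non-zero streak, while each zero row forms a group of its own.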