I am trying to calculate a percent of a percent. The columns per, per1, and per1-per are all percentages.
date per per1 per1-per
1/1/2021 0 0 0
2/1/2021 0 0 0
3/1/2021 2 3 1
4/1/2021 0 1 1
5/1/2021 5 1 -4
6/1/2021 0 0 0
My desired output would be:
date per per1 per1-per per_of_per
1/1/2021 0 0 0 0
2/1/2021 0 0 0 0
3/1/2021 2 3 1 1
4/1/2021 0 1 1 1.01
5/1/2021 5 1 -4 0.97
6/1/2021 0 0 0 0.97
What I'm trying to do: whenever the difference between per1 and per is non-zero, I want to capture the cumulative impact in the column per_of_per, in a generic way.
p.s.
- If per1-per = 0 and no non-zero value has occurred yet, then per_of_per = 0.
- For the first non-zero value, per_of_per = per1-per.
- Afterward, if per1-per = 0, per_of_per = most recent per_of_per value; otherwise per_of_per = most recent per_of_per + ((per1-per)/100).
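The rules above can be sketched as a plain row-by-row loop, which may be useful as a reference to check a vectorized version against. This is only a sketch: the helper name `per_of_per_reference` is made up for illustration, and the input is the `per1-per` column from the sample data.

```python
import pandas as pd

def per_of_per_reference(diff):
    """Apply the question's rules row by row (hypothetical helper)."""
    out = []
    current = 0.0      # most recent per_of_per
    started = False    # becomes True at the first non-zero difference
    for d in diff:
        if d != 0:
            if not started:
                current = float(d)           # first non-zero: per_of_per = per1-per
                started = True
            else:
                current = current + d / 100  # later non-zero: add (per1-per)/100
        # d == 0: stays 0 before the first non-zero, otherwise carries the last value
        out.append(current)
    return pd.Series(out, index=diff.index)

diff = pd.Series([0, 0, 1, 1, -4, 0], name="per1-per")
result = per_of_per_reference(diff).round(2)
print(result.tolist())
```

On the sample column this should reproduce the desired per_of_per values (0, 0, 1, 1.01, 0.97, 0.97).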
CodePudding user response:
Here's another way. First create a boolean Series msk that tracks whether per1-per is 0 or not (after the first non-zero value, every later row is flagged as non-zero). Then use it to assign 1 to the first non-zero row and divide the remaining values by 100; finally, take the cumulative sum:
msk = df['per1-per'].ne(0).cummax()
df['per_of_per'] = (msk.where(lambda x: x.index==x.idxmax(), False)
                    + df['per1-per'] * msk.shift(fill_value=False) / 100).cumsum()
Output:
date per per1 per1-per per_of_per
0 1/1/2021 0 0 0 0.00
1 2/1/2021 0 0 0 0.00
2 3/1/2021 2 3 1 1.00
3 4/1/2021 0 1 1 1.01
4 5/1/2021 5 1 -4 0.97
5 6/1/2021 0 0 0 0.97
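To see how the mask approach arrives at these numbers, the intermediate pieces can be printed side by side. This is a sketch that splits the idea into named steps rather than the exact one-liner; the names `first`, `steps`, and `parts` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"per1-per": [0, 0, 1, 1, -4, 0]})

# True from the first non-zero difference onward
msk = df["per1-per"].ne(0).cummax()

# True only at the first non-zero row (this contributes the starting 1)
first = msk & ~msk.shift(fill_value=False)

# Rows after the first non-zero row contribute (per1-per)/100; earlier rows contribute 0
steps = df["per1-per"].where(msk.shift(fill_value=False), 0) / 100

parts = pd.DataFrame({"msk": msk, "first": first, "steps": steps})
parts["per_of_per"] = (first + steps).cumsum().round(2)
print(parts)
```

Note that the starting value here is 1 (the boolean True), not the first non-zero difference itself; the two coincide in the sample data because the first non-zero per1-per is 1.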
CodePudding user response:
If I understand correctly, this should work for you:
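g = (df['per1-per'].shift(1).eq(0) | df['per1-per'].eq(0)).cumsum()
# Within each run: first difference of the run plus the running sum of later differences / 100
df['per_of_per'] = df.groupby(g)['per1-per'].transform('first') + df['per1-per'].mul(df['per1-per'].shift(1).ne(0).astype(int)).div(100).groupby(g).cumsum()
# Zero-difference rows carry the last computed value forward (0 before the first non-zero)
df['per_of_per'] = df['per_of_per'].where(df['per1-per'].ne(0)).ffill().fillna(0)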
Output:
>>> df
date per per1 per1-per per_of_per
0 1/1/2021 0 0 0 0.00
1 2/1/2021 0 0 0 0.00
2 3/1/2021 2 3 1 1.00
3 4/1/2021 0 1 1 1.01
4 5/1/2021 5 1 -4 0.97
5 6/1/2021 0 0 0 0.97
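The grouping key g is the least obvious part of this approach, so here is a small sketch that prints it next to the data. The names `d` and `run_start` are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({"per1-per": [0, 0, 1, 1, -4, 0]})
d = df["per1-per"]

# A new label starts at every zero row and at every row that follows a zero
g = (d.shift(1).eq(0) | d.eq(0)).cumsum()

# First difference in each run, broadcast to the whole run
run_start = d.groupby(g).transform("first")

print(pd.DataFrame({"per1-per": d, "g": g, "run_start": run_start}))
```

Each run of consecutive non-zero rows shares one label, so the grouped cumulative sum restarts at the beginning of every run.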