I have a Pandas dataframe containing a series of numbers:
df = pd.DataFrame({'deduction':[10,60,70,50,60,10,10,60,60,20,50,20,10,90,60,70,30,50,40,60]})
deduction
0 10
1 60
2 70
3 50
4 60
5 10
6 10
7 60
8 60
9 20
10 50
11 20
12 10
13 90
14 60
15 70
16 30
17 50
18 40
19 60
I would like to compute the cumulative difference of these numbers, starting from a larger number (i.e. <base_number> - 10 - 60 - 70 - 50 - ...
).
My current solution is to negate all the numbers, prepend the (positive) larger number to the dataframe, and then call cumsum()
:
# Compact:
(-df['deduction'][::-1]).append(pd.Series([start_value], index=[-1]))[::-1].cumsum().reset_index(drop=True)
# Expanded:
total_series = (
# Negate
(-df['deduction']
# Reverse
[::-1])
# Add the base value to the end
.append(pd.Series([start_value]))
# Reverse again (to put the base value at the beginning)
[::-1]
# Calculate cumulative sum (all the values except the first are negative, so this will work)
.cumsum()
# Clean up
.reset_index(drop=True)
)
But I was wondering if there were possible a shorter solution, that didn't append to the series (I hear that that's bad practice).
(It doesn't need to be put in a dataframe; a series, like I've done above, will be alright.)
CodePudding user response:
df['total'] = start_value - df["deduction"].cumsum()
if you need the start value at the beginning of the series then shift and insert (there's a few ways to do it, and this is one of them)
df['total'] = -df["deduction"].shift(1).fillna(-start_value).cumsum()