Home > Software design >  Cumulative difference of numbers starting from an initial value
Cumulative difference of numbers starting from an initial value

Time:11-11

I have a Pandas dataframe containing a series of numbers:

df = pd.DataFrame({'deduction':[10,60,70,50,60,10,10,60,60,20,50,20,10,90,60,70,30,50,40,60]})

    deduction
0          10
1          60
2          70
3          50
4          60
5          10
6          10
7          60
8          60
9          20
10         50
11         20
12         10
13         90
14         60
15         70
16         30
17         50
18         40
19         60

I would like to compute the cumulative difference of these numbers, starting from a larger number (i.e. <base_number> - 10 - 60 - 70 - 50 - ...).

My current solution is to negate all the numbers, prepend the (positive) larger number to the dataframe, and then call cumsum():

# Compact:
(-df['deduction'][::-1]).append(pd.Series([start_value], index=[-1]))[::-1].cumsum().reset_index(drop=True)

# Expanded:
total_series = (
    # Negate
    (-df['deduction']
    # Reverse
    [::-1])
    # Add the base value to the end
    .append(pd.Series([start_value]))
    # Reverse again (to put the base value at the beginning)
    [::-1]
    # Calculate cumulative sum (all the values except the first are negative, so this will work)
    .cumsum()
    # Clean up
    .reset_index(drop=True)
)

But I was wondering if there were possible a shorter solution, that didn't append to the series (I hear that that's bad practice).

(It doesn't need to be put in a dataframe; a series, like I've done above, will be alright.)

CodePudding user response:

df['total'] = start_value - df["deduction"].cumsum() 

if you need the start value at the beginning of the series then shift and insert (there's a few ways to do it, and this is one of them)

df['total'] = -df["deduction"].shift(1).fillna(-start_value).cumsum()
  • Related