Pandas groupby with each group treated as a unique group to get cumsum of another column

Time:02-21

This is similar to the question "Pandas groupby with each group treated as a unique group", but instead of the cumsum of df['dir'] itself, I need the cumsum of another column, restarting each time the boolean (1 or 0) in df['dir'] changes.

My desired output: a new cumulative sum of new_exist that restarts whenever the boolean changes.

import pandas as pd

df = pd.DataFrame({"dir": [1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
                   "grp": [1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3],
                   "new_exist": [0.1, 0.2, 0.5, 1, 1, 2, 1, 1, 2, 2, 4, 1, 2, 3],
                   "new_exist_cum": [0.1, 0.3, 0.8, 1.8, 1, 3, 4, 1, 3, 5, 4, 5, 7, 10]})

Answer:

If I understand correctly, build a custom group key that increments each time the value of "dir" changes, then group by that key and take the cumsum of the other column (here "col2"):

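# Compare each row's "dir" with the previous row; cumsum of the change flags labels each consecutive run
group = df['dir'].ne(df['dir'].shift()).cumsum()

# Cumulative sum of "col2" that restarts at the start of every run
df['new'] = df.groupby(group)['col2'].cumsum()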

Output:

    dir  col2  new
0     1     0    0
1     1     1    1
2     1     2    3
3     1     3    6
4     0     4    4
5     0     5    9
6     0     6   15 # 4 + 5 + 6 = 15
7     1     7    7 # new group, restart cumsum
8     1     8   15
9     1     9   24
10    1    10   34
11    0    11   11
12    0    12   23
13    0    13   36
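
For reference, here is a minimal sketch of the same approach applied to the data from the question, with "new_exist" in place of "col2". The name run_id is only an illustrative label for the group key, and the hand-typed "new_exist_cum" column is left out so that it can be computed instead:

```python
import pandas as pd

# Data from the question (without the hand-typed "new_exist_cum" column).
df = pd.DataFrame({
    "dir": [1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 0],
    "grp": [1, 2, 3, 4, 1, 2, 3, 1, 2, 3, 4, 1, 2, 3],
    "new_exist": [0.1, 0.2, 0.5, 1, 1, 2, 1, 1, 2, 2, 4, 1, 2, 3],
})

# ne(shift()) is True wherever "dir" differs from the previous row
# (the first row is always True because shift() leaves NaN there);
# cumsum turns those change flags into an id for each consecutive run.
run_id = df["dir"].ne(df["dir"].shift()).cumsum()

# Running total of "new_exist" that restarts at the start of every run.
df["new_exist_cum"] = df.groupby(run_id)["new_exist"].cumsum()

# Show the run id next to the result to make the grouping visible.
print(df.assign(run_id=run_id))
```

Assuming the total should restart at every change of "dir", the last four rows come out as 9, 1, 3, 6, which differs from the 4, 5, 7, 10 typed in the question's "new_exist_cum" column.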