I have a series that I want to cumsum, but the running total should start over every time I hit a 0, something like this:
index | orig | wanted result |
---|---|---|
0 | 0 | 0 |
1 | 1 | 1 |
2 | 1 | 2 |
3 | 1 | 3 |
4 | 1 | 4 |
5 | 1 | 5 |
6 | 1 | 6 |
7 | 0 | 0 |
8 | 1 | 1 |
9 | 1 | 2 |
10 | 1 | 3 |
11 | 0 | 0 |
12 | 1 | 1 |
13 | 1 | 2 |
14 | 1 | 3 |
15 | 1 | 4 |
16 | 1 | 5 |
17 | 1 | 6 |
Any ideas? (pandas, pure Python, other)
CodePudding user response:
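Use `df['orig'].eq(0).cumsum()` to generate groups that start on each 0, then `cumcount` to get the increasing values within each group. Note that `cumcount` counts rows rather than summing values, so it matches a true cumsum here only because every non-zero value is 1: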
df['result'] = df.groupby(df['orig'].eq(0).cumsum()).cumcount()
output:
orig wanted result result
0 0 0 0
1 1 1 1
2 1 2 2
3 1 3 3
4 1 4 4
5 1 5 5
6 1 6 6
7 0 0 0
8 1 1 1
9 1 2 2
10 1 3 3
11 0 0 0
12 1 1 1
13 1 2 2
14 1 3 3
15 1 4 4
16 1 5 5
17 1 6 6
Intermediate:
df['orig'].eq(0).cumsum()
0 1
1 1
2 1
3 1
4 1
5 1
6 1
7 2
8 2
9 2
10 2
11 3
12 3
13 3
14 3
15 3
16 3
17 3
Name: orig, dtype: int64
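For anyone who wants to run this end to end, here is a minimal sketch. The `orig` values are copied from the table in the question; the name `group_id` is only an illustrative choice and not part of the original answer.

```python
import pandas as pd

# Sample data reconstructed from the table in the question.
orig = [0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1]
df = pd.DataFrame({"orig": orig})

# Each 0 starts a new group: the running count of zeros acts as the group id.
group_id = df["orig"].eq(0).cumsum()

# cumcount numbers the rows inside each group starting from 0. This equals
# the restarting cumsum only because every non-zero value here is 1.
df["result"] = df.groupby(group_id).cumcount()

print(df["result"].tolist())
```

If the series does not begin with a 0, the rows before the first 0 simply form their own group (id 0) and are counted from 0 as well.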
CodePudding user response:
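import pandas as pd
# True at every 0, i.e. wherever the running sum should restart
condition = df['orig'].eq(0)
# Running count of zeros: a group id that increases at each reset
df['reset'] = condition.cumsum()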
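The snippet above stops once the group key is built. One possible way to finish it, sketched here as an assumption rather than the answerer's confirmed intent, is to take a grouped `cumsum` over that key, which also works when the non-zero values are not all 1. The sample data and the helper `cumsum_reset_on_zero` are hypothetical; the helper is included because the question also asked about pure Python.

```python
import pandas as pd


def cumsum_reset_on_zero(values):
    """Plain-Python running sum that restarts at every 0 (hypothetical helper)."""
    out, total = [], 0
    for v in values:
        # A 0 resets the running total; anything else is added to it.
        total = 0 if v == 0 else total + v
        out.append(total)
    return out


# Hypothetical data with values other than 0 and 1.
df = pd.DataFrame({"orig": [0, 2, 3, 0, 5, 1, 1, 0, 4]})

# Group id that increases at each 0, as in the snippet above.
reset = df["orig"].eq(0).cumsum()

# Summing within each group gives a cumsum that restarts at every 0.
df["result"] = df["orig"].groupby(reset).cumsum()

# Both approaches should agree on the same input.
print(df["result"].tolist())
print(cumsum_reset_on_zero(df["orig"].tolist()))
```

The grouped `cumsum` stays vectorized, which tends to matter on large frames, while the loop is easier to adapt if the reset condition becomes something other than "equals 0".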