How to keep counting when the values start at 1 again

My df looks as follows:

import pandas as pd
d = {'col1': [1, 2, 3, 3, 1, 2, 2, 3, 4, 1, 1, 2]}
df = pd.DataFrame(data=d)

Now I want to add a new column that follows this pattern:

col1	new_col
1	1
2	2
3	3
3	3
1	4
2	5
2	5
3	6
4	7
1	8
1	8
2	9

Once col1 starts again at 1, the new column should just keep counting.

So far I have only added a column with the difference between consecutive rows:

df['diff'] = df['col1'].diff()

How can I extend this approach?

CodePudding user response：

Try with

df.col1.diff().ne(0).cumsum()
Out[94]: 
0     1
1     2
2     3
3     3
4     4
5     5
6     5
7     6
8     7
9     8
10    8
11    9
Name: col1, dtype: int32
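
To see why this one-liner produces the counter, it may help to split it into its three steps. The following is only a sketch built on the question's sample data; the intermediate names `step` and `changed` are made up for illustration and are not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3, 3, 1, 2, 2, 3, 4, 1, 1, 2]})

# Step 1: difference to the previous row. The first row has no
# predecessor, so its difference is NaN.
step = df["col1"].diff()

# Step 2: True wherever the value changed. NaN != 0 evaluates to True,
# so the first row always counts as the start of a new run.
changed = step.ne(0)

# Step 3: the running sum of the True flags is the counter.
df["new_col"] = changed.cumsum()

print(df["new_col"].tolist())
```

Repeated values give a difference of 0, so they add nothing to the running sum and keep the number of the row above them.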

CodePudding user response：

Try:

df["new_col"] = df["col1"].ne(df["col1"].shift()).cumsum()

>>> df
    col1  new_col
0      1        1
1      2        2
2      3        3
3      3        3
4      1        4
5      2        5
6      2        5
7      3        6
8      4        7
9      1        8
10     1        8
11     2        9
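
One possible reason to prefer the shift-based comparison: it only tests whether neighbouring values are equal, so it should also work on columns where subtraction is not defined, such as strings. A small sketch under that assumption, with a made-up `labels` frame that is not part of the question:

```python
import pandas as pd

# Hypothetical string data, only for illustration.
labels = pd.DataFrame({"col1": ["a", "b", "b", "a", "a", "c"]})

# Compare each value with the one in the row above. The first row is
# compared with NaN, which is never equal, so it starts the first run.
is_new_run = labels["col1"].ne(labels["col1"].shift())

# Each True marks the start of a new run; cumsum numbers the runs.
labels["new_col"] = is_new_run.cumsum()

print(labels["new_col"].tolist())  # [1, 2, 2, 3, 3, 4]
```

On the numeric column from the question, both variants give the same result.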