How to keep counting even though the values start at 1 again

Time:11-16

My df looks as follows:

import pandas as pd
d = {'col1': [1, 2, 3, 3, 1, 2, 2, 3, 4, 1, 1, 2]}
df = pd.DataFrame(data=d)

Now I want to add a new column that follows this pattern:

col1 new_col
1 1
2 2
3 3
3 3
1 4
2 5
2 5
3 6
4 7
1 8
1 8
2 9

When col1 starts again at 1, new_col should just keep counting. Consecutive repeats of the same value share one number.

So far I have only added a column with the row-to-row difference:

df['diff'] = df['col1'].diff()

How can I extend this approach?

CodePudding user response:

Try flagging every row whose difference from the previous row is non-zero, then taking the cumulative sum:

df.col1.diff().ne(0).cumsum()
Out[94]: 
0     1
1     2
2     3
3     3
4     4
5     5
6     5
7     6
8     7
9     8
10    8
11    9
Name: col1, dtype: int32
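
To see why the one-liner gives the desired numbering, here is a minimal sketch that splits it into its three steps on the question's data. The intermediate names `step`, `changed` and `new_col` are only for illustration and do not appear in the answer itself.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3, 3, 1, 2, 2, 3, 4, 1, 1, 2]})

# Row-to-row difference; the first row has no predecessor, so it is NaN.
step = df["col1"].diff()

# True wherever the value changed. NaN != 0 also evaluates to True,
# which is why the count starts at 1 rather than 0.
changed = step.ne(0)

# Running total of the change flags; repeated values add 0 and keep their number.
new_col = changed.cumsum()

print(new_col.tolist())  # [1, 2, 3, 3, 4, 5, 5, 6, 7, 8, 8, 9]
```

The integer dtype of the result (int32 in the output above, int64 on many other setups) depends on the platform and library versions, so it is safer not to rely on it.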

CodePudding user response:

Try:

df["new_col"] = df["col1"].ne(df["col1"].shift()).cumsum()

>>> df
    col1  new_col
0      1        1
1      2        2
2      3        3
3      3        3
4      1        4
5      2        5
6      2        5
7      3        6
8      4        7
9      1        8
10     1        8
11     2        9
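
As a quick check, the sketch below compares this shift-based version with the diff-based one on the question's data. It also tries the shift-based comparison on a small column of strings, where subtraction is not available. The `labels` series is a hypothetical example and not part of the question.

```python
import pandas as pd

df = pd.DataFrame({"col1": [1, 2, 3, 3, 1, 2, 2, 3, 4, 1, 1, 2]})

# Both approaches number the runs of equal consecutive values.
via_diff = df["col1"].diff().ne(0).cumsum()
via_shift = df["col1"].ne(df["col1"].shift()).cumsum()
print(via_diff.tolist() == via_shift.tolist())  # True

# Hypothetical non-numeric column: comparing with the shifted series
# only needs an inequality test, so it does not require arithmetic.
labels = pd.Series(["a", "a", "b", "a", "a"], name="label")
label_runs = labels.ne(labels.shift()).cumsum()
print(label_runs.tolist())  # [1, 1, 2, 3, 3]
```

For a numeric column like col1 either form should do; the comparison with `shift()` is the more general of the two.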