I have a dataframe with a column log_index that could look like this:
log_index
27
27
27
27
28
28
29
29
29
27
27
27
28
28
28
Now I want to renumber the column starting at 1, incrementing by 1 each time the value changes, so that the column is changed as follows:
log_index
27 -> 1
27 -> 1
27 -> 1
27 -> 1
28 -> 2
28 -> 2
29 -> 3
29 -> 3
29 -> 3
27 -> 4
27 -> 4
27 -> 4
28 -> 5
28 -> 5
28 -> 5
How could I achieve this in the most efficient way?
CodePudding user response:
You can use a cumsum of the boolean Series obtained by comparing each value to the previous one (with shift):
df['new'] = df['log_index'].ne(df['log_index'].shift()).cumsum()
NB. Assign the result back to df['log_index'] instead to overwrite the original column.
Output:
    log_index  new
0          27    1
1          27    1
2          27    1
3          27    1
4          28    2
5          28    2
6          29    3
7          29    3
8          29    3
9          27    4
10         27    4
11         27    4
12         28    5
13         28    5
14         28    5
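To make the two steps easier to follow, here is a minimal self-contained sketch that rebuilds the example data from the question and splits the one-liner into its intermediate parts. The variable name changed is only an illustration and is not part of the original answer.

```python
import pandas as pd

# Example data copied from the question
df = pd.DataFrame(
    {"log_index": [27, 27, 27, 27, 28, 28, 29, 29, 29, 27, 27, 27, 28, 28, 28]}
)

# Step 1: flag every row whose value differs from the row above it.
# shift() puts NaN in the first position, so the first row is always flagged.
changed = df["log_index"].ne(df["log_index"].shift())

# Step 2: a running total of the flags gives one counter value per run
# of identical consecutive values, starting at 1.
df["new"] = changed.cumsum()

print(df["new"].tolist())
# [1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5]
```

Both steps are vectorized, so no Python-level loop over the rows is needed. One caveat to keep in mind: NaN never compares equal to NaN, so consecutive missing values in the column would each start a new group.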