How to replace values of a column given a specific scheme?

I have a dataframe with a column log_index that could look like the left-hand values below.

Now, I want the numbering to start at 1, so that the column is changed as follows (old value -> new value):

log_index
27 -> 1
27 -> 1
27 -> 1
27 -> 1
28 -> 2
28 -> 2
29 -> 3
29 -> 3
29 -> 3
27 -> 4
27 -> 4
27 -> 4
28 -> 5
28 -> 5
28 -> 5

How could I achieve this in the most efficient way?

CodePudding user response:

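You can take the cumulative sum (cumsum) of the boolean Series obtained by comparing each value to the previous one (with shift). Each change of value counts as 1, so the running total numbers the consecutive runs 1, 2, 3, and so on: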

df['new'] = df['log_index'].ne(df['log_index'].shift()).cumsum()

NB. assign the result back to df['log_index'] instead of df['new'] to overwrite the original column.

Output:

    log_index  new
0          27    1
1          27    1
2          27    1
3          27    1
4          28    2
5          28    2
6          29    3
7          29    3
8          29    3
9          27    4
10         27    4
11         27    4
12         28    5
13         28    5
14         28    5
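
To make the individual steps visible, here is a minimal sketch that rebuilds the sample column from the question and splits the one-liner into its three parts. The intermediate names previous and is_new_run are illustrative only and do not appear in the original answer.

```python
import pandas as pd

# Sample data rebuilt from the values shown in the question.
df = pd.DataFrame(
    {"log_index": [27] * 4 + [28] * 2 + [29] * 3 + [27] * 3 + [28] * 3}
)

# Step 1: the value from the previous row (NaN for the first row).
previous = df["log_index"].shift()

# Step 2: True wherever a value differs from the one before it, i.e. at the
# start of each run. The first row is True because 27 != NaN.
is_new_run = df["log_index"].ne(previous)

# Step 3: the cumulative sum of the booleans numbers the runs 1, 2, 3, ...
df["new"] = is_new_run.cumsum()

print(df["new"].tolist())
# [1, 1, 1, 1, 2, 2, 3, 3, 3, 4, 4, 4, 5, 5, 5]
```

Because the comparison is only against the previous row, a value that reappears later (27 at row 9) starts a new run rather than reusing its earlier number, which matches the scheme asked for in the question.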