How to replace values of a column given a specific scheme?

Time:01-13

I have a dataframe with a column log_index that could look like this:

log_index
27
27
27
27
28
28
29
29
29
27
27
27
28
28
28

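Now I want to renumber the column starting at 1, incrementing the number each time the value changes from one row to the next, so that the column is changed as follows: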

log_index
27 -> 1
27 -> 1
27 -> 1
27 -> 1
28 -> 2
28 -> 2
29 -> 3
29 -> 3
29 -> 3
27 -> 4
27 -> 4
27 -> 4
28 -> 5
28 -> 5
28 -> 5

How can I achieve this in the most efficient way?

CodePudding user response:

You can compare each value to the previous one (obtained with shift), which gives a boolean Series that is True wherever a new run starts, and then take the cumsum of that Series:

df['new'] = df['log_index'].ne(df['log_index'].shift()).cumsum()

NB. Assign the result back to df['log_index'] instead of df['new'] if you want to overwrite the original column.

Output:

    log_index  new
0          27    1
1          27    1
2          27    1
3          27    1
4          28    2
5          28    2
6          29    3
7          29    3
8          29    3
9          27    4
10         27    4
11         27    4
12         28    5
13         28    5
14         28    5
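
For anyone who wants to reproduce this, here is a minimal, self-contained sketch of the same approach split into two steps. The sample data is rebuilt from the question, and the intermediate name starts_new_run is only illustrative, not part of the original answer:

```python
import pandas as pd

# Sample data reconstructed from the question.
df = pd.DataFrame(
    {"log_index": [27, 27, 27, 27, 28, 28, 29, 29, 29, 27, 27, 27, 28, 28, 28]}
)

# True wherever the value differs from the row above. The first row is
# always True because shift() places NaN there.
starts_new_run = df["log_index"].ne(df["log_index"].shift())

# Summing the booleans cumulatively numbers the runs 1, 2, 3, ...
df["new"] = starts_new_run.cumsum()

print(df)
```

Note that NaN never compares equal to NaN, so if the column contains missing values, each NaN row would start its own run with this approach.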