Home > Net >  Create a new column that starts counting from 0 until the value in another column changes
Create a new column that starts counting from 0 until the value in another column changes

Time:06-21

I have a dataframe df that looks like this:

ID Months Borrow_Rank
1 0 1
1 1 1
1 2 1
2 0 1
2 1 1
2 2 1
3 0 2
3 1 2
4 0 1
4 1 1

I want to create a new variable Months_Adjusted that starts counting from 0 for as long as Borrow_Rank remains the same.

ID Months Borrow_Rank Months_Adjusted
1 0 1 0
1 1 1 1
1 2 1 2
2 0 1 3
2 1 1 4
2 2 1 5
3 0 2 0
3 1 2 1
4 0 1 0
4 1 1 1

Thank you all and I apologise if I could have written the question better. This is my first post.

CodePudding user response:

import pandas as pd

df = pd.DataFrame({'Borrow_Rank':[1,1,1,1,1,1,1,2,2,2,2,2,3,3,3,1,1,1]})
selector = (df['Borrow_Rank'] != df['Borrow_Rank'].shift()).cumsum()
df['Months_Adjusted'] = df.groupby(selector).cumcount()

CodePudding user response:

Here is a solution using itertools.groupby -

from itertools import groupby
df['Months_Adjusted'] = pd.concat(
          [pd.Series(range(len(list(g)))) 
           for k, g in groupby(df['Borrow_Rank'])],
ignore_index=True)

Output

   ID  Months  Borrow_Rank  Months_Adjusted
0   1       0            1                0
1   1       1            1                1
2   1       2            1                2
3   2       0            1                3
4   2       1            1                4
5   2       2            1                5
6   3       0            2                0
7   3       1            2                1
8   4       0            1                0
9   4       1            1                1
  • Related