Home > Back-end >  How to generate a column from the unique groupby combinations in a cumulative way?
How to generate a column from the unique groupby combinations in a cumulative way?

Time:10-03

My data looks like this:

df_dict = {
    'Year' : [2021, 2021, 2021, 2021, 2022, 2022, 2022, 2022],
    'Week of Year' : [1, 1, 2, 2, 10, 10, 11, 11]
}
  
df = pd.DataFrame(df_dict)

How can I generate a new column, say Week Order that shows the unique Year, Week of Year combinations in a cumulative way. The resulting data set will be like this:

    Year    Week of Year    Week Order
0   2021    1               1
1   2021    1               1
2   2021    2               2
3   2021    2               2
4   2022    10              3
5   2022    10              3
6   2022    11              4
7   2022    11              4

CodePudding user response:

You can use pandas.factorize :

df['Week Order'] = df.agg(tuple, axis=1).factorize()[0] 1)

# Output :

print(df)

   Year  Week of Year  Week Order
0  2021             1          1
1  2021             1          1
2  2021             2          2
3  2021             2          2
4  2022            10          3
5  2022            10          3
6  2022            11          4
7  2022            11          4

CodePudding user response:

here is one way to do it

df['week order']=1
df['week order']=df['week order'].mask(df.duplicated()).cumsum().ffill().astype(int) 
df

    Year    Week of Year    week order
0   2021               1        1
1   2021               1        1
2   2021               2        2
3   2021               2        2
4   2022              10        3
5   2022              10        3
6   2022              11        4
7   2022              11        4

CodePudding user response:

Another option, sort_values duplicated cumsum, i.e. every non duplicated Year Week increases the order by one:

cols = ['Year', 'Week of Year']
df['Week Order'] = (~df.sort_values(cols).duplicated(cols)).cumsum()

df
   Year  Week of Year  Week Order
0  2021             1           1
1  2021             1           1
2  2021             2           2
3  2021             2           2
4  2022            10           3
5  2022            10           3
6  2022            11           4
7  2022            11           4
  • Related