I have a dataframe like the one below:
store_id | Rank | 3-Modulo-of-Rank | Modulo_Group |
---|---|---|---|
2345 | 1 | 1 | G1 |
123 | 2 | 2 | G2 |
324 | 3 | 0 | G3 |
241 | 4 | 1 | G1 |
111 | 5 | 2 | G2 |
124 | 6 | 0 | G3 |
This dataframe is sorted in order of rank.
I would like to assign a key group to every 3 consecutive rows (one G1, G2, G3 cycle), like below:
store_id | Rank | 3-Modulo-of-Rank | Modulo_Group | Key group |
---|---|---|---|---|
2345 | 1 | 1 | G1 | K1 |
123 | 2 | 2 | G2 | K1 |
324 | 3 | 0 | G3 | K1 |
241 | 4 | 1 | G1 | K2 |
111 | 5 | 2 | G2 | K2 |
124 | 6 | 0 | G3 | K2 |
-- | -- | - | - | K3 |
etc.
CodePudding user response:
Use groupby with cumcount:
>>> df['Key group'] = 'K' + df.groupby('Modulo_Group').cumcount().add(1).astype(str)
Output:
>>> df
   store_id  Rank  3-Modulo-of-Rank  Modulo_Group  Key group
0      2345     1                 1            G1         K1
1       123     2                 2            G2         K1
2       324     3                 0            G3         K1
3       241     4                 1            G1         K2
4       111     5                 2            G2         K2
5       124     6                 0            G3         K2
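
For anyone who wants to reproduce this, here is a minimal self-contained sketch. It rebuilds the sample frame from the question (the way the modulo columns are derived from Rank is my assumption, inferred from the table), applies the cumcount approach, and also shows a possible alternative that derives the key directly from Rank. The name key_from_rank is only illustrative, and that alternative assumes Rank is consecutive and starts at 1.

```python
import pandas as pd

# Sample data from the question; the two modulo columns are derived from Rank
# (assumption: this mirrors how they were built originally).
df = pd.DataFrame({
    "store_id": [2345, 123, 324, 241, 111, 124],
    "Rank": [1, 2, 3, 4, 5, 6],
})
df["3-Modulo-of-Rank"] = df["Rank"] % 3
df["Modulo_Group"] = df["3-Modulo-of-Rank"].map({1: "G1", 2: "G2", 0: "G3"})

# cumcount numbers the rows inside each Modulo_Group starting at 0,
# so the n-th G1, G2 and G3 rows all receive the same counter value.
df["Key group"] = "K" + df.groupby("Modulo_Group").cumcount().add(1).astype(str)

# Alternative sketch (assumes Rank is consecutive and starts at 1):
# integer-divide the zero-based rank by the group size.
key_from_rank = "K" + ((df["Rank"] - 1) // 3 + 1).astype(str)

print(df)
```

The cumcount version only relies on the order of rows within each Modulo_Group, so it should still work if Rank has gaps, as long as the frame is sorted by rank.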