Pandas - Sort dataframe but limit the amount in every page for a specific element in another column

Time:10-28

I was wondering if there's a way to sort a dataframe by a numeric value while keeping at most X occurrences per page of each value in another column. For example, let's say I want to use a dataframe as a catalog and paginate it (so every page would have 5 items), and within every 5 items I need at most 2 items with the same value in the categorical column.

product     seller
 10         seller1
 9          seller1
 8          seller2
 7          seller2
 6          seller2
 5          seller3

And then I would want something like:

product     seller
 10         seller1
 9          seller1
 8          seller2
 7          seller2
 5          seller3
 6          seller2

The last two rows swap places because within the fixed rows 1-5 ("page 1"), seller2 already has 2 items.

CodePudding user response:

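Let us try with cumcount: number each seller's rows, integer-divide that count by 2, and sort on the result so that every seller's third and later rows are pushed behind the first two. This assumes df is already sorted by product in descending order.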

# kind='stable' preserves the existing product order among rows with equal keys
out = df.sort_values(by='seller', key=lambda s: df.groupby('seller').cumcount() // 2, kind='stable')
print(out)
   product   seller
0       10  seller1
1        9  seller1
2        8  seller2
3        7  seller2
5        5  seller3
4        6  seller2
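
The cumcount key above does not look at the page size, so it may not hold the "at most 2 per page" rule on larger catalogs. As a rough sketch only (not part of the original answer), a greedy pass can fill one page at a time and defer rows whose seller has already reached the cap on the current page. The function name paginate_with_cap, its parameters, and the added page column are hypothetical, and the sketch assumes the dataframe has a unique index and is already sorted by product in descending order.

```python
import pandas as pd


def paginate_with_cap(df, group_col, page_size=5, cap=2):
    """Hypothetical helper: reorder rows page by page so that no value of
    group_col appears more than `cap` times on a page.
    Assumes a unique index and that df is already in ranked order."""
    pending = list(df.index)  # row labels not yet placed, in ranked order
    order, pages = [], []
    page_no = 0
    while pending:
        page_no += 1
        counts = {}    # rows placed per group on the current page
        placed = 0     # rows placed on the current page
        deferred = []  # rows pushed to a later page
        for label in pending:
            group = df.at[label, group_col]
            if placed < page_size and counts.get(group, 0) < cap:
                order.append(label)
                pages.append(page_no)
                counts[group] = counts.get(group, 0) + 1
                placed += 1
            else:
                deferred.append(label)
        pending = deferred
    # The page column makes page boundaries explicit, since a page can end
    # up shorter than page_size when the cap blocks every remaining row.
    return df.loc[order].assign(page=pages)


df = pd.DataFrame({
    "product": [10, 9, 8, 7, 6, 5],
    "seller": ["seller1", "seller1", "seller2", "seller2", "seller2", "seller3"],
})
out = paginate_with_cap(df, "seller")
print(out)
```

On the sample data this places the seller3 row on page 1 and moves the third seller2 row to page 2, which is the order requested in the question.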