Selecting Items in dataframe-CodePudding

Using Python 3

I have a dataframe sort of like this:

productCode   productType   storeCode   salesAmount  moreInfo
    111            1            111          111       info
    111            1            112          112       info
    456            4            456          456       info

and so on for thousands of rows

I want to select (and have a list with the codes for) the X amount of the best selling unique products for each different store.

How would I accomplish that?

CodePudding user response：

Data:

df = pd.DataFrame({'productCode': [111,111,456,123,125],
                   'productType' : [1,1,4,3,3],
                   'storeCode' : [111,112,112,456,456],
                   'salesAmount' : [111,112,34,456,1235]})


   productCode  productType  storeCode  salesAmount
0          111            1        111          111
1          111            1        112          112
2          456            4        112           34
3          123            3        456          456
4          125            3        456         1235

It sounds like you want the best selling product at each storeCode? In which case:

df.sort_values('salesAmount', ascending=False).groupby('storeCode').head(1)

   productCode  productType  storeCode  salesAmount
4          125            3        456         1235
1          111            1        112          112
0          111            1        111          111

Instead, if you want the best selling of each productType at each storeCode, then:

df.sort_values('salesAmount', ascending=False).groupby(['storeCode', 'productType']).head(1)

   productCode  productType  storeCode  salesAmount
4          125            3        456         1235
1          111            1        112          112
0          111            1        111          111
2          456            4        112           34