Pandas -> DataFrame -> rank by price-CodePudding

I have the DataFrame, where I am trying to add a new "rank" column to determine the price rating relative to the "name" and "country" columns by comparing prices (column 'price'). If one product's price is the same, when using the

df['rank'] = df.groupby('name')['price'].apply(lambda x: x.sort_values().rank())

I get the following result -> column 'rank', but I need to get the one that is highlighted in the 'rank_2' and it is not accurate, because these six products have the same price and should get a rating of 1. How is it possible to get the given result as in the column -> 'rank_2'. Help please, I will be grateful

CodePudding user response：

you have to select the method of ranking in the rank function, like so :

df['rank'] = df.groupby('name')['price'].apply(lambda x: x.sort_values().rank(method="dense"))

CodePudding user response：

If I understood you correctly:

You can use:

df['rank'] = df.sort_values(by=['name', 'price']).groupby(['name'])[['price']].apply(lambda x: x!= x.shift()).cumsum()

df['rank'] = df.sort_values(by=['name', 'price']).groupby('name')['price'].apply(lambda x: x.rank(method="dense"))

Output in both cases:

     name country  price  rank
0  S00123     mal    3.5   1.0
1  S00123     fra    3.5   1.0
2  S00123     spa    3.5   1.0
3  S00123     pur    3.5   1.0
4  S00123     rom    3.5   1.0
5  S00123     slo    3.5   1.0
6  S00123     jap    7.0   2.0
7  S00123     can    8.5   3.0
8  S00123     bra    8.5   3.0
9  S00123     ind   10.0   4.0