How to get rank of a column by filtering by two columns?-CodePudding

I have the data frame below. I want to get a create a new column called rank that ranks the stock by growth for each year and sector.

import pandas as pd
d = {'date': [2000, 2000, 2001, 2001, 2002, 2002, 2003, 2003, 2004, 2004],  'Stock': ["GM", "F", "GM", "F", "GM", "F", "GM", "F", "GM", "F"], 'US Sales Growth': [.3, .3, .1, .1, .6, .6, .12, .12, .7, .7], 'Stock_Growth': [.1, .2, .3, .4, .14, .16,  .2, .1,.15,.16]}
df = pd.DataFrame(data=d)

It should look like this.

d = {'date': [2000, 2000, 2001, 2001, 2002, 2002, 2003, 2003, 2004, 2004],  'Stock': ["GM", "F", "GM", "F", "GM", "F", "GM", "F", "GM", "F"], 'Sector': ['Auto', 'Auto', 'Auto', 'Auto', 'Auto', 'Auto', 'Auto', 'Auto', 'Auto', 'Auto'],  'US Sales Growth': [.3, .3, .1, .1, .6, .6, .12, .12, .7, .7], 'Stock_Growth': [.1, .2, .3, .4, .14, .16,  .2, .1,.15,.16], 'Rank': [2, 1, 2, 1, 2, 1, 1, 2, 2, 1]}

df2 = pd.DataFrame(data=d)

I tried doing this but just gave me NANs

df['place_holder'] = df.groupby(['Stock','date'], as_index=False)['Stock_Growth'].transform(lambda x: x.shift(1))
df['rank']  = df.groupby('date', as_index=False)['place_holder'].transform(lambda x: x.rank())

Another way of doing it was get groupby date first then sector and then rank them.

CodePudding user response：

Using groupby and then apply another step to rank each group seems the right approach. I recommend using the rank method of a Groupby object to compute the rank of the values of a column in each group. This means, that this should provide the expected output:

>> df["Rank"] = df.groupby(["date", "Sector"])["Stock_Growth"].rank(ascending=False)
>> df
   date Stock  US Sales Growth  Stock_Growth Sector  Rank
0  2000    GM             0.30          0.10   Auto   2.0
1  2000     F             0.30          0.20   Auto   1.0
2  2001    GM             0.10          0.30   Auto   2.0
3  2001     F             0.10          0.40   Auto   1.0
4  2002    GM             0.60          0.14   Auto   2.0
5  2002     F             0.60          0.16   Auto   1.0
6  2003    GM             0.12          0.20   Auto   1.0
7  2003     F             0.12          0.10   Auto   2.0
8  2004    GM             0.70          0.15   Auto   2.0
9  2004     F             0.70          0.16   Auto   1.0

The ascending=False part ensures that higher values get a better (lower) rank.