How do I create a rank table for a given pandas dataframe with multiple numerical columns?

I would like to create a rank table from a pandas dataframe that has several numerical columns.

Let's use the following df as an example:

Name	Sales	Volume	Reviews
A	1000	100	100
B	2000	200	50
C	5400	500	10

I would like to create a new table, ranked_df, that ranks the values in each column in descending order (largest value gets rank 1) while keeping essentially the same format:

Name	Sales_rank	Volume_rank	Reviews_rank
A	3	3	1
B	2	2	2
C	1	1	3

Now, I can iteratively do this by looping through the columns, i.e.

import pandas as pd

df = pd.DataFrame({
    "Name": ['A', 'B', 'C'],
    "Sales": [1000, 2000, 5400],
    "Volume": [100, 200, 500],
    "Reviews": [100, 50, 10]
})

# make a copy of the original df
ranked_df = df.copy()

# define the columns we want to rank
interest_cols = ['Sales', 'Volume', 'Reviews']
for col in interest_cols:
    # ascending=False so the largest value gets rank 1
    ranked_df[f"{col}_rank"] = df[col].rank(ascending=False)

# drop the cols not needed 
...

But my question is this: is there a more elegant, or more pythonic, way of doing this? Maybe an apply over the dataframe? Or some vectorized operation in numpy?

Thank you.

CodePudding user response：

You could use transform (or apply) to rank each column; pass ascending=False so the largest value gets rank 1:

df.set_index('Name').transform(pd.Series.rank, ascending = False)

      Sales  Volume  Reviews
Name
A       3.0     3.0      1.0
B       2.0     2.0      2.0
C       1.0     1.0      3.0
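
To get closer to the exact layout asked for in the question (integer ranks, a _rank suffix, and Name back as a regular column), the same idea can be extended roughly as in the sketch below. It assumes the sample data from the question's table and that the columns contain no missing values, so the cast to int is safe; method="min" is just one possible choice for how ties would be handled.

```python
import pandas as pd

# Sample data taken from the table in the question
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Sales": [1000, 2000, 5400],
    "Volume": [100, 200, 500],
    "Reviews": [100, 50, 10],
})

ranked_df = (
    df.set_index("Name")                    # keep Name out of the ranking
      .rank(ascending=False, method="min")  # largest value gets rank 1
      .astype(int)                          # ranks are floats by default
      .add_suffix("_rank")                  # Sales -> Sales_rank, etc.
      .reset_index()                        # Name back as a column
)
print(ranked_df)
```

If the data can contain NaN, the astype(int) step would fail, so it is probably better to leave the ranks as floats in that case.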

CodePudding user response：

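You can also call rank directly on the frame. Note that rank is ascending by default, so ascending=False is needed for a descending ranking:

df.set_index('Name').rank(ascending=False).reset_index()

  Name  Sales  Volume  Reviews
0    A    3.0     3.0      1.0
1    B    2.0     2.0      2.0
2    C    1.0     1.0      3.0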
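
Since the question also asks about a numpy route, here is a minimal sketch of one way it might look, using a double argsort. It assumes the sample data from the question's table and no ties or NaN values within a column; with ties, the order among equal values is arbitrary, unlike DataFrame.rank. The names cols and ranked_np are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data taken from the table in the question
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Sales": [1000, 2000, 5400],
    "Volume": [100, 200, 500],
    "Reviews": [100, 50, 10],
})

cols = ["Sales", "Volume", "Reviews"]
values = df[cols].to_numpy()

# First argsort: row positions that would sort each column in descending order
order = np.argsort(-values, axis=0)
# Second argsort: turn that ordering into a 0-based rank per row, then shift to 1-based
ranks = np.argsort(order, axis=0) + 1

ranked_np = pd.DataFrame(ranks, columns=[f"{c}_rank" for c in cols])
ranked_np.insert(0, "Name", df["Name"])
print(ranked_np)
```

For a frame this small the pandas version is simpler to read; the numpy version mainly shows that the ranking is a pair of sorts per column.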