How do I create a rank table for a given pandas dataframe with multiple numerical columns?

Time:09-16

I would like to create a rank table from a pandas DataFrame that has several numerical columns.

Let's use the following df as an example:

Name Sales Volume Reviews
A 1000 100 100
B 2000 200 50
C 5400 500 10

I would like to create a new table, ranked_df, that ranks the values in each column in descending order while keeping essentially the same format:

Name Sales_rank Volume_rank Reviews_rank
A 3 3 1
B 2 2 2
C 1 1 3

Now, I can iteratively do this by looping through the columns, i.e.

import pandas as pd

df = pd.DataFrame({
    "Name": ['A', 'B', 'C'],
    "Sales": [1000, 2000, 5400],
    "Volume": [100, 200, 500],
    "Reviews": [100, 50, 10]
})

# make a copy of the original df
ranked_df = df.copy()

# define our interested columns
interest_cols = ['Sales', 'Volume', 'Reviews']
for col in interest_cols:
    ranked_df[f"{col}_rank"] = df[col].rank(ascending=False)

# drop the cols not needed 
...

But my question is this: is there a more elegant, or more Pythonic, way of doing this? Maybe an apply on the DataFrame? Or some vectorized operation by handing it off to numpy?

Thank you.

CodePudding user response:

You could use transform (or apply) to rank each column:

df.set_index('Name').transform(pd.Series.rank, ascending = False)

      Sales  Volume  Reviews
Name
A       3.0     3.0      1.0
B       2.0     2.0      2.0
C       1.0     1.0      3.0
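
To get the exact layout from the question (integer ranks, `_rank` suffixes, `Name` as a regular column), the same idea can be chained a bit further. This is only a minimal sketch using the sample values from the question's table. `method="min"` is an assumption about how ties should be handled (tied values share the lowest rank), and the integer cast assumes there are no missing values in the ranked columns.

```python
import pandas as pd

# Sample data taken from the table in the question
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Sales": [1000, 2000, 5400],
    "Volume": [100, 200, 500],
    "Reviews": [100, 50, 10],
})

ranked_df = (
    df.set_index("Name")
      .rank(ascending=False, method="min")  # rank each column, largest value gets rank 1
      .astype(int)                          # rank() returns floats; cast assumes no NaN
      .add_suffix("_rank")                  # Sales -> Sales_rank, and so on
      .reset_index()                        # turn Name back into a regular column
)

print(ranked_df)
```

Since `DataFrame.rank` already works column by column, the loop from the question is not needed.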

CodePudding user response:

df.set_index('Name').rank(ascending=False).reset_index()

  Name  Sales  Volume  Reviews
0    A    3.0     3.0      1.0
1    B    2.0     2.0      2.0
2    C    1.0     1.0      3.0
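
The question also asks about handing this off to numpy. One possible sketch is the double-argsort trick below. It is not what pandas does internally, and it assumes the values in each column are distinct: ties would get different, arbitrarily ordered ranks rather than a shared one. The names `cols`, `values`, `ranks`, and `ranked_np` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data taken from the table in the question
df = pd.DataFrame({
    "Name": ["A", "B", "C"],
    "Sales": [1000, 2000, 5400],
    "Volume": [100, 200, 500],
    "Reviews": [100, 50, 10],
})

cols = ["Sales", "Volume", "Reviews"]
values = df[cols].to_numpy()

# The first argsort gives the sort order of each column; the second gives
# each row's position within that order. Negating the values makes the
# order descending, and +1 turns 0-based positions into 1-based ranks.
ranks = (-values).argsort(axis=0).argsort(axis=0) + 1

ranked_np = pd.DataFrame(ranks, columns=[f"{c}_rank" for c in cols])
ranked_np.insert(0, "Name", df["Name"])

print(ranked_np)
```

For most cases the built-in `rank` is simpler and handles ties and missing values, so this is mainly of interest if the data is already in numpy arrays.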