Home > OS >  How to sort a Pandas DataFrame by multiple columns in Python with ordered number
How to sort a Pandas DataFrame by multiple columns in Python with ordered number

Time:09-30

I have a dataframe which has 3 columns.

Code Volume Trade Value
Apple 500 1000
Amazon 1000 500
Facebook 250 750
Samsung 100 1500

Firstly, I want to sort volume columns ascending: False that means largest to smallest. Afterwards, making new column which name is "Volume Order" and giving number as a sorting number. Secondly, I want to sort Trade Value ascending: False that means largest to smallest. Afterwards, making new column which name is "Trade Order" and giving number as a sorting number.But, when I sort Trade value, Sorted volume column order is changing accoring to trade volume order. Finally, I have 2 sorted columns and 2 ordered columns. Lastly, I want to sort and create a new and final column by taking the maximum of 2 separate columns sorted.(Volume Order- Trade Order).

expected output(when volume and trade value sorting in order;

Code Volume Order Trade Order
Samsung 4 1
Apple 2 2
Facebook 3 3
Amazon 1 4

Final output

Code Max Ordered Number Final Sorted Number
Samsung 4 1
Amazon 4 2
Facebook 3 3
Apple 2 4

Note: if they are the same max ordered number, the trade value will be the more important, which will be expressed by the finale of the larger one. Example; samsung 4-4. But samsung has biggest trade order. That's why it is the larger one as final sorted numbers.

When I try this code; I can't make it. Can you help this issue?

ordered = volume_df.sort_values(by=['Volume'], ascending=False,ignore_index=True)
ordered['Volume Order'] = ordered.index 1 
ordered = trade_value_df.sort_values(by=['Trade Value'], ascending=False,ignore_index=True)
ordered['Trade Order'] = ordered.index 1 

CodePudding user response:

Instead of using sort_values and index use rank:

df['Volume Order'] = df['Volume'].rank(ascending=False)
df['Trade Order']  = df['Trade'].rank(ascending=False)
df['Trade Order2']  = df['Trade'].rank(ascending=True)
df['Max Ordered Number']  = df[['Volume Order', 'Trade Order']].max(axis=1)
df['Final Sorted Number'] = df[['Max Ordered Number', 'Trade Order2']].apply(tuple, axis=1).rank(ascending=False)
df.drop('Trade Order2', axis=1)

output:

       Code  Volume  Trade  Value  Volume Order  Trade Order  Max Ordered Number  Final Sorted Number
0     Apple     500   1000   <NA>             2            2                   2                    4
1    Amazon    1000    500   <NA>             1            4                   4                    2
2  Facebook     250    750   <NA>             3            3                   3                    3
3   Samsung     100   1500   <NA>             4            1                   4                    1

CodePudding user response:

There is an issue when you write:

ordered = trade_value_df.sort_values(by=['Trade Value'], ascending=False,ignore_index=True)

You are assigning something new to the name ordered, so you're effectively losing the dataframe you had previously assigned to that name.

A possibility is to do all the operations on the same dataframe, rather than have multiple dataframe:

import pandas as pd

df = pd.DataFrame({'Code':['Apple', 'Amazon', 'Facebook', 'Samsung'], 'Volume':[500, 1000, 250, 100], 'Trade Value': [1000, 500, 750, 1500]})

df = df.sort_values(by=['Volume'], ascending=False,ignore_index=True)
df['Volume Order'] = df.index   1

df = df.sort_values(by=['Trade Value'], ascending=False,ignore_index=True)
df['Trade Order'] = df.index   1

print(df)
#        Code  Volume  Trade Value  Volume Order  Trade Order
# 0   Samsung     100         1500             4            1
# 1     Apple     500         1000             2            2
# 2  Facebook     250          750             3            3
# 3    Amazon    1000          500             1            4
  • Related