grouped1=exp.groupby("country")["value"].sum().reset_index()
grouped2=imp.groupby("country")["value"].sum().reset_index()
grouped=grouped1.merge(grouped2,on="country")
grouped.rename(columns={"value_x":"export_to","value_y":"import_from"},inplace=True)
grouped
output:
I want to sort dataframe by the sum of export_to and import_from
I tried this:
grouped.sort_values(grouped.export_to grouped.import_from)
CodePudding user response:
In my opinion, there's nothing wrong to add a new column to sort by. But of course, we can replace the index with the series of sums as an option:
(
grouped
.set_index(grouped.export_to grouped.import_from)
.sort_index()
.reset_index(drop=True)
)
We can hide all this machinery deeper inside sort_index
:
grouped.sort_index(
ignore_index=True,
key=lambda index: grouped.loc[index, ['export_to','import_from']].sum('columns')
)
The process is the same: an index has been replaced here by a key-function and then sorted. After that reordered index is dropped due to ignore_index=True
.
CodePudding user response:
You can create a temporary column for total and drop it after sorting by it.
df.assign(total=df['export_to'] df['import_from']).sort_values('total').drop(columns='total')
country export_to import_from
3 ANDORA 6.28 5.82
1 ALBANIA 196.51 524.18
0 AFG 4790.19 2682.23
2 ALGERIA 8232.24 10185.12
Even if you don't drop the total_column, it will not be added to the DF permanently, and the result will still be the same. though shows in the result
country export_to import_from total
3 ANDORA 6.28 5.82 12.10
1 ALBANIA 196.51 524.18 720.69
0 AFG 4790.19 2682.23 7472.42
2 ALGERIA 8232.24 10185.12 18417.36
df.assign(total=df['export_to'] df['import_from']).sort_values('total')