I have a pandas data-frame (there are other columns that I'm not including for sake of simplicity) that looks like this
Country City Rank
USA New York 7
UK London 6
UK London 6
USA New York 4
Spain Madrid 9
USA New York 8
USA Boston 5
... ... ...
I'm trying to group by country and then by city. Moreover, I want to compute count, mean and median and then sort first by mean and then by count.
I have been able to get the data in the following format using this bit of code:
df_3 = df_2[["Country", "City", "Rank"]].groupby(
["Country", "City"]).agg(['count', 'mean', 'median'])
Rank
count mean median
Country City
Spain Madrid 1 -- --
USA Boston 1 -- --
New York 3 -- --
UK London 2 -- --
Now, for each country, I want to be able to sort cities first by mean and then by count, both in descending order. I don't know how to do it. Any help highly appreciated.
CodePudding user response:
I think you would want to maintain the grouping in level=Country
of multiindex in such a case you first have to reset the index then sort the values on Country
, mean
and count
then set the index again
cols = ['Country', 'City']
(
df
.groupby(cols)['Rank']
.agg(['count', 'mean', 'median'])
.reset_index()
.sort_values(['Country', 'mean', 'count'], ascending=[1, 0, 0])
.set_index(cols)
)
Result
count mean median
Country City
Spain Madrid 1 9.000000 9.0
UK London 2 6.000000 6.0
USA New York 3 6.333333 7.0
Boston 1 5.000000 5.0