Get a sorted string array by column value in a pandas DataFrame

Time:11-30

Consider the following pandas DataFrame.

| date          | days       | country   |
| ------------- | ---------- | --------- |
| 2022-02-01    | 1          |  Spain    |
| 2022-02-02    | 2          |  Spain    |
| 2022-02-01    | 3          |  Italy    |
| 2022-02-03    | 2          |  France   |
| 2022-02-03    | 1          |  Germany  |
| 2022-02-04    | 1          |  Italy    |
| 2022-02-04    | 1          |  UK       |
| 2022-02-05    | 2          |  UK       |
| 2022-02-04    | 5          |  Spain    |
| 2022-02-04    | 1          |  Portugal |
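For anyone who wants to reproduce the problem, the table above can be built roughly like this. This is only a sketch: it assumes the `date` column is kept as plain strings and `days` as integers, which the question does not state.

```python
import pandas as pd

# Sample data copied from the table above.
# Assumption: dates are left as strings, since their dtype is not given.
df = pd.DataFrame({
    "date": ["2022-02-01", "2022-02-02", "2022-02-01", "2022-02-03", "2022-02-03",
             "2022-02-04", "2022-02-04", "2022-02-05", "2022-02-04", "2022-02-04"],
    "days": [1, 2, 3, 2, 1, 1, 1, 2, 5, 1],
    "country": ["Spain", "Spain", "Italy", "France", "Germany",
                "Italy", "UK", "UK", "Spain", "Portugal"],
})
print(df)
```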

I want to rank the countries by their total number of days.

| country          | count_days  |
| ---------------- | ----------- |
| Spain            | 8           |
| Italy            | 4           |
| UK               | 3           |
| France           | 2           |
| Germany          | 1           |
| Portugal         | 1           |

Finally, I want to return the countries as a string array, ordered from the most to the fewest total days.

return: countries = ['Spain', 'Italy', 'UK', 'France', 'Germany', 'Portugal']

CodePudding user response:

First aggregate with sum, then sort the values and convert the result to a DataFrame:

df1 = (df.groupby('country')['days']
         .sum()
         .sort_values(ascending=False)
         .reset_index(name='count_days'))
print(df1)
    country  count_days
0     Spain           8
1     Italy           4
2        UK           3
3    France           2
4   Germany           1
5  Portugal           1

Finally, convert the column to a list:

countries = df1['country'].tolist()

Alternative without the intermediate DataFrame df1:

countries = df.groupby('country')['days'].sum().sort_values(ascending=False).index.tolist()
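Germany and Portugal tie at 1 day, so their relative order depends on how the sort treats equal values. If a deterministic order matters, one possible variant is to sort on the total first and the country name second. The sketch below rebuilds only the `days` and `country` columns from the question; breaking ties alphabetically is an assumption, not something the question asks for.

```python
import pandas as pd

# Only the two columns needed for the ranking, copied from the question.
df = pd.DataFrame({
    "days": [1, 2, 3, 2, 1, 1, 1, 2, 5, 1],
    "country": ["Spain", "Spain", "Italy", "France", "Germany",
                "Italy", "UK", "UK", "Spain", "Portugal"],
})

# Total days per country, with 'country' kept as a regular column.
totals = df.groupby("country", as_index=False)["days"].sum()

# Sort by total descending; break ties by country name ascending (assumed rule).
ranked = totals.sort_values(["days", "country"], ascending=[False, True])

countries = ranked["country"].tolist()
print(countries)
# ['Spain', 'Italy', 'UK', 'France', 'Germany', 'Portugal']
```

With this data the result matches the expected list from the question, and the order of tied countries no longer depends on the sorting algorithm.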