Home > Software design >  Panda dataframe sort by date while keeping a certain order in second column
Panda dataframe sort by date while keeping a certain order in second column

Time:04-29

I have multiple time-series dataframes from different countries. I want to merge these together, keeping the order in the dates (instead of merely putting one dataframe below the other) while at the same time making sure that for each date the column next to it which has the country index has a consistent pattern. However, when I do this the countries seem to be randomly distributed. For one date Australia is first but for another date, Japan is put first.

To clarify with an example:

Australia

      country   crime-index
2000  AU        100
2001  AU        110
2002  AU        120

Japan

      country   crime-index
2000  JP        90
2001  JP        100
2002  JP        95

United Kingdom

      country   crime-index
2000  UK        120
2001  UK        130
2002  UK        130

Merged

      country   crime-index
2000  AU        100
2000  JP        90
2000  UK        120
2001  AU        110
2001  JP        100
2001  UK        130
2002  AU        120
2002  JP        95
2002  UK        130

CodePudding user response:

You can simply use the sort_values functions of pandas to sort your dataframe by multiple columns or together with index. With this, the ordering of the country column will be the same for each date.

df.rename_axis('dates').sort_values(["dates","country"])

CodePudding user response:

You can try with

df['temp'] = df.index
df.sort_values(['temp', 'country'])
del df['temp']

df['temp'] copies the dates in a column, then sort values by two columns

  • Related