Home > database >  Sorting an Dataframe wtih Index in form of List
Sorting an Dataframe wtih Index in form of List

Time:02-18

I have a pandas df that I need to sort based on a fixed order given in a list. The problem that I'm having is that the sort I'm attempting is not moving the data rows in the designated order that I'm expecting from the order of the list. My list and DataFrame (df) looks like this:

months = ['5','6','7','8','9','10','11','12','1','2']

df =

year       1992      1993      1994      1995
month                                        
1     -0.343107 -0.211959  0.437974 -1.219363
2     -0.383353  0.888650  1.054926  0.714846
5      0.057198  1.246682  0.042684  0.275701
6     -0.100018 -0.801554  0.001111  0.382633
7     -0.283815  0.204448  0.350705  0.130652
8      0.042195 -0.433849 -1.481228 -0.236004
9      1.059776  0.875214  0.304638  0.127819
10    -0.328911 -0.256656  1.081157  1.057449
11    -0.488213 -0.957050 -0.813885  1.403822
12     0.973031 -0.246714  0.600157  0.579038

The closest that I have gotten is this -

newdf = pd.DataFrame(df.values, index=list(months))

...but it does not move the rows. This command only adds the months in the index column w/out moving the data.

           0         1         2         3
5  -0.343107 -0.211959  0.437974 -1.219363
6  -0.383353  0.888650  1.054926  0.714846
7   0.057198  1.246682  0.042684  0.275701
8  -0.100018 -0.801554  0.001111  0.382633
9  -0.283815  0.204448  0.350705  0.130652
10  0.042195 -0.433849 -1.481228 -0.236004
11  1.059776  0.875214  0.304638  0.127819
12 -0.328911 -0.256656  1.081157  1.057449
1  -0.488213 -0.957050 -0.813885  1.403822
2   0.973031 -0.246714  0.600157  0.579038

I need the result to look like -

year    1992      1993      1994      1995
month
5   0.057198  1.246682  0.042684  0.275701
6  -0.100018 -0.801554  0.001111  0.382633
7  -0.283815  0.204448  0.350705  0.130652
8   0.042195 -0.433849 -1.481228 -0.236004
9   1.059776  0.875214  0.304638  0.127819
10 -0.328911 -0.256656  1.081157  1.057449
11 -0.488213 -0.957050 -0.813885  1.403822
12  0.973031 -0.246714  0.600157  0.579038
1  -0.343107 -0.211959  0.437974 -1.219363
2  -0.383353  0.888650  1.054926  0.714846

CodePudding user response:

Assuming df.index is dtype('int64'), first convert months to integers. Then use loc:

months = [*map(int, months)]
out = df.loc[months]

If df.index is dtype('O'), you can use loc right away, i.e. you don't need the first line.

Output:

year       1992      1993      1994      1995
month                                        
5      0.057198  1.246682  0.042684  0.275701
6     -0.100018 -0.801554  0.001111  0.382633
7     -0.283815  0.204448  0.350705  0.130652
8      0.042195 -0.433849 -1.481228 -0.236004
9      1.059776  0.875214  0.304638  0.127819
10    -0.328911 -0.256656  1.081157  1.057449
11    -0.488213 -0.957050 -0.813885  1.403822
12     0.973031 -0.246714  0.600157  0.579038
1     -0.343107 -0.211959  0.437974 -1.219363
2     -0.383353  0.888650  1.054926  0.714846
  • Related