Tricky sorting of quarter values to a specific order while grouping in Python

Time:11-23

I have a dataset in which I would like to sort the quarter values in chronological order within each group of the 'id' column.

Data

    id  date    stat
    aa  q1 22   y
    aa  q1 23   y
    aa  q2 22   y
    aa  q2 23   y
    aa  q3 22   y
    aa  q3 23   y
    aa  q4 22   y
    aa  q4 23   ok
    bb  q1 22   n
    bb  q1 23   n
    bb  q2 22   n
    bb  q2 23   n
    bb  q3 22   n
    bb  q3 23   n
    bb  q4 22   n
    bb  q4 23   ok

Desired

id  date    stat
aa  q1 22   y
aa  q2 22   y
aa  q3 22   y
aa  q4 22   y
aa  q1 23   y
aa  q2 23   y
aa  q3 23   ok
aa  q4 23   n
bb  q1 22   n
bb  q2 22   n
bb  q3 22   n
bb  q4 22   n
bb  q1 23   n
bb  q2 23   n
bb  q3 23   n
bb  q4 23   ok

Doing

Since my data is in quarters, I am using this:

import pandas as pd    
pd.to_datetime(date).sort_values().to_period('Q')

However, I also need to keep the rows grouped by the 'id' column, as the desired output shows. Any suggestions are appreciated.

CodePudding user response:

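Split the date column on whitespace into temporary quarter and year columns, sort by id, then year, then quarter, and drop the temporary columns:

# Split 'q1 22' into the quarter label (temp1) and the two-digit year (temp2).
df[['temp1', 'temp2']] = df['date'].str.split(r'\s+', expand=True)
# Sort by id, then year, then quarter; ignore_index renumbers the rows from 0.
df = df.sort_values(by=['id', 'temp2', 'temp1'], ignore_index=True).drop(columns=['temp1', 'temp2'])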

    id   date stat
0   aa  q1 22    y
1   aa  q2 22    y
2   aa  q3 22    y
3   aa  q4 22    y
4   aa  q1 23    y
5   aa  q2 23    y
6   aa  q3 23   ok
7   aa  q4 23    n
8   bb  q1 22    n
9   bb  q2 22    n
10  bb  q3 22    n
11  bb  q4 22    n
12  bb  q1 23    n
13  bb  q2 23    n
14  bb  q3 23    n
15  bb  q4 23   ok
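
The same idea can be sketched as a self-contained script. This is only an illustration under some assumptions: the sample frame below is a shortened stand-in for the data in the question, the labels always look like "q<digit> <two-digit year>", and the helper column names quarter and year are made up for the example. It extracts the two parts as integers so that the sort is numeric rather than alphabetical.

```python
import pandas as pd

# Shortened stand-in for the question's data (assumed layout: "q<digit> <yy>").
df = pd.DataFrame({
    "id": ["aa"] * 4 + ["bb"] * 4,
    "date": ["q1 22", "q1 23", "q2 22", "q2 23"] * 2,
    "stat": ["y", "y", "y", "ok", "n", "n", "n", "ok"],
})

# Pull the quarter number and two-digit year out of each label as integers.
parts = df["date"].str.extract(r"q(?P<quarter>\d)\s+(?P<year>\d{2})").astype(int)

# Sort by id, then year, then quarter; ignore_index renumbers the rows from 0.
result = (
    df.join(parts)
      .sort_values(["id", "year", "quarter"], ignore_index=True)
      .drop(columns=["quarter", "year"])
)
print(result)
```

Sorting on integer helper columns avoids relying on the original row order to keep the quarters in sequence within each year.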

CodePudding user response:

This should do the job:

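import pandas as pd
# Build quarterly periods from labels such as 'q1 22' (assumes years 2000-2099).
quarters = '20' + df['date'].str[-2:] + df['date'].str[:2].str.upper()
df['period'] = pd.PeriodIndex(quarters, freq='Q')
# Sort by id, then chronologically by quarter, and index the result by period.
df = df.sort_values(by=['id', 'period']).set_index('period')

Explanation:

Sorting on the raw date strings would order them alphabetically (q1 22, q1 23, q2 22, ...), so the labels are first converted to quarterly periods, which compare chronologically. df.sort_values(by=['id', 'period']) then sorts the data on the id column and, within each id, on the period column.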
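
For anyone who wants to try a period-based sort end to end, here is a minimal sketch. It rests on a few assumptions: the sample frame is invented for the example, every year is taken to be in the 2000s, and to_quarter_period is a hypothetical helper rather than a pandas function. It keeps the original date column and index layout, using the periods only as a temporary sort key.

```python
import pandas as pd

# Small invented sample, deliberately out of order.
df = pd.DataFrame({
    "id": ["bb", "aa", "aa", "bb"],
    "date": ["q4 22", "q1 23", "q3 22", "q1 22"],
    "stat": ["n", "y", "y", "n"],
})

def to_quarter_period(label: str) -> pd.Period:
    """Turn a label such as 'q1 22' into Period('2022Q1') (assumes 20xx years)."""
    quarter, year = label.split()
    return pd.Period(f"20{year}{quarter.upper()}", freq="Q")

# Temporary column of Period objects, which compare chronologically.
ordered = (
    df.assign(period=df["date"].map(to_quarter_period))
      .sort_values(["id", "period"], ignore_index=True)
      .drop(columns="period")
)
print(ordered)
```

Because the period column is dropped after sorting, the result has the same columns as the input, just in id and chronological order.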
