I have a dataframe like as below
df = pd.DataFrame(
{'stud_id' : [101, 101, 101, 101,
101, 101, 101, 101],
'sub_code' : ['CSE01', 'CSE01', 'CSE01',
'CSE01', 'CSE02', 'CSE02',
'CSE02', 'CSE02'],
'ques_date' : ['10/11/2022', '06/06/2022','09/04/2022', '27/03/2022',
'13/05/2010', '10/11/2021','11/1/2022', '27/02/2022'],
'revenue' : [77, 86, 55, 90,
65, 90, 80, 67]}
)
df['ques_date'] = pd.to_datetime(df['ques_date'])
I would like to do the below
a) Compute custom quarter based on our organization FY calendar. Meaning, Oct-Dec is Q1, Jan -Mar is Q2,Apr - Jun is Q3 and July to Sep is Q4.
b) Group by stud_id
c) Compute sum of revenue from previous two quarters (from a specific date = 20/12/2022). For example, if we are in 2023Q1
, I would like to get the sum of revenue for a customer from 2022Q4
and 2022Q3
seperately
So, I tried the below
df['custom_qtr'] = pd.to_datetime(df['ques_date'], dayfirst=True).dt.to_period('Q-SEP')
date_1 = pd.to_datetime('20-12-2022')
df['date_based_qtr'] = date_1.to_period('Q-SEP')
pat = '(Q(\d ))'
df['custom_qtr_number'] = df['custom_qtr'].astype(str).str.extract(pat, expand=False)[1]
df['date_qtr_number'] = df['date_based_qtr'].astype(str).str.extract(pat, expand=False)[1]
But am not sure, how to reshape the dataframe and get an output like below. You can see that we are 2023Q1
and I would like to get the sum of revenue from previous two quarters seperately
. Meaning, revenue from 2022Q4 and 2022Q3 respectively
CodePudding user response:
For 2 previous quarters subtract date_based_qtr
by custom_qtr
with converting to integers and test membership by 1,2
, last pivoting:
df['ques_date'] = pd.to_datetime(df['ques_date'])
df['custom_qtr'] = pd.to_datetime(df['ques_date'], dayfirst=True).dt.to_period('Q-SEP')
date_1 = pd.to_datetime('20-12-2022')
df['date_based_qtr'] = date_1.to_period('Q-SEP')
df = df[df['date_based_qtr'].sub(df['custom_qtr']).apply(lambda x: x.n).isin([1,2])]
df = df.pivot(['stud_id','date_based_qtr'],'custom_qtr','revenue')
print (df)
custom_qtr 2022Q3 2022Q4
stud_id date_based_qtr
101 2023Q1 86 55