Home > Software engineering >  Pandas divide value and melt Quarter rows into month columns
Pandas divide value and melt Quarter rows into month columns

Time:09-05

I have a dataframe like as below

df1 = pd.DataFrame({'s_id': [18,23],'SUB':['Phy','Phy'],'Rev_Q1':[80,100],'Rev_Q2':[60,50],'Rev_Q3':[20,12],'Rev_Q4':[10,12]})

I would like to do the below

a) Split the revenue from each quarter equally into 3 months (for each quarter). For ex: If you look at Rev_Q1, the value is 80 for s_id = 18. So, we divide that by 3 (because 3 months in a qtr). So, 80/3 = 26.667

b) Pivot the rows and store them as columns

I tried the below but this is clearly not elegant/efficient

df1['Jan'] = df1.loc[:,df1.columns.str.contains("Q1")]/3
df1['Feb'] = df1.loc[:,df1.columns.str.contains("Q1")]/3
df1['Mar'] = df1.loc[:,df1.columns.str.contains("Q1")]/3
df1['Apr'] = df1.loc[:,df1.columns.str.contains("Q2")]/3
df1['May'] = df1.loc[:,df1.columns.str.contains("Q2")]/3
df1['Jun'] = df1.loc[:,df1.columns.str.contains("Q2")]/3
df1['Jul'] = df1.loc[:,df1.columns.str.contains("Q3")]/3
df1['Aug'] = df1.loc[:,df1.columns.str.contains("Q3")]/3
df1['Sep'] = df1.loc[:,df1.columns.str.contains("Q3")]/3
df1['Oct'] = df1.loc[:,df1.columns.str.contains("Q4")]/3
df1['Nov'] = df1.loc[:,df1.columns.str.contains("Q4")]/3
df1['Dec'] = df1.loc[:,df1.columns.str.contains("Q4")]/3
pd.melt(df1, id_vars=['s_id','SUB'],value_vars=['Jan','Feb','Mar','Apr','May','Jun','Jul','Aug','Sep','Oct','Nov','Dec'],var_name='month_name',value_name='rev')

Is there any other better way to write this with elegant piece of code?

I expect my output to be like as below (Divide each qtr value by 3 (bcoz 3 months) and assign that value to each month in the quarter and melt the dataframe)

Am showing a sample output for Qtr 1. Similarly, I have to do for all quarters in the below format

enter image description here

CodePudding user response:

You can use a mapping dictionary and explode:

quarters = {'Q1': ['Jan','Feb','Mar'], 'Q2': ['Apr','May','Jun'],
            'Q3': ['Jul','Aug','Sep'], 'Q4': ['Oct', 'Nov', 'Dec']
           }

out = (df1
    .melt(['s_id', 'SUB'], value_name='rev', var_name='qtr')
    .assign(rev=lambda d: d['rev'].div(3),
            qtr=lambda d: d['qtr'].str[-2:].map(quarters)
           )
    .explode('qtr')
)

output:

   s_id  SUB  qtr        rev
0    18  Phy  Jan  26.666667
0    18  Phy  Feb  26.666667
0    18  Phy  Mar  26.666667
1    23  Phy  Jan  33.333333
1    23  Phy  Feb  33.333333
1    23  Phy  Mar  33.333333
2    18  Phy  Apr  20.000000
2    18  Phy  May  20.000000
2    18  Phy  Jun  20.000000
3    23  Phy  Apr  16.666667
3    23  Phy  May  16.666667
3    23  Phy  Jun  16.666667
4    18  Phy  Jul   6.666667
4    18  Phy  Aug   6.666667
4    18  Phy  Sep   6.666667
5    23  Phy  Jul   4.000000
5    23  Phy  Aug   4.000000
5    23  Phy  Sep   4.000000
6    18  Phy  Oct   3.333333
6    18  Phy  Nov   3.333333
6    18  Phy  Dec   3.333333
7    23  Phy  Oct   4.000000
7    23  Phy  Nov   4.000000
7    23  Phy  Dec   4.000000
  • Related