Converting a monthly pandas DataFrame into quarterly values

Time:08-26

This is my DataFrame. It contains data for the years 2016 to 2020:

Year Month Bill 
-----------------
2016   1     2
2016   2     5
2016   3     10
2016   4     2
2016   5     4
2016   6     9
2016   7     7
2016   8     8
2016   9     9
2016   10    5
2016   11    1
2016   12    3
.
.
.
2020   12    10

Now I want to create two new columns in this DataFrame, levels and contribution. The levels column should contain Q1, Q2, Q3 or Q4, representing the four quarters of the year, and the contribution column should contain the average of the Bill column over the three months of that quarter in the respective year.

For example, Q1 of 2016 should contain the average Bill for months 1, 2 and 3 of 2016 in the contribution column, and Q3 of 2020 should contain the average Bill for months 7, 8 and 9 of 2020. The expected DataFrame is given below:
Year Month  Bill  levels contribution
------------------------------------
2016   1     2     Q1      5.66
2016   2     5     Q1      5.66
2016   3     10    Q1      5.66
2016   4     2     Q2      5
2016   5     4     Q2      5
2016   6     9     Q2      5
2016   7     7     Q3      8
2016   8     8     Q3      8
2016   9     9     Q3      8
2016   10    5     Q4      3
2016   11    1     Q4      3
2016   12    3     Q4      3
.
.
2020   10    2     Q4      6 
2020   11    6     Q4      6
2020   12    10    Q4      6

This process should be repeated for all four quarters of every year. I am not able to figure out how to do this, as it is new to me.

CodePudding user response:

You can try:

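import math

# Month 1-3 -> 'Q1', 4-6 -> 'Q2', 7-9 -> 'Q3', 10-12 -> 'Q4'
df['levels'] = 'Q' + df['Month'].div(3).apply(math.ceil).astype(str)
# Average Bill per (Year, quarter), repeated on every row of that quarter
df['contribution'] = df.groupby(['Year', 'levels'])['Bill'].transform('mean')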
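
For reference, here is a self-contained sketch of the same idea that can be run as-is. It assumes the 2016 rows from the question as sample data and derives the quarter number with integer arithmetic instead of math.ceil; the variable name quarter is only illustrative.

```python
import pandas as pd

# Sample data: the 2016 rows from the question
df = pd.DataFrame({
    "Year": [2016] * 12,
    "Month": list(range(1, 13)),
    "Bill": [2, 5, 10, 2, 4, 9, 7, 8, 9, 5, 1, 3],
})

# Months 1-3 -> 1, 4-6 -> 2, 7-9 -> 3, 10-12 -> 4
quarter = (df["Month"] - 1) // 3 + 1
df["levels"] = "Q" + quarter.astype(str)

# transform('mean') returns one value per original row, so each
# (Year, quarter) average is repeated on its three monthly rows
df["contribution"] = df.groupby(["Year", "levels"])["Bill"].transform("mean")

print(df)
```

Grouping by both Year and levels matters once the frame holds several years; grouping by levels alone would average Q1 of 2016 together with Q1 of every other year.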

CodePudding user response:

Pandas actually has a set of data types for monthly and quarterly values, built around pandas.Period.
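
As a small illustration of what a period is (the month chosen here is arbitrary), a monthly Period can be converted to the quarter that contains it:

```python
import pandas as pd

march = pd.Period("2016-03", freq="M")  # a single monthly period
quarter = march.asfreq("Q")             # the calendar quarter containing it

print(march, quarter, quarter.quarter)
```

The .quarter attribute gives the quarter number within the year, which is handy if only a Q1 to Q4 label is needed.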

See this similar question:

In your case it would look like this:

import pandas as pd

# First create dates for the 1st of each month.
# pd.to_datetime can assemble dates from year/month/day columns (names are case-insensitive)
df['Dates'] = pd.to_datetime(df[['Year', 'Month']].assign(Day=1))

# Create monthly periods
df['Month Periods'] = df['Dates'].dt.to_period('M')

# Use the new monthly index
df = df.set_index('Month Periods')

# Group by quarters
df_qtrly = df['Bill'].resample('Q').mean()
df_qtrly.index.names = ['Quarters']

print(df_qtrly)

Result:

Quarters
2016Q1    5.666667
2016Q2    5.000000
2016Q3    8.000000
2016Q4    3.000000
Freq: Q-DEC, Name: Bill, dtype: float64

If you want to put these values back into the monthly dataframe you could do this:

df['Quarters'] = df['Dates'].dt.to_period('Q')
df['Contributions'] = df_qtrly.loc[df['Quarters']].values

Result:

               Year  Month  Bill      Dates Quarters  Contributions
Month Periods                                                      
2016-01        2016      1     2 2016-01-01   2016Q1       5.666667
2016-02        2016      2     5 2016-02-01   2016Q1       5.666667
2016-03        2016      3    10 2016-03-01   2016Q1       5.666667
2016-04        2016      4     2 2016-04-01   2016Q2       5.000000
2016-05        2016      5     4 2016-05-01   2016Q2       5.000000
2016-06        2016      6     9 2016-06-01   2016Q2       5.000000
2016-07        2016      7     7 2016-07-01   2016Q3       8.000000
2016-08        2016      8     8 2016-08-01   2016Q3       8.000000
2016-09        2016      9     9 2016-09-01   2016Q3       8.000000
2016-10        2016     10     5 2016-10-01   2016Q4       3.000000
2016-11        2016     11     1 2016-11-01   2016Q4       3.000000
2016-12        2016     12     3 2016-12-01   2016Q4       3.000000
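
If the intermediate quarterly series is not needed, the two steps can be combined. The following is a minimal sketch, assuming the 2016 sample rows from the question; it groups directly on the quarterly period and uses transform to write the averages back, so no index change or lookup is required.

```python
import pandas as pd

# Sample data: the 2016 rows from the question
df = pd.DataFrame({
    "Year": [2016] * 12,
    "Month": list(range(1, 13)),
    "Bill": [2, 5, 10, 2, 4, 9, 7, 8, 9, 5, 1, 3],
})

# First day of each month, then the quarterly period it falls in
dates = pd.to_datetime(df[["Year", "Month"]].assign(Day=1))
df["Quarters"] = dates.dt.to_period("Q")

# A quarterly period already includes the year, so one grouping key is enough
df["Contributions"] = df.groupby("Quarters")["Bill"].transform("mean")

# Optional plain 'Q1'..'Q4' label, as asked for in the question
df["levels"] = "Q" + df["Quarters"].dt.quarter.astype(str)

print(df)
```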