I have monthly time series data as follows:
Item 2020 Jan 2020 Feb 2020 Mar 2020 Jul 2020 Aug 2020 Sep
0 A 0 1 2 3 4 5
1 B 5 4 3 2 1 0
I needed to convert this monthly data into quarterly data as a sum of three months, so I adapted the code from here to do just that:
df = df.groupby(pd.PeriodIndex(df.columns, freq='Q'), axis=1).sum()
The resulting df
is as follows:
Item 2020 Q1 2020 Q3
0 A 3 12
1 B 12 3
However, I wanted to have all quarters listed in between, not just the ones that have a value larger than 0. So my desired output is as follows:
Item 2020 Q1 2020 Q2 2020 Q3
0 A 3 0 12
1 B 12 0 3
So in this output, there is an additional column Q2
because it is between the first and the last quarter periods, even though it has no positive values. Does anybody know how to do that?
CodePudding user response:
Use DataFrame.reindex
with period_range
:
df = df.set_index('Item')
df.columns = pd.to_datetime(df.columns)
df = df.groupby(pd.PeriodIndex(df.columns, freq='Q'), axis=1).sum()
df = df.reindex(pd.period_range(df.columns.min(), df.columns.max(), freq='Q'), axis=1, fill_value=0)
print (df)
2020Q1 2020Q2 2020Q3
Item
A 3 0 12
B 12 0 3