Home > Mobile >  Grouping pivot table dates by week in pandas
Grouping pivot table dates by week in pandas

Time:11-05

I have a df that looks like this:

date            category            subcategory         order_id
2021-05-04      A                   aa                  10
2021-06-04      A                   dd                  10
2021-05-06      B                   aa                  18
2021-07-06      A                   aa                  50
2021-07-06      C                   cc                  10
2021-07-05      A                   ff                  101
2021-10-04      D                   aa                  100

I am trying to get the count of unique orders per week, category & subcategory:

pd.pivot_table(df, index=['date', 'category', 'subcategory'],
                           values=['order_id'],
                           aggfunc=lambda x: len(x.unique())).unstack(level=0)

Which do return the result but not grouped by week, I am wondering if it's possible to group the results during the groupby operation? I've checked this answer but it looks like it is outdated as I get an error:

TypeError: resample() got an unexpected keyword argument 'how'

CodePudding user response:

You can use groupby with DataFrameGroupBy.resample, then ouput is:

f = lambda x: x.nunique()
df = df.set_index('date').groupby(['category','subcategory'])['order_id'].resample('W').agg(f).unstack()
print (df)
date                  2021-05-09  2021-05-16  2021-05-23  2021-05-30  \
category subcategory                                                   
A        aa                  1.0         0.0         0.0         0.0   
         dd                  NaN         NaN         NaN         NaN   
         ff                  NaN         NaN         NaN         NaN   
B        aa                  1.0         NaN         NaN         NaN   
C        cc                  NaN         NaN         NaN         NaN   
D        aa                  NaN         NaN         NaN         NaN   

date                  2021-06-06  2021-06-13  2021-06-20  2021-06-27  \
category subcategory                                                   
A        aa                  0.0         0.0         0.0         0.0   
         dd                  1.0         NaN         NaN         NaN   
         ff                  NaN         NaN         NaN         NaN   
B        aa                  NaN         NaN         NaN         NaN   
C        cc                  NaN         NaN         NaN         NaN   
D        aa                  NaN         NaN         NaN         NaN   

date                  2021-07-04  2021-07-11  2021-10-10  
category subcategory                                      
A        aa                  0.0         1.0         NaN  
         dd                  NaN         NaN         NaN  
         ff                  NaN         1.0         NaN  
B        aa                  NaN         NaN         NaN  
C        cc                  NaN         1.0         NaN  
D        aa                  NaN         NaN         1.0  

Or if use Grouper output is:

f = lambda x: x.nunique()
df = df.groupby(['category','subcategory', pd.Grouper(freq='W', key='date')])['order_id'].agg(f).unstack()
print (df)
date                  2021-05-09  2021-07-11  2021-06-06  2021-10-10
category subcategory                                                
A        aa                  1.0         1.0         NaN         NaN
         dd                  NaN         NaN         1.0         NaN
         ff                  NaN         1.0         NaN         NaN
B        aa                  1.0         NaN         NaN         NaN
C        cc                  NaN         1.0         NaN         NaN
D        aa                  NaN         NaN         NaN         1.0
  • Related