Home > Net >  Pandas Grouping Weekly Data
Pandas Grouping Weekly Data

Time:12-30

I want to group data based on each week and found the solution in the following post, https://www.statology.org/pandas-group-by-week/

According to the post, following the two operations below allows you to group the data per week.

# Create a dataframe with datetime values
df = pd.DataFrame({'date': pd.date_range(start='1/5/2022', freq='D', periods=15),
                   'sales': [6, 8, 9, 5, 4, 8, 8, 3, 5, 9, 8, 3, 4, 7, 7]})

#convert date column to datetime and subtract one week
df['date'] = pd.to_datetime(df['date']) - pd.to_timedelta(7, unit='d')

#calculate sum of values, grouped by week
df.groupby([pd.Grouper(key='date', freq='W')])['sales'].sum()

The question I have is whats the point of operation 1, where you subtract 7 days from the original date. How does this help with the grouping as all the dates are translated 7 days back? Couldn't you just use the second operation to group the dates ?

Thanks

CodePudding user response:

This is likely a trick to set the week start as reference, the best would be to use directly the label='left' parameter of pandas.Grouper:

df.groupby(pd.Grouper(key='date', freq='W', label='left'))['sales'].sum()

Output:

date
2022-01-02    32
2022-01-09    44
2022-01-16    18
Freq: W-SUN, Name: sales, dtype: int64

alternative

df.groupby(pd.to_datetime(df['date']).dt.to_period('W'))['sales'].sum()

Output:

2022-01-03/2022-01-09    32
2022-01-10/2022-01-16    44
2022-01-17/2022-01-23    18
Freq: W-SUN, Name: sales, dtype: int64
  • Related