The below is my dataframe. How to calculate both monthly and weekly averages from this dataframe in python? I need to print month start&end and week start&end then the average of the month and week
**Input
SAMPLE DATASET**
kpi_id kpi_name run_date value
1 MTTR 5/17/2021 15
2 MTTR 5/18/2021 16
3 MTTR 5/19/2021 17
4 MTTR 5/20/2021 18
5 MTTR 5/21/2021 19
6 MTTR 5/22/2021 20
7 MTTR 5/23/2021 21
8 MTTR 5/24/2021 22
9 MTTR 5/25/2021 23
10 MTTR 5/26/2021 24
11 MTTR 5/27/2021 25
**expected output**
**monthly_mean**
kpi_name month_start month_end value(mean)
MTTR 5/1/2021 5/31/2021 20
**weekly_mean**
kpi_name week_start week_end value(mean)
MTTR 5/17/2021 5/23/2021 18
MTTR 5/24/2021 5/30/2021 23.5
CodePudding user response:
groupby
is your friend
monthly = df.groupby(pd.Grouper(key='run_date', freq='M')).mean()
weekly = df.groupby(pd.Grouper(key='run_date', freq='W')).mean()
CodePudding user response:
Extending the answer from igrolvr to match your expected output:
# transformation from string to datetime for groupby
df['run_date'] = pd.to_datetime(df['run_date'], format='%m/%d/%Y')
# monthly
# The groupby() will extract the last date in period,
# reset_index() will make the group by keys to columns
monthly = df.groupby(by=['kpi_name', pd.Grouper(key='run_date', freq='M')])['value'].mean().reset_index()
# Getting the start of month
monthly['month_start'] = monthly['run_date'] - pd.offsets.MonthBegin(1)
# Renaming the run_date column to month_end
monthly = monthly.rename({'run_date': 'month_end'}, axis=1)
print(monthly)
# weekly
# The groupby() will extract the last date in period,
# reset_index() will make the group by keys to columns
weekly = df.groupby(by=['kpi_name', pd.Grouper(key='run_date', freq='W')])['value'].mean().reset_index()
# Getting the start of week and adding one day,
# because start of week is sunday
weekly['week_start'] = weekly['run_date'] - pd.offsets.Week(1) pd.offsets.Day(1)
# Renaming the run_date column to week_end
weekly = weekly.rename({'run_date': 'week_end'}, axis=1)
print(weekly)