Home > Enterprise >  How to calculate monthly and weekly averages from a dataframe using python?
How to calculate monthly and weekly averages from a dataframe using python?

Time:12-02

The below is my dataframe. How to calculate both monthly and weekly averages from this dataframe in python? I need to print month start&end and week start&end then the average of the month and week

**Input
SAMPLE DATASET**            
kpi_id  kpi_name    run_date    value
1       MTTR       5/17/2021    15
2       MTTR       5/18/2021    16
3       MTTR       5/19/2021    17
4       MTTR       5/20/2021    18
5       MTTR       5/21/2021    19
6       MTTR       5/22/2021    20
7       MTTR       5/23/2021    21
8       MTTR       5/24/2021    22
9       MTTR       5/25/2021    23
10      MTTR       5/26/2021    24
11      MTTR       5/27/2021    25


**expected output**
**monthly_mean**            
kpi_name    month_start month_end   value(mean)
    MTTR    5/1/2021    5/31/2021   20
**weekly_mean**         
kpi_name    week_start  week_end    value(mean)
    MTTR    5/17/2021   5/23/2021   18
    MTTR    5/24/2021   5/30/2021   23.5

CodePudding user response:

groupby is your friend

monthly = df.groupby(pd.Grouper(key='run_date', freq='M')).mean() 
weekly = df.groupby(pd.Grouper(key='run_date', freq='W')).mean() 

CodePudding user response:

Extending the answer from igrolvr to match your expected output:

# transformation from string to datetime for groupby
df['run_date'] = pd.to_datetime(df['run_date'], format='%m/%d/%Y')

# monthly
# The groupby() will extract the last date in period, 
# reset_index() will make the group by keys to columns
monthly = df.groupby(by=['kpi_name', pd.Grouper(key='run_date', freq='M')])['value'].mean().reset_index()

# Getting the start of month 
monthly['month_start'] = monthly['run_date'] - pd.offsets.MonthBegin(1)

# Renaming the run_date column to month_end
monthly = monthly.rename({'run_date': 'month_end'}, axis=1)
print(monthly)

# weekly
# The groupby() will extract the last date in period, 
# reset_index() will make the group by keys to columns
weekly = df.groupby(by=['kpi_name', pd.Grouper(key='run_date', freq='W')])['value'].mean().reset_index()

# Getting the start of week and adding one day, 
# because start of week is sunday
weekly['week_start'] = weekly['run_date'] - pd.offsets.Week(1)   pd.offsets.Day(1)

# Renaming the run_date column to week_end
weekly = weekly.rename({'run_date': 'week_end'}, axis=1)
print(weekly)
  • Related