Home > database >  Resample daily time series to business day
Resample daily time series to business day

Time:04-27

I have the following daily time series that I want to resample (aggregate sum) for only business days (Mon - Fri)

enter image description here

but this code also aggregate the weekends (Sat & Sun)

df_resampled = df.resample('5B').sum()

enter image description here

CodePudding user response:

You can exclude weekends in boolean indexing with DatetimeIndex.dayofweek:

df_resampled = df[~df.index.dayofweek.isin([5,6])].resample('5B').sum()

df_resampled = df[df.index.dayofweek < 5].resample('5B').sum()

CodePudding user response:

You can pivot the table on day of week and remove weekends. Check this out.

Step 0: generate random example (You already have data so you shouldn't really care about this step)

import pandas as pd
import numpy as np

def random_dates(start, end, n, freq, seed=None):
    if seed is not None:
        np.random.seed(seed)
    dr = pd.date_range(start, end, freq=freq)
    return pd.to_datetime(np.sort(np.random.choice(dr, n, replace=False)))

dates = random_dates('2015-01-01', '2018-01-01', 10, 'H', seed=[3, 1415])

df = pd.DataFrame()
df.index = pd.DatetimeIndex(dates.date)
df['Sales'] = np.random.randint(1, 5, size=len(df))

Step 1: Get days of week

df['Day of week'] = df.index.to_series().dt.dayofweek
# 0 is Monday - 6 is Saturday 

Step 2: Get the result you asked for

# remove days 5 and 6 (Sun and Sat) and pivot on day of week
result = df[df['Day of week'] < 5].pivot_table(index = 'Day of week', values = "Sales", aggfunc = "sum")

print(result)

Example output:

             Sales
Day of week       
0               11
3                1
4               14

Again remember: 0 is Monday - 6 is Saturday. You can change these to names to get a more beautiful output.

  • Related