I have a dataframe for inventory like the following,
tidx = pd.date_range('2022-10-01', periods=15, freq='D')
data_frame = pd.DataFrame(1, columns=['inventory'], index=tidx)
data_frame.iloc[-2:] = 0
print(data_frame)
inventory
2022-10-01 1
2022-10-02 1
2022-10-03 1
2022-10-04 1
2022-10-05 1
2022-10-06 1
2022-10-07 1
2022-10-08 1
2022-10-09 1
2022-10-10 1
2022-10-11 1
2022-10-12 1
2022-10-13 1
2022-10-14 0
2022-10-15 0
I want to aggregate for 7 days from any day of the week (here 2020-10-15
). If I do the following, I don't know why the results start from 2020-10-13
data_frame.resample("7D", closed = 'right', origin='2020-10-15').sum()
inventory
2022-09-29 6
2022-10-06 7
2022-10-13 0
My desired output is,
inventory
2022-09-01 1
2022-10-08 7
2022-10-15 5
Note: My panda's version is '1.3.5'
CodePudding user response:
data_frame.resample("7D", origin='2022-10-15', closed='right', loffset='7D').sum()
Gives:
inventory
2022-10-01 1
2022-10-08 7
2022-10-15 5
Although loffset
is deprecated. So you can do this instead.
from pandas.tseries.frequencies import to_offset
df = data_frame.resample("7D", origin='2022-10-15', closed='right').sum()
df.index = df.index to_offset("7D")
Additionally, a more elegant solution to this would be:
data_frame.resample("7D", origin='end', closed='right').sum()
CodePudding user response:
I found a fair solution,
data_frame.resample("W-SAT", closed = 'right').sum()
inventory
2022-10-01 1
2022-10-08 7
2022-10-15 5
The only caveat is to calculate the start date name which is SAT
(for 2022-10-15
)