** I have Input data frame **
ID | Date | Amount |
---|---|---|
A | 2021-08-03 | 100 |
A | 2021-08-04 | 100 |
A | 2021-08-06 | 20 |
A | 2021-08-07 | 100 |
A | 2021-08-09 | 300 |
A | 2021-08-11 | 100 |
A | 2021-08-12 | 100 |
A | 2021-08-13 | 10 |
A | 2021-08-23 | 10 |
A | 2021-08-24 | 10 |
A | 2021-08-26 | 10 |
A | 2021-08-28 | 10 |
desired Output data frame
ID | Date | Amount | TwoWeekSum |
---|---|---|---|
A | 2021-08-03 | 100 | 320 |
A | 2021-08-04 | 100 | 320 |
A | 2021-08-06 | 20 | 320 |
A | 2021-08-07 | 100 | 320 |
A | 2021-08-09 | 300 | 830 |
A | 2021-08-11 | 100 | 830 |
A | 2021-08-12 | 100 | 830 |
A | 2021-08-13 | 10 | 830 |
A | 2021-08-23 | 10 | 40 |
A | 2021-08-24 | 10 | 40 |
A | 2021-08-26 | 10 | 40 |
A | 2021-08-28 | 10 | 40 |
I want to calculate the last two week total sum like twoweekSum= current week total sum Previous Week total sum i.e. current week is 34 then twoweekSum is 34 week total sum 33 week total sum.
Please help me in to get in this in like output data frame so I can use that for further analysis. Thank You folks !
CodePudding user response:
Use:
#convert values to datetimes
df['Date'] = pd.to_datetime(df['Date'])
#convert values to weeks
df['week'] = df['Date'].dt.isocalendar().week
#aggregate sum per ID and weeks, then add missing weeks and sum in rolling
f = lambda x: x.reindex(range(x.index.min(), x.index.max() 1))
.rolling(2, min_periods=1).sum()
df1 = df.groupby(['ID', 'week'])['Amount'].sum().reset_index(level=0).groupby('ID').apply(f)
print (df1)
Amount
ID week
A 31 320.0
32 830.0
33 510.0
34 40.0
#last add to original DataFrame per ID and weeks
df=df.join(df1.rename(columns={'Amount':'TwoWeekSum'}),on=['ID','week']).drop('week',axis=1)
print (df)
ID Date Amount TwoWeekSum
0 A 2021-08-03 100 320.0
1 A 2021-08-04 100 320.0
2 A 2021-08-06 20 320.0
3 A 2021-08-07 100 320.0
4 A 2021-08-09 300 830.0
5 A 2021-08-11 100 830.0
6 A 2021-08-12 100 830.0
7 A 2021-08-13 10 830.0
8 A 2021-08-23 10 40.0
9 A 2021-08-24 10 40.0
10 A 2021-08-26 10 40.0
11 A 2021-08-28 10 40.0
CodePudding user response:
Try using groupby
to group the dataframe by dt.week
, then use transform
sum
to add up the values weekly and repeat the values:
df['TwoWeekSum'] = df.groupby(df['Date'].dt.week)['Amount'].transform('sum')
And then:
print(df)
Gives:
ID Date Amount TwoWeekSum
0 A 2021-08-03 100 320
1 A 2021-08-04 100 320
2 A 2021-08-06 20 320
3 A 2021-08-07 100 320
4 A 2021-08-09 300 830
5 A 2021-08-11 100 830
6 A 2021-08-12 100 830
7 A 2021-08-13 10 830
8 A 2021-08-23 10 40
9 A 2021-08-24 10 40
10 A 2021-08-26 10 40
11 A 2021-08-28 10 40
CodePudding user response:
per = pd.period_range(df['Date'].min(), df['Date'].max(), freq='w')
mapper = df.groupby(df['Date'].astype('Period[W]')).sum().reindex(per, fill_value=0).rolling(2, 1).sum()['Amount']
out = df['Date'].astype('Period[W]').map(mapper)
out
0 320.0
1 320.0
2 320.0
3 320.0
4 830.0
5 830.0
6 830.0
7 830.0
8 40.0
9 40.0
10 40.0
11 40.0
Name: Date, dtype: float64
make out
to TwoWeekSum column
df.assign(TwoWeekSum=out)
ID Date Amount TwoWeekSum
0 A 2021-08-03 100 320.0
1 A 2021-08-04 100 320.0
2 A 2021-08-06 20 320.0
3 A 2021-08-07 100 320.0
4 A 2021-08-09 300 830.0
5 A 2021-08-11 100 830.0
6 A 2021-08-12 100 830.0
7 A 2021-08-13 10 830.0
8 A 2021-08-23 10 40.0
9 A 2021-08-24 10 40.0
10 A 2021-08-26 10 40.0
11 A 2021-08-28 10 40.0
Update
if each ID
, groupby and merge
per = pd.period_range(df['Date'].min(), df['Date'].max(), freq='w')
s = df['Date'].astype('Period[W]')
idx = pd.MultiIndex.from_product([df['ID'].unique(), per])
df1 = df.groupby(['ID', s]).sum().reindex(idx, fill_value=0).rolling(2, 1).agg(sum).reset_index().set_axis(['ID', 'period', 'TwoWeekSum'], axis=1)
df.assign(period=s).merge(df1, how='left').drop('period', axis=1)