I have the following dataset:
Name Loc Site Date Total
Alex Italy A 12.31.2020 30
Alex Italy B 12.31.2020 40
Alex Italy B 12.30.2020 100
Alex Italy A 12.30.2020 80
Alex France A 12.28.2020 10
Alex France B 12.28.2020 20
Alex France B 12.27.2020 10
I want to add per each row the average of total in the day before the date per Name, Loc and Date
This is the outcome I'm looking for:
Name Loc Site Date Total Prv_Avg
Alex Italy A 12.31.2020 30 90
Alex Italy B 12.31.2020 40 90
Alex Italy B 12.30.2020 100 NULL
Alex Italy A 12.30.2020 80 NULL
Alex France A 12.28.2020 10 10
Alex France B 12.28.2020 20 10
Alex France B 12.27.2020 10 NULL
The Nulls are for rows where previous date couldn't be found.
I've tried rolling but got mixed up with the index.
CodePudding user response:
First aggregate mean
per 3 columns, add one day to MultiIndex
for match previous day and last use DataFrame.join
:
df['Date'] = pd.to_datetime(df['Date'])
s = df.groupby(['Name','Loc','Date'])['Total'].mean().rename('Prv_Avg')
print (s)
Name Loc Date
Alex France 2020-12-27 10
2020-12-28 15
Italy 2020-12-30 90
2020-12-31 35
Name: Prv_Avg, dtype: int64
s = s.rename(lambda x: x pd.Timedelta('1 day'), level=2)
print (s)
Name Loc Date
Alex France 2020-12-28 10
2020-12-29 15
Italy 2020-12-31 90
2021-01-01 35
Name: Prv_Avg, dtype: int64
df = df.join(s, on=['Name','Loc','Date'])
print (df)
Name Loc Site Date Total Prv_Avg
0 Alex Italy A 2020-12-31 30 90.0
1 Alex Italy B 2020-12-31 40 90.0
2 Alex Italy B 2020-12-30 100 NaN
3 Alex Italy A 2020-12-30 80 NaN
4 Alex France A 2020-12-28 10 10.0
5 Alex France B 2020-12-28 20 10.0
6 Alex France B 2020-12-27 10 NaN
CodePudding user response:
Another possible solution:
grp = ['Name', 'Loc', 'Date']
s = df.groupby(grp)['Total'].mean().shift().rename('Prv_Avg')
idx1 = s.index.get_level_values('Name').to_list()
idx2 = s.index.get_level_values('Loc').to_list()
df.merge(s.where((idx1 == np.roll(idx1, 1)) &
(idx2 == np.roll(idx2, 1))).reset_index(),
on= grp)
Output:
Name Loc Site Date Total Prv_Avg
0 Alex Italy A 12.31.2020 30 90.0
1 Alex Italy B 12.31.2020 40 90.0
2 Alex Italy B 12.30.2020 100 NaN
3 Alex Italy A 12.30.2020 80 NaN
4 Alex France A 12.28.2020 10 10.0
5 Alex France B 12.28.2020 20 10.0
6 Alex France B 12.27.2020 10 NaN