I want to calculate lags of multiple columns. I am able to do that for each column separately as shown below. How can I avoid the duplicate groupby and sorting.
### Pandas previous week values
search = search.assign(asp_lstwk2 = search.sort_values(by = 'firstdayofweek').groupby('asin_bk')['asp'].shift(1))\
.assign(lbb_lstwk2 = search.sort_values(by = 'firstdayofweek').groupby('asin_bk')['lbb'].shift(1))\
.assign(repoos_lstwk2 = search.sort_values(by = 'firstdayofweek').groupby('asin_bk')['repoos'].shift(1))\
.assign(ordered_units_lstwk2 = search.sort_values(by = 'firstdayofweek').groupby('asin_bk')['ordered_units'].shift(1))
CodePudding user response:
Try:
search = search.join(search.sort_values(by = 'firstdayofweek')
.groupby('asin_bk')[['asp','lbb','repoos','ordered_units']]
.shift().add_suffix('_lstwk2'))