Given a data sample df
, I'll move the value2
and value3
downward by one and two cells respectively, and you can see the following results:
date value1 value2 value3
0 2022-4-30 17.625324 NaN 23.195545
1 2022-5-31 21.158160 24.363270 19.404603
2 2022-6-30 19.346424 22.739159 10.832226
3 2022-7-31 9.413752 15.231505 9.954975
4 2022-8-31 0.468775 6.069972 6.238546
5 2022-9-30 -0.369166 2.310004 4.926893
df = df.set_index('date')
df['value2'] = df['value2'].shift(1)
df['value3'] = df['value3'].shift(2)
Out:
value1 value2 value3
date
2022-4-30 17.625324 NaN NaN
2022-5-31 21.158160 NaN NaN
2022-6-30 19.346424 24.363270 23.195545
2022-7-31 9.413752 22.739159 19.404603
2022-8-31 0.468775 15.231505 10.832226
2022-9-30 -0.369166 6.069972 9.954975
But my expected result will be like this:
value1 value2 value3
date
2022-4-30 17.625324 NaN NaN
2022-5-31 21.158160 NaN NaN
2022-6-30 19.346424 24.363270 23.195545
2022-7-31 9.413752 22.739159 19.404603
2022-8-31 0.468775 15.231505 10.832226
2022-9-30 -0.369166 6.069972 9.954975
2022-10-31 NaN 2.310004 6.238546
2022-11-30 NaN NaN 4.926893
How to achieve this result? Thanks!
CodePudding user response:
You need to compute the new dates, reindex
, then shift
:
# ensure datetime index
df.index = pd.to_datetime(df.index)
# max number of shifts to be performed later
max_shift = 2
# compute extra dates and reindex
new_idx = pd.date_range(df.index.min(),
df.index.max() pd.DateOffset(months=max_shift),
freq='M')
df = df.reindex(new_idx)
# shift
df['value2'] = df['value2'].shift(1)
df['value3'] = df['value3'].shift(2)
output:
value1 value2 value3
2022-04-30 17.625324 NaN NaN
2022-05-31 21.158160 NaN NaN
2022-06-30 19.346424 24.363270 23.195545
2022-07-31 9.413752 22.739159 19.404603
2022-08-31 0.468775 15.231505 10.832226
2022-09-30 -0.369166 6.069972 9.954975
2022-10-31 NaN 2.310004 6.238546
2022-11-30 NaN NaN 4.926893