Home > Software engineering >  How to shift some columns while the date index/column of dataframe also expands synchronously
How to shift some columns while the date index/column of dataframe also expands synchronously

Time:10-28

Given a data sample df, I'll move the value2 and value3 downward by one and two cells respectively, and you can see the following results:

        date     value1     value2     value3
0  2022-4-30  17.625324        NaN  23.195545
1  2022-5-31  21.158160  24.363270  19.404603
2  2022-6-30  19.346424  22.739159  10.832226
3  2022-7-31   9.413752  15.231505   9.954975
4  2022-8-31   0.468775   6.069972   6.238546
5  2022-9-30  -0.369166   2.310004   4.926893

df = df.set_index('date') 

df['value2'] = df['value2'].shift(1)
df['value3'] = df['value3'].shift(2)

Out:

              value1     value2     value3
date                                      
2022-4-30  17.625324        NaN        NaN
2022-5-31  21.158160        NaN        NaN
2022-6-30  19.346424  24.363270  23.195545
2022-7-31   9.413752  22.739159  19.404603
2022-8-31   0.468775  15.231505  10.832226
2022-9-30  -0.369166   6.069972   9.954975

But my expected result will be like this:

               value1     value2     value3
date              
2022-4-30   17.625324        NaN        NaN
2022-5-31   21.158160        NaN        NaN
2022-6-30   19.346424  24.363270  23.195545
2022-7-31    9.413752  22.739159  19.404603
2022-8-31    0.468775  15.231505  10.832226
2022-9-30   -0.369166   6.069972   9.954975
2022-10-31        NaN   2.310004   6.238546
2022-11-30        NaN        NaN   4.926893

How to achieve this result? Thanks!

CodePudding user response:

You need to compute the new dates, reindex, then shift:

# ensure datetime index
df.index = pd.to_datetime(df.index)

# max number of shifts to be performed later
max_shift = 2

# compute extra dates and reindex
new_idx = pd.date_range(df.index.min(),
                        df.index.max() pd.DateOffset(months=max_shift),
                        freq='M')

df = df.reindex(new_idx)

# shift
df['value2'] = df['value2'].shift(1)
df['value3'] = df['value3'].shift(2)

output:

               value1     value2     value3
2022-04-30  17.625324        NaN        NaN
2022-05-31  21.158160        NaN        NaN
2022-06-30  19.346424  24.363270  23.195545
2022-07-31   9.413752  22.739159  19.404603
2022-08-31   0.468775  15.231505  10.832226
2022-09-30  -0.369166   6.069972   9.954975
2022-10-31        NaN   2.310004   6.238546
2022-11-30        NaN        NaN   4.926893
  • Related