Home > Back-end >  Iterate over multiple columns and replace the values a prior a row (increment) with null values in t
Iterate over multiple columns and replace the values a prior a row (increment) with null values in t

Time:08-26

Given a dataframe df as follows:

          date      value     20211003     20211010     20211017
0    2021-9-19  3613.9663          NaN          NaN          NaN
1    2021-9-26  3613.0673          NaN          NaN          NaN
2    2021-10-3  3568.1668          NaN          NaN          NaN
3   2021-10-10  3592.1666  3510.221000          NaN          NaN
4   2021-10-17  3572.3662  3465.737012  3534.220800          NaN
5   2021-10-24  3582.6036  3479.107035  3539.856801  3514.420400
6   2021-10-31  3547.3361  3421.161235  3481.911001  3456.474600
7    2021-11-7  3491.5677  3370.140147  3439.284539  3416.621024
8   2021-11-14  3539.1002  3319.289523  3391.930037  3370.079953
9   2021-11-21  3560.3734  3261.343723  3333.984237  3312.134153
10  2021-11-28  3564.0894  3255.328902  3338.967086  3305.054247
11   2021-12-5  3607.4320  3313.274702  3396.912886  3363.000047
12  2021-12-12  3666.3479  3371.220502  3450.172564  3412.234440
13  2021-12-19  3632.3638          NaN  3466.930383  3428.683490
14  2021-12-26  3618.0535          NaN          NaN  3370.737690

Let's say the columns after value column (20211003, 20211010 and 20211017) are rolling forecast result of value, instead of 10 values for each column, I'll need to keep 3 values only. Here is the slicing rule: from left to right, from bottom to top to keep 3 values for each date column, so row 2021-11-28 from column 20211003 will be the starting point, and then increase day by day. The expected result will like this:

          date      value     20211003     20211010    20211017
0    2021-9-19  3613.9663          NaN          NaN         NaN
1    2021-9-26  3613.0673          NaN          NaN         NaN
2    2021-10-3  3568.1668          NaN          NaN         NaN
3   2021-10-10  3592.1666          NaN          NaN         NaN
4   2021-10-17  3572.3662          NaN          NaN         NaN
5   2021-10-24  3582.6036          NaN          NaN         NaN
6   2021-10-31  3547.3361          NaN          NaN         NaN
7    2021-11-7  3491.5677          NaN          NaN         NaN
8   2021-11-14  3539.1002          NaN          NaN         NaN
9   2021-11-21  3560.3734          NaN          NaN         NaN
10  2021-11-28  3564.0894  3255.328902          NaN         NaN
11   2021-12-5  3607.4320  3313.274702  3396.912886         NaN
12  2021-12-12  3666.3479  3371.220502  3450.172564  3412.23444
13  2021-12-19  3632.3638          NaN  3466.930383  3428.68349
14  2021-12-26  3618.0535          NaN          NaN  3370.73769

How could I achieve that in Pandas? Thanks.

Reference:

Iterate over multiple columns and replace the values in these columns after a row (increment) with null values

CodePudding user response:

df.iloc[:, :2].join(df.iloc[:, 2:].apply(lambda x:x.dropna().tail(3)))
 
          date      value     20211003     20211010    20211017
0    2021-9-19  3613.9663          NaN          NaN         NaN
1    2021-9-26  3613.0673          NaN          NaN         NaN
2    2021-10-3  3568.1668          NaN          NaN         NaN
3   2021-10-10  3592.1666          NaN          NaN         NaN
4   2021-10-17  3572.3662          NaN          NaN         NaN
5   2021-10-24  3582.6036          NaN          NaN         NaN
6   2021-10-31  3547.3361          NaN          NaN         NaN
7    2021-11-7  3491.5677          NaN          NaN         NaN
8   2021-11-14  3539.1002          NaN          NaN         NaN
9   2021-11-21  3560.3734          NaN          NaN         NaN
10  2021-11-28  3564.0894  3255.328902          NaN         NaN
11   2021-12-5  3607.4320  3313.274702  3396.912886         NaN
12  2021-12-12  3666.3479  3371.220502  3450.172564  3412.23444
13  2021-12-19  3632.3638          NaN  3466.930383  3428.68349
14  2021-12-26  3618.0535          NaN          NaN  3370.73769
  • Related