pandas shifting missing months

Time:08-12

Let's assume the following dataframe and shift operation:

import pandas as pd

d = {'col1': ['2022-01-01','2022-02-01','2022-03-01','2022-05-01'], 'col2': [1,2,3,4]}
df = pd.DataFrame(d)
df['shifted'] = df['col2'].shift(1, fill_value=0)

I want to create a column containing the value of the previous month, filled with 0 where the previous month does not exist in the data. The desired result would look like this:

col1 col2 shifted
2022-01-01 1 0
2022-02-01 2 1
2022-03-01 3 2
2022-05-01 4 0

So in the last line the value is 0 because there is no data for April.


But at the moment it looks like this:

col1 col2 shifted
2022-01-01 1 0
2022-02-01 2 1
2022-03-01 3 2
2022-05-01 4 3

Does anyone know how to achieve this?

CodePudding user response:

One idea is to create a monthly PeriodIndex, so that the shift moves by calendar month instead of by row, and then reindex back to the original months, filling the missing values with 0:

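# Index by monthly period ('M'; the lowercase alias is deprecated in recent pandas)
df = df.set_index(pd.to_datetime(df['col1']).dt.to_period('M'))
# Move each value one month forward, then realign to the original months (gaps become 0)
df['shifted'] = df['col2'].shift(1, freq=df.index.freq).reindex(df.index, fill_value=0)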

print (df)
               col1  col2  shifted
col1                              
2022-01  2022-01-01     1        0
2022-02  2022-02-01     2        1
2022-03  2022-03-01     3        2
2022-05  2022-05-01     4        0

Finally, the PeriodIndex can be removed if it is no longer needed:

df = df.reset_index(drop=True)
print (df)
         col1  col2  shifted
0  2022-01-01     1        0
1  2022-02-01     2        1
2  2022-03-01     3        2
3  2022-05-01     4        0
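
The same result can probably also be reached without replacing the DataFrame index at all. The following is only a sketch of that variant, not part of the answer above: it builds a month-to-value lookup and queries it with "month minus one". The names months and by_month are made up for illustration, and it assumes there is at most one row per month (reindex needs a unique index).

```python
import pandas as pd

df = pd.DataFrame({
    'col1': ['2022-01-01', '2022-02-01', '2022-03-01', '2022-05-01'],
    'col2': [1, 2, 3, 4],
})

# Month of each row as a PeriodIndex (assumption: at most one row per month).
months = pd.PeriodIndex(pd.to_datetime(df['col1']).dt.to_period('M'))

# Lookup table: month -> col2 value.
by_month = pd.Series(df['col2'].to_numpy(), index=months)

# For every row, look up the previous calendar month;
# months that are absent from the data fall back to 0.
df['shifted'] = by_month.reindex(months - 1, fill_value=0).to_numpy()

print(df)
```

Because the lookup is done by calendar month rather than by position, the May row gets 0 (April is missing) while the original RangeIndex and column layout stay untouched.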