I have a binary DataFrame df1
with ones and zeros and I would like to forward fill ones for 2 additional days for each column as in df2
:
df1
Date ID1 ID2 ID3
0 2021-01-01 0 1 0
1 2021-01-02 0 0 0
2 2021-01-03 1 0 1
3 2021-01-04 0 0 0
4 2021-01-05 0 0 1
5 2021-01-06 0 0 0
6 2021-01-07 0 0 0
7 2021-01-08 0 0 0
df2
Date ID1 ID2 ID3
0 2021-01-01 0 1 0
1 2021-01-02 0 1 0
2 2021-01-03 1 1 1
3 2021-01-04 1 0 1
4 2021-01-05 1 0 1
5 2021-01-06 0 0 1
6 2021-01-07 0 0 1
7 2021-01-08 0 0 0
For reproducibility:
import pandas as pd
df1 = pd.DataFrame({
'Date':['2021-01-01', '2021-01-02', '2021-01-03', '2021-01-04', '2021-01-05', '2021-01-06', '2021-01-07', '2021-01-08'],
'ID1':[0,0,1,0,0,0,0,0],
'ID2':[1,0,0,0,0,0,0,0],
'ID3':[0,0,1,0,1,0,0,0]})
Is there an efficient and easy way to deal with that? Thanks a lot.
CodePudding user response:
As you have no missing days, use shift
:
out = df1.set_index('Date').astype(bool)
out = (out | out.shift(1, fill_value=False) | out.shift(2, fill_value=False)) \
.astype(int).reset_index()
print(out)
# Output:
Date ID1 ID2 ID3
0 2021-01-01 0 1 0
1 2021-01-02 0 1 0
2 2021-01-03 1 1 1
3 2021-01-04 1 0 1
4 2021-01-05 1 0 1
5 2021-01-06 0 0 1
6 2021-01-07 0 0 1
7 2021-01-08 0 0 0
Update
In case I also want to assign a one before, how would I need to adjust?
>>> df1.set_index('Date').rolling(4, min_periods=1, center=True).max() \
.astype(int).reset_index()
Date ID1 ID2 ID3
0 2021-01-01 0 1 0
1 2021-01-02 1 1 1
2 2021-01-03 1 1 1
3 2021-01-04 1 0 1
4 2021-01-05 1 0 1
5 2021-01-06 0 0 1
6 2021-01-07 0 0 1
7 2021-01-08 0 0 0
CodePudding user response:
Just apply a rolling
with max
:
df1.set_index('Date').rolling(3, min_periods=1).max().convert_dtypes().reset_index()
If there can be missing dates:
(df1.assign(Date=pd.to_datetime(df1['Date']))
.set_index('Date')
.rolling('3d')
.max()
.convert_dtypes()
.reset_index()
)
output:
Date ID1 ID2 ID3
0 2021-01-01 0 1 0
1 2021-01-02 0 1 0
2 2021-01-03 1 1 1
3 2021-01-04 1 0 1
4 2021-01-05 1 0 1
5 2021-01-06 0 0 1
6 2021-01-07 0 0 1
7 2021-01-08 0 0 0