I have the following dataframe:
DATETIME,TYPE
2021-10-13 18:04:52,NaN
2021-10-13 18:04:53,NaN
2021-10-13 18:04:54,NaN
2021-10-13 18:04:55,NaN
2021-10-13 18:04:56,NaN
2021-10-13 18:04:57,NaN
2021-10-13 18:04:58,Defect
2021-10-13 18:04:59,NaN
2021-10-13 18:05:00,NaN
2021-10-13 18:05:01,NaN
2021-10-13 18:05:02,NaN
2021-10-13 18:05:03,NaN
2021-10-13 18:05:04,NaN
2021-10-13 18:05:05,NaN
2021-10-13 18:05:06,NaN
2021-10-13 18:05:07,NaN
2021-10-13 18:05:08,NaN
2021-10-13 18:05:09,NaN
2021-10-13 18:05:10,Defect
2021-10-13 18:05:11,NaN
2021-10-13 18:05:12,NaN
2021-10-13 18:05:13,NaN
2021-10-13 18:05:14,NaN
2021-10-13 18:05:15,NaN
2021-10-13 18:05:16,NaN
2021-10-13 18:05:17,NaN
2021-10-13 18:05:18,NaN
2021-10-13 18:05:19,NaN
2021-10-13 18:05:20,NaN
2021-10-13 18:05:21,NaN
As you can see, TYPE is Defect at 18:04:58 and 18:05:10. How can I also mark the previous row (18:04:57) and the next row (18:04:59) as Defect? The same goes for 18:05:09 and 18:05:11.
I tried defining a function and using apply, but that does not work because each value is passed to the function as a string instead of an array.
Desired output:
2021-10-13 18:04:52,NaN
2021-10-13 18:04:53,NaN
2021-10-13 18:04:54,NaN
2021-10-13 18:04:55,NaN
2021-10-13 18:04:56,NaN
2021-10-13 18:04:57,Defect
2021-10-13 18:04:58,Defect
2021-10-13 18:04:59,Defect
2021-10-13 18:05:00,NaN
2021-10-13 18:05:01,NaN
2021-10-13 18:05:02,NaN
2021-10-13 18:05:03,NaN
2021-10-13 18:05:04,NaN
2021-10-13 18:05:05,NaN
2021-10-13 18:05:06,NaN
2021-10-13 18:05:07,NaN
2021-10-13 18:05:08,NaN
2021-10-13 18:05:09,Defect
2021-10-13 18:05:10,Defect
2021-10-13 18:05:11,Defect
2021-10-13 18:05:12,NaN
2021-10-13 18:05:13,NaN
2021-10-13 18:05:14,NaN
2021-10-13 18:05:15,NaN
2021-10-13 18:05:16,NaN
2021-10-13 18:05:17,NaN
2021-10-13 18:05:18,NaN
2021-10-13 18:05:19,NaN
2021-10-13 18:05:20,NaN
2021-10-13 18:05:21,NaN
CodePudding user response:
Try shift and loc assignment:
df.loc[df['TYPE'].shift().eq('Defect') | df['TYPE'].shift(-1).eq('Defect'), 'TYPE'] = 'Defect'
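To try the one-liner above end to end, here is a minimal self-contained sketch. It rebuilds the 30 rows from the question (the construction code and the variable names is_defect and neighbour are my own, not from the question) and applies the same shift/loc idea through an explicit boolean mask:

```python
import numpy as np
import pandas as pd

# Rebuild the sample data: 30 one-second rows starting at 18:04:52,
# with "Defect" at positions 6 (18:04:58) and 18 (18:05:10).
start = pd.Timestamp("2021-10-13 18:04:52")
types = [np.nan] * 30
types[6] = types[18] = "Defect"
df = pd.DataFrame({
    "DATETIME": start + pd.to_timedelta(np.arange(30), unit="s"),
    "TYPE": types,
})

# True where the row itself is a defect.
is_defect = df["TYPE"].eq("Defect")

# shift(1) moves each flag down one row (marks the row after a defect),
# shift(-1) moves it up one row (marks the row before a defect).
neighbour = is_defect.shift(1, fill_value=False) | is_defect.shift(-1, fill_value=False)

# Write "Defect" into every neighbouring row.
df.loc[neighbour, "TYPE"] = "Defect"

defect_rows = df.index[df["TYPE"].eq("Defect")].tolist()
print(defect_rows)  # [5, 6, 7, 17, 18, 19]
```

Note that the assignment overwrites whatever was in the neighbouring rows, which is harmless here because they are all NaN.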
CodePudding user response:
You can use combine_first twice:
df['TYPE'] = df['TYPE'].combine_first(df['TYPE'].shift()) \
.combine_first(df['TYPE'].shift(-1))
print(df)
# Output:
DATETIME TYPE
0 2021-10-13 18:04:52 NaN
1 2021-10-13 18:04:53 NaN
2 2021-10-13 18:04:54 NaN
3 2021-10-13 18:04:55 NaN
4 2021-10-13 18:04:56 NaN
5 2021-10-13 18:04:57 Defect
6 2021-10-13 18:04:58 Defect
7 2021-10-13 18:04:59 Defect
8 2021-10-13 18:05:00 NaN
9 2021-10-13 18:05:01 NaN
10 2021-10-13 18:05:02 NaN
11 2021-10-13 18:05:03 NaN
12 2021-10-13 18:05:04 NaN
13 2021-10-13 18:05:05 NaN
14 2021-10-13 18:05:06 NaN
15 2021-10-13 18:05:07 NaN
16 2021-10-13 18:05:08 NaN
17 2021-10-13 18:05:09 Defect
18 2021-10-13 18:05:10 Defect
19 2021-10-13 18:05:11 Defect
20 2021-10-13 18:05:12 NaN
21 2021-10-13 18:05:13 NaN
22 2021-10-13 18:05:14 NaN
23 2021-10-13 18:05:15 NaN
24 2021-10-13 18:05:16 NaN
25 2021-10-13 18:05:17 NaN
26 2021-10-13 18:05:18 NaN
27 2021-10-13 18:05:19 NaN
28 2021-10-13 18:05:20 NaN
29 2021-10-13 18:05:21 NaN
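One thing to keep in mind with combine_first: it fills a NaN with whatever non-null value sits in the neighbouring row, not only Defect. That is fine for the data in the question, where Defect is the only label. If the column could also hold other labels, restricting the spread to Defect with a boolean mask may be safer. The sketch below illustrates the difference on a made-up five-row Series; the helper spread_label and its parameter n are hypothetical names for this example, not part of pandas.

```python
import numpy as np
import pandas as pd

def spread_label(series, label, n=1):
    """Hypothetical helper: copy `series` and write `label` into the
    n rows before and after every row that already equals `label`."""
    hit = series.eq(label).astype(int)
    # A centred rolling max over 2*n + 1 rows is 1 wherever a hit lies
    # within n rows; min_periods=1 keeps the edges from becoming NaN.
    near = hit.rolling(2 * n + 1, center=True, min_periods=1).max().astype(bool)
    out = series.copy()
    out[near] = label  # overwrites any existing value in those rows
    return out

# Made-up data with a second label, to show the difference.
s = pd.Series(["Other", np.nan, np.nan, "Defect", np.nan], dtype=object)

# combine_first also copies "Other" into the row after it.
via_combine = s.combine_first(s.shift()).combine_first(s.shift(-1))

# The mask-based version only spreads "Defect".
via_mask = spread_label(s, "Defect")

print(via_combine.tolist())
print(via_mask.tolist())
```

With n=2 the same helper would mark two rows on each side of every Defect.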