I need to check if an entry is within a person's shift:
The data looks like this:
timestamp = pd.DataFrame({
'Timestamp': ['01/02/2022 16:08:56','01/02/2022 16:23:31','01/02/2022 16:41:35','02/02/2022 16:57:41','02/02/2022 17:38:22','02/02/2022 17:50:56'],
'Person': ['A','B','A','B','B','A']
})
shift = pd.DataFrame({
'Date': ['01/02/2022','02/02/2022','01/02/2022','02/02/2022'],
'in':['13:00:00','13:00:00','14:00:00','14:00:00'],
'out': ['21:00:00','21:00:00','22:00:00','22:00:00'],
'Person': ['A','A','B','B']
})
CodePudding user response:
For this kind of merge
, an efficient method is to use merge_asof
:
timestamp['Timestamp'] = pd.to_datetime(timestamp['Timestamp'])
(pd.merge_asof(timestamp.sort_values(by='Timestamp'),
shift.assign(Timestamp=pd.to_datetime(shift['Date'] ' ' shift['in']),
ts_out=pd.to_datetime(shift['Date'] ' ' shift['out']),
).sort_values(by='Timestamp')
[['Person', 'Timestamp', 'ts_out']],
on='Timestamp', by='Person'
)
.assign(in_shift=lambda d: d['ts_out'].ge(d['Timestamp']))
.drop(columns=['ts_out'])
)
output:
Timestamp Person in_shift
0 2022-01-02 16:08:56 A True
1 2022-01-02 16:23:31 B True
2 2022-01-02 16:41:35 A True
3 2022-02-02 16:57:41 B True
4 2022-02-02 17:38:22 B True
5 2022-02-02 17:50:56 A True
CodePudding user response:
I assume that there is only one shift per person per day.
First I split the day and time from the timestamp dataframe. Then merge this with the shift dataframe on columns Person
and Date
. Then we only need to check whether the time from timestamp is between in
and out
.
timestamp[['Date', 'Time']] = timestamp.Timestamp.str.split(' ', 1, expand=True)
df_merge = timestamp.merge(shift, on=['Date', 'Person'])
df_merge['Timestamp_in_shift'] = (df_merge.Time <= df_merge.out) & (df_merge.Time >= df_merge['in'])
df_merge.drop(columns=['Date', 'Time'])
Output:
Timestamp Person in out Timestamp_in_shift
0 01/02/2022 16:08:56 A 13:00:00 21:00:00 True
1 01/02/2022 16:41:35 A 13:00:00 21:00:00 True
2 01/02/2022 16:23:31 B 14:00:00 22:00:00 True
3 02/02/2022 16:57:41 B 14:00:00 22:00:00 True
4 02/02/2022 17:38:22 B 14:00:00 22:00:00 True
5 02/02/2022 17:50:56 A 13:00:00 21:00:00 True