Home > Blockchain >  How can I check if a timestamp entry is within a time range and with a person filter in two differen
How can I check if a timestamp entry is within a time range and with a person filter in two differen

Time:08-27

I need to check if an entry is within a person's shift:

The data looks like this:

timestamp = pd.DataFrame({
    'Timestamp': ['01/02/2022 16:08:56','01/02/2022 16:23:31','01/02/2022 16:41:35','02/02/2022 16:57:41','02/02/2022 17:38:22','02/02/2022 17:50:56'],
    'Person': ['A','B','A','B','B','A']
})

shift = pd.DataFrame({
    'Date': ['01/02/2022','02/02/2022','01/02/2022','02/02/2022'],
    'in':['13:00:00','13:00:00','14:00:00','14:00:00'],
    'out': ['21:00:00','21:00:00','22:00:00','22:00:00'],
    'Person': ['A','A','B','B']
})

CodePudding user response:

For this kind of merge, an efficient method is to use merge_asof:

timestamp['Timestamp'] = pd.to_datetime(timestamp['Timestamp'])

(pd.merge_asof(timestamp.sort_values(by='Timestamp'),
               shift.assign(Timestamp=pd.to_datetime(shift['Date'] ' ' shift['in']),
                            ts_out=pd.to_datetime(shift['Date'] ' ' shift['out']),
                            ).sort_values(by='Timestamp')
                    [['Person', 'Timestamp', 'ts_out']],
               on='Timestamp', by='Person'
              )
   .assign(in_shift=lambda d: d['ts_out'].ge(d['Timestamp']))
   .drop(columns=['ts_out']) 
)

output:

            Timestamp Person  in_shift
0 2022-01-02 16:08:56      A      True
1 2022-01-02 16:23:31      B      True
2 2022-01-02 16:41:35      A      True
3 2022-02-02 16:57:41      B      True
4 2022-02-02 17:38:22      B      True
5 2022-02-02 17:50:56      A      True

CodePudding user response:

I assume that there is only one shift per person per day.

First I split the day and time from the timestamp dataframe. Then merge this with the shift dataframe on columns Person and Date. Then we only need to check whether the time from timestamp is between in and out.

timestamp[['Date', 'Time']] = timestamp.Timestamp.str.split(' ', 1, expand=True)

df_merge = timestamp.merge(shift, on=['Date', 'Person'])

df_merge['Timestamp_in_shift'] = (df_merge.Time <= df_merge.out) & (df_merge.Time >= df_merge['in'])

df_merge.drop(columns=['Date', 'Time'])

Output:

             Timestamp Person        in       out  Timestamp_in_shift
0  01/02/2022 16:08:56      A  13:00:00  21:00:00                True
1  01/02/2022 16:41:35      A  13:00:00  21:00:00                True
2  01/02/2022 16:23:31      B  14:00:00  22:00:00                True
3  02/02/2022 16:57:41      B  14:00:00  22:00:00                True
4  02/02/2022 17:38:22      B  14:00:00  22:00:00                True
5  02/02/2022 17:50:56      A  13:00:00  21:00:00                True
  • Related