Home > Blockchain >  How to get the timeslots from the start time and end time in the pandas dataframe
How to get the timeslots from the start time and end time in the pandas dataframe

Time:12-24

I have a pandas dataframe where it has start_time, end_time and booking duration.

Please find below example dataframe

id Start_time End_time Duration
1 2013-11-20 09:00:00 2013-11-20 09:30:00 0 days 0:30:00
2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 3:10:00
3 2013-11-20 10:00:00 2013-11-20 11:00:00 0 days 1:00:00
4 2013-11-20 08:00:00 2013-11-20 09:40:00 0 days 1:40:00

I am trying to get the timeslots from the above dataframe

Expected output:

id Start_time End_time Duration Timeslots
1 2013-11-20 09:00:00 2013-11-20 09:30:00 0 days 0:30:00 9 - 10
2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 3:10:00 9-10
2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 3:10:00 10-11
2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 3:10:00 11-12
3 2013-11-20 10:00:00 2013-11-20 11:00:00 0 days 1:00:00 10 - 11
4 2013-11-20 08:00:00 2013-11-20 09:40:00 0 days 1:40:00 8-9
4 2013-11-20 08:00:00 2013-11-20 09:40:00 0 days 1:40:00 9-10

What I have tried till now

I can get the slots from start_time and end_time but i am missing the expected output

id Start_time End_time Duration TimeSlot
1 2013-11-20 09:00:00 2013-11-20 09:30:00 0 days 0:30:00 9-9:30
2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 3:10:00 9-12:10
3 2013-11-20 10:00:00 2013-11-20 11:00:00 0 days 1:00:00 10-11
4 2013-11-20 08:00:00 2013-11-20 09:40:00 0 days 1:40:00 8 - 9:40

Can anyone give some hints please

CodePudding user response:

Try:

def get_slots(row):
    dti = pd.date_range(row['Start_time'].floor('H'), 
                        row['End_time'].ceil('H'), freq='H')
    return [f"{s.hour:02}-{e.hour:02}" for s, e in zip(dti, dti[1:])]
    
out = df.assign(Timeslots=df.apply(get_slots, axis=1)).explode('Timeslots')
print(out)

# Output:
   id          Start_time            End_time        Duration Timeslots
0   1 2013-11-20 09:00:00 2013-11-20 09:30:00 0 days 00:30:00     09-10
1   2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 03:10:00     09-10
1   2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 03:10:00     10-11
1   2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 03:10:00     11-12
1   2 2013-11-20 09:00:00 2013-11-20 12:10:00 0 days 03:10:00     12-13
2   3 2013-11-20 10:00:00 2013-11-20 11:00:00 0 days 01:00:00     10-11
3   4 2013-11-20 08:00:00 2013-11-20 09:40:00 0 days 01:40:00     08-09
3   4 2013-11-20 08:00:00 2013-11-20 09:40:00 0 days 01:40:00     09-10

Setup to be reproducible:

import pandas as pd
from pandas import Timestamp, Timedelta

data = {
'id': [1, 2, 3, 4],
'Start_time': [Timestamp('2013-11-20 09:00:00'), Timestamp('2013-11-20 09:00:00'), 
               Timestamp('2013-11-20 10:00:00'), Timestamp('2013-11-20 08:00:00')], 
'End_time': [Timestamp('2013-11-20 09:30:00'), Timestamp('2013-11-20 12:10:00'),
             Timestamp('2013-11-20 11:00:00'), Timestamp('2013-11-20 09:40:00')],
'Duration': [Timedelta('0 days 00:30:00'), Timedelta('0 days 03:10:00'),
             Timedelta('0 days 01:00:00'), Timedelta('0 days 01:40:00')]
}

df = pd.DataFrame(data)
  • Related