Find change in time in a pandas data frame

Time:09-17

I have the following lists extracted from a pandas DataFrame. They are basically a start date and time and an end date and time, and I want to find the difference between them.

start_date  = ['29.12.2020', '29.12.2020', '30.12.2020', '30.12.2020', '30.12.2020']

start_time = [datetime.time(11, 10), datetime.time(23, 15), datetime.time(5, 15), datetime.time(11, 15), datetime.time(17, 15)]

end_date = ['29.12.2020', '30.12.2020', '30.12.2020', '30.12.2020', '30.12.2020']
end_time = [datetime.time(23, 15), datetime.time(5, 15), datetime.time(11, 15), datetime.time(17, 15), datetime.time(23, 15)]

So, I would like to combine the dates and times into a single start point and a single end point in time, and then find the difference between the two in HH:MM format.

For example, if we take the first row, the difference between both points in time should be 12:05 (12 hours and 5 minutes)

Thanks beforehand, I hope my explanation was clear :)

CodePudding user response:

import datetime

import numpy as np
import pandas as pd

start_date  = ['29.12.2020', '29.12.2020', '30.12.2020', '30.12.2020', '30.12.2020']
start_time = [datetime.time(11, 10), datetime.time(23, 15), datetime.time(5, 15), datetime.time(11, 15), datetime.time(17, 15)]
end_date = ['29.12.2020', '30.12.2020', '30.12.2020', '30.12.2020', '30.12.2020']
end_time = [datetime.time(23, 15), datetime.time(5, 15), datetime.time(11, 15), datetime.time(17, 15), datetime.time(23, 15)]

df = pd.DataFrame(data={'start_date': start_date,
                        'start_time': start_time,
                        'end_date': end_date,
                        'end_time': end_time})

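# Join date and time as text, then parse with an explicit day-first format
df['start_date_time'] = pd.to_datetime(df['start_date'] + ' ' + df['start_time'].astype(str), format='%d.%m.%Y %H:%M:%S')
df['end_date_time'] = pd.to_datetime(df['end_date'] + ' ' + df['end_time'].astype(str), format='%d.%m.%Y %H:%M:%S')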

df['diff'] = (df['end_date_time'] - df['start_date_time'])
df['hours'] = df['diff']/ np.timedelta64(1, 'h')
# Whole hours, then the leftover minutes, each zero-padded to two digits
df['HH:MM'] = df['hours'].astype(int).astype(str).str.zfill(2) + ':' + ((df['diff'] / np.timedelta64(1, 'm')) - (df['hours'].astype(int) * 60)).astype(int).astype(str).str.zfill(2)
print(df[['start_date_time', 'end_date_time', 'HH:MM']])

Output:

      start_date_time       end_date_time HH:MM
0 2020-12-29 11:10:00 2020-12-29 23:15:00 12:05
1 2020-12-29 23:15:00 2020-12-30 05:15:00 06:00
2 2020-12-30 05:15:00 2020-12-30 11:15:00 06:00
3 2020-12-30 11:15:00 2020-12-30 17:15:00 06:00
4 2020-12-30 17:15:00 2020-12-30 23:15:00 06:00
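
If you need this in more than one place, the formatting step can be pulled into a small helper. This is only a sketch: format_hhmm is a made-up name, not a pandas function, and it assumes every difference is non-negative. It works from the total number of minutes, so spans longer than a day roll into the hour count instead of being lost.

```python
import pandas as pd


def format_hhmm(diff: pd.Series) -> pd.Series:
    """Format a Series of non-negative Timedelta values as zero-padded HH:MM strings.

    Hypothetical helper for illustration; leftover seconds are truncated.
    """
    # Whole minutes contained in each difference
    total_minutes = (diff.dt.total_seconds() // 60).astype(int)
    hours = (total_minutes // 60).astype(str).str.zfill(2)
    minutes = (total_minutes % 60).astype(str).str.zfill(2)
    return hours + ':' + minutes


# First two rows from the question, already joined into single strings
start = pd.to_datetime(pd.Series(['29.12.2020 11:10', '29.12.2020 23:15']), format='%d.%m.%Y %H:%M')
end = pd.to_datetime(pd.Series(['29.12.2020 23:15', '30.12.2020 05:15']), format='%d.%m.%Y %H:%M')
print(format_hhmm(end - start).tolist())
```

With the frame from the answer above it would be called as df['HH:MM'] = format_hhmm(df['diff']).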

CodePudding user response:

You can try this one.

import datetime

start_date  = ['29.12.2020', '29.12.2020', '30.12.2020', '30.12.2020', '30.12.2020']

start_time = [datetime.time(11, 10), datetime.time(23, 15), datetime.time(5, 15), datetime.time(11, 15), datetime.time(17, 15)]

end_date = ['29.12.2020', '30.12.2020', '30.12.2020', '30.12.2020', '30.12.2020']
end_time = [datetime.time(23, 15), datetime.time(5, 15), datetime.time(11, 15), datetime.time(17, 15), datetime.time(23, 15)]


for start_d, start_t, end_d, end_t in zip(start_date, start_time,end_date, end_time):
    
    start_date_time = datetime.datetime.strptime(start_d, '%d.%m.%Y')
    start_date_time = start_date_time.replace(hour=start_t.hour, minute=start_t.minute)
        
    end_date_time = datetime.datetime.strptime(end_d, '%d.%m.%Y')
    end_date_time = end_date_time.replace(hour=end_t.hour, minute=end_t.minute)
    
    time_diff = end_date_time - start_date_time
    diff_str = f"{time_diff.days} days {time_diff.seconds // 3600} hours {(time_diff.seconds // 60) % 60} minutes"
    print(diff_str)
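
The loop above prints days, hours and minutes separately. If you want the HH:MM string from the question instead, one possible variant with only the standard library is sketched below. hhmm_between is a hypothetical name, and the sketch assumes the end point is never earlier than the start point. It uses datetime.datetime.combine to join each date with its time and divmod to split the total minutes.

```python
import datetime


def hhmm_between(start_d, start_t, end_d, end_t, date_format='%d.%m.%Y'):
    """Return the difference between two date/time pairs as an HH:MM string.

    Hypothetical helper; assumes end >= start. Leftover seconds are truncated.
    """
    start = datetime.datetime.combine(
        datetime.datetime.strptime(start_d, date_format).date(), start_t)
    end = datetime.datetime.combine(
        datetime.datetime.strptime(end_d, date_format).date(), end_t)
    # total_seconds() includes whole days, so spans over 24 hours are kept
    total_minutes = int((end - start).total_seconds() // 60)
    hours, minutes = divmod(total_minutes, 60)
    return f"{hours:02d}:{minutes:02d}"


# First row from the question
print(hhmm_between('29.12.2020', datetime.time(11, 10),
                   '29.12.2020', datetime.time(23, 15)))  # 12:05
```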