I would like to create datetime objects from a list of string timecodes like these. However, parse interprets them incorrectly for my use case.
from datetime import datetime
from dateutil import parser
timecodes = ['0:00', '0:01', '1:01', '10:01', '1:10:01']
for timecode in timecodes:
    dt = parser.parse(timecode)
    print(dt)
The list above comes from YouTube's transcript timecodes. When copied from the site, they use a variable format to designate hours, minutes, and seconds, based on elapsed time:
0:00 # 0 minutes, 0 seconds
0:01 # 0 minutes, 1 second
1:01 # 1 minute, 1 second
10:01 # 10 minutes, 1 second
1:10:01 # 1 hour, 10 minutes, 1 second
and parse results in (comments are my interpretations):
2022-10-24 00:00:00 # 0 minutes, 0 seconds
2022-10-24 00:01:00 # 1 minute, 0 seconds
2022-10-24 01:01:00 # 1 hour, 1 minute, 0 seconds
2022-10-24 10:01:00 # 10 hours, 1 minute, 0 seconds
2022-10-24 01:10:01 # 1 hour, 10 minutes, 1 second
That is, if a string is not a full timecode with hours, minutes, and seconds, parse treats the minutes as hours and the seconds as minutes.
How can I either make parse interpret two-field timecodes as minutes and seconds instead of hours and minutes, or adjust the timecodes so that they conform to the format parse expects?
CodePudding user response:
This is a little tricky but should work:
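import datetime

timecodes = ['0:00', '0:01', '1:01', '10:01', '1:10:01']
zeroes = ['0', '0', '0']
dt = []
for i in timecodes:
    sep = i.split(':')
    # Left-pad with zeroes so there are always three fields: hours, minutes, seconds
    sep = zeroes[:3 - len(sep)] + sep
    # Weight each field by its position; sep.index(s) would misread repeated values such as '10:10'
    seconds = sum(int(s) * 60 ** (2 - pos) for pos, s in enumerate(sep))
    dt.append(str(datetime.timedelta(seconds=seconds)))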
Output:
dt = ['0:00:00', '0:00:01', '0:01:01', '0:10:01', '1:10:01']
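If the goal is instead to adjust the strings so that a parser reads them as hours, minutes, and seconds, the same left-padding idea can be applied to the text itself. The following is only a sketch: `pad_timecode` is a hypothetical helper name, and it uses `datetime.strptime` from the standard library rather than dateutil. Note that `%H` only accepts hours up to 23, so this would not work for videos of 24 hours or longer.

```python
from datetime import datetime

def pad_timecode(timecode):
    """Left-pad 'M:SS' with zero fields to get three fields (hypothetical helper)."""
    fields = timecode.split(':')
    return ':'.join(['0'] * (3 - len(fields)) + fields)

timecodes = ['0:00', '0:01', '1:01', '10:01', '1:10:01']
padded = [pad_timecode(t) for t in timecodes]

# strptime fills in 1900-01-01 as the date, so only the time part is meaningful
parsed = [datetime.strptime(p, '%H:%M:%S') for p in padded]
for p in parsed:
    print(p.time())
```

This prints 00:00:00, 00:00:01, 00:01:01, 00:10:01, and 01:10:01, one per line.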
CodePudding user response:
Another option is to map the duration components to integers in reverse order (seconds, minutes, hours) and convert each to seconds by multiplying with the appropriate factor (1, 60, 3600) using zip. sum that up to get the total seconds, which you can convert to a timedelta:
from datetime import timedelta

timecodes = ['0:00', '0:01', '1:01', '10:01', '1:10:01']
for t in timecodes:
    print(
        timedelta(seconds=sum(a*b for a, b in zip(map(int, t.split(":")[::-1]), (1, 60, 3600))))
    )
0:00:00
0:00:01
0:01:01
0:10:01
1:10:01
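Since the question asks for datetime objects, the zip approach can be wrapped in a small function and the resulting timedelta added to a base date. This is a minimal sketch: `timecode_to_timedelta` is a hypothetical name, and the base date is an arbitrary assumption (a video's start time would be a more natural choice, if known).

```python
from datetime import datetime, timedelta

def timecode_to_timedelta(timecode):
    """Convert 'S', 'M:SS' or 'H:MM:SS' to a timedelta (hypothetical helper)."""
    fields = timecode.split(':')
    if not 1 <= len(fields) <= 3:
        raise ValueError(f'unexpected timecode: {timecode!r}')
    # Pair seconds, minutes, hours with their weights in seconds
    total = sum(int(f) * w for f, w in zip(reversed(fields), (1, 60, 3600)))
    return timedelta(seconds=total)

# Assumed base date; any date works
base = datetime(2022, 10, 24)
for t in ['0:00', '0:01', '1:01', '10:01', '1:10:01']:
    print(base + timecode_to_timedelta(t))
```

Unlike a time-of-day parser, a timedelta has no 24-hour limit, so a timecode such as 25:00:00 simply rolls over to the next day when added to the base date.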