Parse string to convert time in hours, minutes and seconds to only seconds


YouTube video durations are given in the form PT1H15M9S.

I need to parse this string to convert the duration to seconds.

I tried the following code:

import re

def to_seconds(string=None):
    if 'H' in string:
        h = string.index('H')
        m = string.index('M')
        secs = re.sub('[^0-9]','',string[-3:])
        mins = re.sub('[^0-9]','',string[m-2:m+1])
        hours = re.sub('[^0-9]','',string[h-2:h+1])
    elif 'M' in string:
        m = string.index('M')
        secs = re.sub('[^0-9]','',string[-3:])
        mins = re.sub('[^0-9]','',string[m-2:m+1])
        hours = 0
    else:
        secs = re.sub('[^0-9]','',string[-3:])
        mins, hours = 0, 0
    return int(hours) * 60 * 60 + int(mins) * 60 + int(secs)

However, this code fails whenever a component is missing, because some strings contain hours but no minutes, or no seconds, and so on.

For example PT1H15S, or PT1H, or PT12H4M.

How can I get this code to work in such cases?

CodePudding user response:

Sample and explanation of terms: https://regex101.com/r/l8cNAP/2


r"PT(?:(?P<h>\d+)H)?(?:(?P<m>\d+)M)?(?:(?P<s>\d+)S)?"

Query the resulting match object (or its groupdict()), using 0 as the default for any group that did not match.
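
A minimal sketch of that lookup, assuming each number in the pattern is meant to be matched with `\d+`. The function name `to_seconds` is borrowed from the question; the use of `re.fullmatch` and the `ValueError` for unrecognised input are my own assumptions rather than part of this answer.

```python
import re

# Each component is optional; \d+ requires at least one digit when present.
DURATION = re.compile(r"PT(?:(?P<h>\d+)H)?(?:(?P<m>\d+)M)?(?:(?P<s>\d+)S)?")

def to_seconds(duration):
    match = DURATION.fullmatch(duration)
    if match is None:
        raise ValueError(f"unsupported duration: {duration!r}")
    # groupdict(default="0") substitutes "0" for any group that did not match.
    parts = match.groupdict(default="0")
    return int(parts["h"]) * 3600 + int(parts["m"]) * 60 + int(parts["s"])

print(to_seconds("PT1H15M9S"))  # 4509
print(to_seconds("PT1H15S"))    # 3615
print(to_seconds("PT1H"))       # 3600
print(to_seconds("PT12H4M"))    # 43440
```

Note that this sketch only covers the hour, minute and second components shown in the question; a duration containing a day part would raise `ValueError`.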

CodePudding user response:

import re

apiDuration = 'PT1H15M9S'
regex = r"PT(?:(?P<hours>\d*)H)?(?:(?P<minutes>\d*)M)?(?:(?P<seconds>\d*)S)?"

parsedDuration = re.match(regex, apiDuration)

print(parsedDuration.groups()) # ('1', '15', '9')
print(parsedDuration.group('hours'))  # 1
print(parsedDuration.group('minutes'))  # 15
print(parsedDuration.group('seconds'))  # 9
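
The groups are still strings at this point, and a component that is absent from the input comes back as None, so the conversion to seconds remains to be done. One possible way is sketched below using `datetime.timedelta`, which is my choice and not part of the answer. The helper name `duration_to_seconds` is hypothetical, and the sketch assumes `\d+` rather than `\d*` so that a bare `H` with no digits is rejected instead of matching an empty string.

```python
import re
from datetime import timedelta

regex = r"PT(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?"

def duration_to_seconds(apiDuration):
    parsedDuration = re.fullmatch(regex, apiDuration)
    if parsedDuration is None:
        raise ValueError(f"unsupported duration: {apiDuration!r}")
    # Unmatched groups are None; drop them so timedelta falls back to 0.
    parts = {name: int(value)
             for name, value in parsedDuration.groupdict().items()
             if value is not None}
    # The group names match timedelta's keyword arguments.
    return int(timedelta(**parts).total_seconds())

print(duration_to_seconds('PT1H15M9S'))  # 4509
print(duration_to_seconds('PT12H4M'))    # 43440
```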

