regex to replace based on specific last character

Time:07-30

I have the strings below. Here PT is Pacific Time, though it could be another zone too. The number and the hour/minute unit are the important parts of this text.

PT45M
PT1H
PT30M
PT
PT2H

Here the last character is either M or H, meaning minutes or hours. If nothing follows PT, the value is 0.

How can I get output like the below?

45
60
30
0
120

I'm getting confused about how to perform all the required checks in a single regex:

1 - Check whether it has the right pattern (PT<AnyNumber><H|M>)
2 - If yes, check whether the last character is H or M
3 - If it's H, multiply by 60
4 - If point 1 is false, return 0

Any suggestions, please?

CodePudding user response:

You can use the pattern PT(\d+)([HM]) (Regex101):

import re

test_cases = ["PT45M", "PT1H", "PT30M", "PT", "PT2H"]

pat = re.compile(r"PT(\d+)([HM])")

for t in test_cases:
    out, m = 0, pat.match(t)
    if m:
        out = int(m.group(1)) * (60 if m.group(2) == "H" else 1)
    print(t, out)

Prints:

PT45M 45
PT1H 60
PT30M 30
PT 0
PT2H 120
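
If trailing text after the unit should count as a non-match (re.match only anchors at the start of the string), a stricter variant could use re.fullmatch. This is a minimal sketch of the same idea with the pattern PT(\d+)([HM]); the helper name to_minutes is an illustrative assumption, not part of the answer above:

```python
import re

# fullmatch requires the whole string to fit PT<number><H|M>.
PATTERN = re.compile(r"PT(\d+)([HM])")


def to_minutes(text):
    """Return the duration in minutes, or 0 if text does not fit the pattern."""
    m = PATTERN.fullmatch(text)
    if m is None:
        return 0
    value, unit = int(m.group(1)), m.group(2)
    # Hours are converted to minutes; minutes are returned unchanged.
    return value * 60 if unit == "H" else value


print([to_minutes(t) for t in ["PT45M", "PT1H", "PT30M", "PT", "PT2H"]])
# [45, 60, 30, 0, 120]
```

With this version, an input such as "PT2Hxyz" yields 0 instead of 120.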

CodePudding user response:

It probably has its weaknesses, but this is the solution I just came up with:

import re

times = ["PT45M", "PT1H", "PT30M", "PT", "PT2H"]

check_format_1 = r"(PT)(\d{0,2})([HM]{0,1})(\d{0,2})([HM]{0,1})"

for string in times:
    res = re.match(check_format_1, string)
    hours = (
        res.group(2)
        if res.group(3) != "M"
        else res.group(4)
        if res.group(5) != "M" and res.group(3) != "H"
        else "0"
    )
    if len(hours) == 0:
        hours = "0"

    minutes = (
        res.group(2)
        if res.group(3) != "H"
        else res.group(4)
        if res.group(5) != "H" and res.group(3) != "M"
        else "0"
    )
    if len(minutes) == 0:
        minutes = "0"

    print("-----")
    print(string)
    print(f"{hours} hours")
    print(f"{minutes} minutes")
    print(f"Total {int(minutes) + int(hours) * 60} minutes")

It returns this output:

-----
PT45M
0 hours
45 minutes
Total 45 minutes
-----
PT1H
1 hours
0 minutes
Total 60 minutes
-----
PT30M
0 hours
30 minutes
Total 30 minutes
-----
PT
0 hours
0 minutes
Total 0 minutes
-----
PT2H
2 hours
0 minutes
Total 120 minutes
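
If a single string can carry both parts (for example PT1H30M, which is an assumption here, since the question only shows one unit per string), optional named groups may be easier to follow than nested conditionals. A rough sketch; the names DURATION and total_minutes are hypothetical:

```python
import re

# Optional hour part followed by an optional minute part; both may be absent ("PT").
DURATION = re.compile(r"PT(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?")


def total_minutes(text):
    """Return total minutes, or 0 when the string does not fit the pattern."""
    m = DURATION.fullmatch(text)
    if m is None:
        return 0
    # A group that did not take part in the match is None, so fall back to 0.
    hours = int(m.group("hours") or 0)
    minutes = int(m.group("minutes") or 0)
    return hours * 60 + minutes


for s in ["PT45M", "PT1H", "PT30M", "PT", "PT2H", "PT1H30M"]:
    print(s, total_minutes(s))
```

Unlike the loop above, this returns 0 for strings that do not start with PT rather than failing on a None match.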

CodePudding user response:

I came up with:

import re
l1 = ['PT45M', 'PT1H', 'PT30M', 'PT', 'PT2H']
l2 = [eval(s[2:].replace('M','').replace('H','*60')) if re.match(r'^PT\d+[HM]$', s) else 0 for s in l1]
print(l2)

Prints:

[45, 60, 30, 0, 120]

  • Loop over all elements of the list using a list comprehension;
  • Check whether each element matches the pattern ^PT\d+[HM]$;
  • If it matches, take the substring from the 3rd character onwards, replace 'M' with an empty string and 'H' with '*60';
  • Evaluate the resulting expression with eval() to get the number of minutes;
  • If no match occurs return 0.
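
One caveat with eval(): a value with a leading zero, such as PT08M, would raise a SyntaxError in Python 3, because 08 is not a valid integer literal. An eval-free version of the same list comprehension might look like the sketch below. It assumes Python 3.8+ for the := operator, and FACTOR is just an illustrative name:

```python
import re

FACTOR = {"H": 60, "M": 1}  # minutes per unit
pat = re.compile(r"PT(\d+)([HM])")

l1 = ['PT45M', 'PT1H', 'PT30M', 'PT', 'PT2H', 'PT08M']
# int() accepts leading zeros, and no arbitrary expression is evaluated.
l2 = [int(m.group(1)) * FACTOR[m.group(2)] if (m := pat.fullmatch(s)) else 0 for s in l1]
print(l2)
# [45, 60, 30, 0, 120, 8]
```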