create a regex that extracts hourly numeric values and completes with other values if it detects abs

Time:07-25

This is my regex attempt, but it doesn't work correctly, and I don't know how to improve it so that it implements this automatic time (hours and minutes) correction algorithm using regex and if validations.

Keep in mind that there must be at most 2 digits in the hours group, and the same applies to the minutes group.

Objective output: "XX:XX am or pm"

regex = r"[\s0-9][\s*0-9]\s*(:|)\s*[\s0-9][\s*0-9]"

Worked examples, with some of the possible inputs and the considerations needed to obtain the output format indicated above:

import re

input_text = "1 0 am"

add_left_zero = '0'

correction_time_1 = add_left_zero + '1'
correction_time_2 = add_left_zero + '2'

#this is what should be returned if the regex worked correctly
output = "01:00 am" #this is the objective output "XX:XX am"

input_text = "01 2 pm"

add_left_zero = '0'

correction_time_1 = str(1 + 12)    # 01 + 12 = 13
correction_time_2 = add_left_zero + '2'


#this is what should be returned if the regex worked correctly
output = "13:02 pm" #this is the objective output "XX:XX pm"

If the hour is greater than 12 (and different from 0 or 00), then put "pm" at the end of the string.

input_text = "20 55"

#this is what should be returned if the regex worked correctly
output = "20:55 pm"

input_text = "05 : 55"

#this is what should be returned if the regex worked correctly
output = "05:55 am"

If "am" is indicated at the end of the string, but the hour is greater than 12 and different from 0, then the "am" is corrected to "pm" and 12 units are added to the number of hours.

input_text = "21 : 55 am"

correction_time_1 = str(21 - 12)    # 21 - 12 = 9
correction_time_1 = add_left_zero + '9'
correction_time_2 = add_left_zero + '2'

#this is what should be returned if the regex worked correctly
output = "20:55 pm"

If only a single number appears, apply the same criteria, assuming that the number corresponds to the hours and not to the minutes.

input_text = "1"

#this is what should be returned if the regex worked correctly
output = "01:00 am"

input_text = "1 pm"

#this is what should be returned if the regex worked correctly
output = "13:00 pm"

input_text = "15"

#this is what should be returned if the regex worked correctly
output = "15:00 pm"

input_text = "15 am"

#this is what should be returned if the regex worked correctly
output = "03:00 am"

What should my regex and algorithm look like in order to extract the data needed to validate the conditions and make the appropriate replacements to obtain the desired output string?

CLARIFICATION:

If the examples aren't enough: the task is to build a regex that extracts the first 1 or 2 digits, followed by a ":" or whitespace, and then extracts the 1 or 2 digits that follow. Then a condition has to check each value: if it is a single digit, a 0 is put in front of it; if it already has 2 digits, nothing is added. As the last 2 examples show, one of the numeric values may be missing, in which case it must be assumed to be 00.

CodePudding user response:

Consider this regex:

pat = r'(\d{1,2})[\s:]*(\d{0,2})\s*(am|pm)?'

This will:

  • capture one or two digits
  • ignore zero or more whitespace or colon chars
  • capture zero, one or two digits
  • ignore zero or more whitespace chars
  • capture zero or one occurrences of "am" or "pm"

For instance:

import re

re.findall(pat, "1:15 am")
[('1', '15', 'am')]

re.findall(pat, "1  15")
[('1', '15', '')]

re.findall(pat, "0")
[('0', '', '')]
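
Note that the captured values are strings, and a missing part comes back as an empty string. A minimal sketch of turning the first match into usable values, assuming the pattern is written with `[\s:]` and `(am|pm)`; the helper name `parse_fields` is hypothetical and only for illustration:

```python
import re

# Pattern: hours, optional whitespace/colon separator, optional minutes, optional suffix.
pat = r'(\d{1,2})[\s:]*(\d{0,2})\s*(am|pm)?'

def parse_fields(text):
    """Hypothetical helper: return (hours, minutes, suffix) from the first match, or None."""
    matches = re.findall(pat, text)
    if not matches:
        return None                      # no digits found at all
    hh, mm, am_pm = matches[0]
    # An empty minutes string is treated as 0, as the question asks.
    return int(hh), int(mm or 0), am_pm

print(parse_fields("1:15 am"))   # (1, 15, 'am')
print(parse_fields("0"))         # (0, 0, '')
print(parse_fields("no time"))   # None
```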

From here you can check the correctness of the input (you don't want 32:94) and build the desired output. Please note that once you have three variables, say hh, mm, am_pm, you can use an f-string:

hh = 2
mm = 0
am_pm = 'am'
print(f'{hh:02d}:{mm:02d} {am_pm}')
02:00 am
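
Putting the pieces together, the whole correction could be sketched roughly as follows. This is only one possible reading of the question, not a definitive implementation: the function name `normalize_time` is hypothetical, and several rules are assumptions. The hour is kept in 24-hour form, "pm" adds 12 only when the hour is below 12, "12 am" is treated as 00, and an hour of 12 or more always ends in "pm" (so "15 am" becomes "15:00 pm" here, which differs from one of the question's examples).

```python
import re

# Hours, optional whitespace/colon separator, optional minutes, optional am/pm suffix.
PAT = re.compile(r'(\d{1,2})[\s:]*(\d{0,2})\s*(am|pm)?')

def normalize_time(text):
    """Hypothetical helper: return 'HH:MM am|pm' for a loosely formatted time string."""
    m = PAT.fullmatch(text.strip().lower())
    if m is None:
        raise ValueError(f"unrecognized time: {text!r}")
    hh = int(m.group(1))
    mm = int(m.group(2) or 0)        # missing minutes are assumed to be 00
    suffix = m.group(3)              # None when no am/pm was given

    if suffix == 'pm' and hh < 12:   # assumption: "1 pm" means 13
        hh += 12
    elif suffix == 'am' and hh == 12:  # assumption: "12 am" means 00
        hh = 0

    # Reject impossible values such as 32:94.
    if hh > 23 or mm > 59:
        raise ValueError(f"time out of range: {text!r}")

    # Assumption: the suffix is derived from the 24-hour value,
    # so a contradictory "am" after a large hour is replaced by "pm".
    am_pm = 'pm' if hh >= 12 else 'am'
    return f'{hh:02d}:{mm:02d} {am_pm}'

print(normalize_time("1 0 am"))    # 01:00 am
print(normalize_time("01 2 pm"))   # 13:02 pm
print(normalize_time("20 55"))     # 20:55 pm
```

Using `fullmatch` rather than `findall` means that trailing garbage makes the input fail instead of being silently ignored; whether that is desirable depends on how noisy the real inputs are.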

CodePudding user response:

It would help if you could be more specific about what you want this regex to do, and in particular about the end goal, with clearer examples. The current ones are unclear.
