Issue doing greedy search using Python

Time:04-22

This is my input text: "you have a choice between 1, 2 or 3 bedrooms"

I want to get the number of bedrooms, so one or more numbers before "bedroom" (allowing: ',', '-', 'and', '&', 'or', and 'whitespace' between numbers)

I have tried this pattern: (1|2|3|4|5|6|,|-|\s|&|and|or){1,12}bedroom on regex101 and it works fine.

But my Python code below does not work:

import re

text = "you have a choice between 1, 2 or 3 bedrooms"
number_range_pattern = r"(1|2|3|4|5|6|,|-|\s|&|and|or){1,12}"
bedrooms = re.search(number_range_pattern + r"bedroom", text)
if bedrooms and len(bedrooms.groups()) >= 1:
    match = bedrooms.group(1) # <-- match is a whitespace

Result: match is a single whitespace character

I want the result to be: "1, 2 or 3"

CodePudding user response:

Here is a working solution:

import re

text = "you have a choice between 1, 2 or 3 bedrooms"
m = re.search(r'\d+(?:,? (?:(?:and|or|&) )?\d+)*', text)
if m:
    print(m.group())  # 1, 2 or 3

The regex pattern here could use an explanation:

\d+                   match a number
(?:
    ,?                optional comma separator
    [ ]               space
    (?:
        (?:and|or|&)  and, or, & conjunction
        [ ]           followed by space
    )?                and/or/& zero or one time
    \d+               another number
)*                    zero or more times
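
The breakdown above maps directly onto re.VERBOSE, which lets the comments live inside the pattern. The following is a minimal sketch, not part of the original answer. The name find_bedroom_counts and the trailing lookahead for "bedroom" are assumptions, added so that unrelated numbers elsewhere in the text are skipped.

```python
import re

# Verbose form of the pattern explained above. The final lookahead is an
# assumption added for illustration: it ties the numbers to "bedroom".
BEDROOM_COUNT = re.compile(
    r"""
    \d+                       # first number
    (?:
        ,?                    # optional comma separator
        [ ]                   # space
        (?:(?:and|or|&)[ ])?  # optional conjunction followed by a space
        \d+                   # another number
    )*                        # zero or more further numbers
    (?=\s*bedroom)            # must be followed by "bedroom" (not consumed)
    """,
    re.VERBOSE,
)


def find_bedroom_counts(text):
    """Return the matched number list as a string, or None if absent."""
    m = BEDROOM_COUNT.search(text)
    return m.group() if m else None


print(find_bedroom_counts("you have a choice between 1, 2 or 3 bedrooms"))
```

In verbose mode a literal space has to be written as [ ] (or escaped), which is why the explanation above uses that form.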

CodePudding user response:

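You need to print(bedrooms.group(0)) instead of bedrooms.group(1). A repeated capturing group keeps only its last iteration, which here is the space just before "bedroom". Note that group(0) is the entire match, so it also includes the leading space and the word "bedroom".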
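
To see the difference, the sketch below compares the two groups on the question's pattern. It then shows one possible variant (an assumption, not something proposed in the thread) that makes the inner group non-capturing and wraps the whole repetition in a single capturing group.

```python
import re

text = "you have a choice between 1, 2 or 3 bedrooms"

# Pattern from the question: the capturing group itself is repeated.
repeated = re.search(r"(1|2|3|4|5|6|,|-|\s|&|and|or){1,12}bedroom", text)
last_iteration = repeated.group(1)  # only the final repetition
whole_match = repeated.group(0)     # entire match, including "bedroom"

# Variant: non-capturing inner group, one capturing group around the repetition.
wrapped = re.search(r"((?:1|2|3|4|5|6|,|-|\s|&|and|or){1,12})bedroom", text)
numbers = wrapped.group(1).strip()  # strip the surrounding spaces

print(repr(last_iteration))  # ' '
print(repr(whole_match))     # ' 1, 2 or 3 bedroom'
print(repr(numbers))         # '1, 2 or 3'
```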
