How to find numbers following specific punctuation

Time:10-05

Text:

3. MANAGEMENT, FOOD EMPLOYEE  Comments 234:  FOUND NO EMPLOYEE  ISSUED. | 5. PROCEDURES FOR RESPONDING TO VOMITING AND DIARRHEAL EVENTS - Comments:   | 10. ADEQUATE HANDWASHING SINKS 7-38-030(C), NO CITATION ISSUED.  | 47. FOOD & NON-FOOD 

Background: the entries are separated by |. The goal is to find the number at the start of the text and every number that immediately follows a | separator.

The desired output is [3, 5, 10, 47]

Note: 234 and 7-38-030 must not be matched.

CodePudding user response:

import re

def multi_re_find(patterns, phrase):
    '''
    Takes in a list of regex patterns
    Prints a list of all matches
    '''
    for pattern in patterns:
        print('Searching the phrase using the re check: %r' % pattern)
        print(re.findall(pattern, phrase))
        print('\n')

# text is the string shown under "Text:" above
test_patterns = [r'\W+\d+']
multi_re_find(test_patterns, text)

Outcome:[' 234', '. | 5', ': | 10', ' 7', '-38', '-030', '. | 47']

My approach missed the number 3 and wrongly included 234, 7, -38, and -030 in the result.

CodePudding user response:

You can use a regex with a lookahead:

s = '3. MANAGEMENT, FOOD EMPLOYEE  Comments 234:  FOUND NO EMPLOYEE  ISSUED. | 5. PROCEDURES FOR RESPONDING TO VOMITING AND DIARRHEAL EVENTS - Comments:   | 10. ADEQUATE HANDWASHING SINKS 7-38-030(C), NO CITATION ISSUED.  | 47. FOOD & NON-FOOD'

import re

re.findall(r'\d+(?=\.)', s)

Or, to ensure the number is at the beginning of the string or right after a |, you can also add a lookbehind:

re.findall(r'(?:(?<=^)|(?<=\| ))\d+(?=\.)', s)

Output:

['3', '5', '10', '47']
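
The difference between the two patterns only shows up when a number followed by a period appears in the middle of an entry. A minimal sketch, assuming a made-up input (`sample` is hypothetical and not part of the original data):

```python
import re

# Hypothetical input: "12." ends a sentence inside the first entry.
sample = '3. SEE ITEM 12. NO CITATION | 5. OTHER'

# Lookahead only: any run of digits directly followed by a period.
lookahead_only = r'\d+(?=\.)'

# Lookbehind added: the digits must also sit at the start of the
# string or directly after "| ".
anchored = r'(?:(?<=^)|(?<=\| ))\d+(?=\.)'

print(re.findall(lookahead_only, sample))  # ['3', '12', '5']
print(re.findall(anchored, sample))        # ['3', '5']
```

Note that the lookbehind expects exactly one space after the |, which matches the text in the question; input with different spacing would need a different pattern.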

And to get a list of integers:

list(map(int, re.findall(r'\d+(?=\.)', s)))
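
Since the entries are separated by |, another option is to split first and take the leading digits of each entry. This is only a sketch of an alternative, not the approach above; the helper name `leading_numbers` is hypothetical.

```python
import re

s = ('3. MANAGEMENT, FOOD EMPLOYEE  Comments 234:  FOUND NO EMPLOYEE  ISSUED. '
     '| 5. PROCEDURES FOR RESPONDING TO VOMITING AND DIARRHEAL EVENTS - Comments:   '
     '| 10. ADEQUATE HANDWASHING SINKS 7-38-030(C), NO CITATION ISSUED.  '
     '| 47. FOOD & NON-FOOD')

def leading_numbers(text):
    """Return the integer that starts each |-separated entry (hypothetical helper)."""
    result = []
    for entry in text.split('|'):
        # re.match only matches at the start of the entry, so numbers
        # further inside it (234, 7-38-030) are never considered.
        m = re.match(r'\s*(\d+)', entry)
        if m:
            result.append(int(m.group(1)))
    return result

print(leading_numbers(s))  # [3, 5, 10, 47]
```

This version does not depend on the period after the number or on the exact spacing around the |, and entries that do not start with a number are skipped.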