I would like to split a string on a delimiter while ignoring a particular pattern. I have lines in a text file that look like this:
ABC | 0 | 567 | my name is | however
TQD | 0 | 567 | my name is | but
GED | 0 | 567 | my name is | haha
I would like to split on "|" but ignore 0 and 567 and keep the rest, i.e.:
['ABC', 'my name is', 'however']
['TQD', 'my name is', 'but']
['GED', 'my name is', 'haha']
Whenever I split, it grabs the two numbers as well. Numbers can occur in other places, but this particular pattern of |0|567| needs to be ignored. I can obviously split on "|" and pop the elements at indices 1 and 2, but I'm looking for a better way.
I tried this:
import re
pattern = re.compile(r'\|(?!0|567)')
pattern.split(line)
This yields [ABC|0|567, my name is, however]
CodePudding user response:
To make the specific numbers, together with their | delimiters, part of the split sequence:
pattern = re.compile(r' *\|(?: *(?:0|567) *\|)* *')
See this demo at regex101 or a Python demo at tio.run
The (?: non-capturing group ) is repeated with * any number of times.
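As a quick check, here is a minimal sketch that applies the pattern above to the three sample lines from the question. The names `lines`, `results`, `strict_pattern` and `strict_results` are made up for illustration. Note that the pattern above swallows any run of `0` or `567` fields, in any order and any count. If only the exact sequence `| 0 | 567 |` should be dropped, a stricter variant (my assumption about what may be wanted, not part of the answer above) could look like `strict_pattern`.

```python
import re

# Pattern from the answer: a pipe, then any number of "0 |" or "567 |" fields,
# with optional spaces around each piece, all consumed as a single separator.
pattern = re.compile(r' *\|(?: *(?:0|567) *\|)* *')

# Hypothetical stricter variant: only the exact sequence "0 | 567 |" is absorbed
# into the separator; a lone 0 or 567 field elsewhere is kept.
strict_pattern = re.compile(r' *\|(?: *0 *\| *567 *\|)? *')

# Sample lines copied from the question.
lines = [
    "ABC | 0 | 567 | my name is | however",
    "TQD | 0 | 567 | my name is | but",
    "GED | 0 | 567 | my name is | haha",
]

results = [pattern.split(line) for line in lines]
strict_results = [strict_pattern.split(line) for line in lines]

for fields in results:
    print(fields)
```

For the sample data both patterns give the same lists, e.g. `['ABC', 'my name is', 'however']` for the first line. They differ only when a `0` or `567` field shows up on its own: `"ABC | 0 | 567 | 0 | x"` becomes `['ABC', 'x']` with the first pattern and `['ABC', '0', 'x']` with the stricter one.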