I want to match any word or run of characters, except:
apple
apple 3
This is the string:
orange lemon 2 apple 3 pear
I tried this pattern, but it didn't work:
\b(?!apple|apple \d+)\b\S+
CodePudding user response:
You could use re.findall here to find all terms, followed by a list comprehension to filter that list:
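import re

inp = "orange lemon 2 apple 3 pear"
# Each term is a word, optionally followed by a space and a number
terms = re.findall(r'\w+(?: \d+)?', inp)
# Drop any term that starts with "apple" followed by a number
output = [x for x in terms if not re.search(r'^apple \d+', x)]
print(output) # ['orange', 'lemon 2', 'pear']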
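Note that the filter above only drops "apple" when a number follows it. Here is a minimal sketch, assuming a bare "apple" should be excluded too, as the question lists it. The sample input and the name `excluded` are made up for illustration:

```python
import re

# Hypothetical input that also contains a bare "apple".
inp = "orange apple lemon 2 apple 3 pear"

# A word, optionally followed by a space and a number.
terms = re.findall(r'\w+(?: \d+)?', inp)

# fullmatch rejects "apple" on its own as well as "apple <number>".
excluded = re.compile(r'apple(?: \d+)?')
output = [x for x in terms if not excluded.fullmatch(x)]
print(output)  # ['orange', 'lemon 2', 'pear']
```

Using fullmatch instead of search with ^ also keeps terms such as "apples" from being filtered by accident.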
CodePudding user response:
Another way is to use this trick to match what you don't want but capture what you need.
pattern = r'\b(?:apple(?: \d+)?|(\S+))\b'
See this demo at regex101 (more explanation on the right side).
Use it with findall:
...
Return all non-overlapping matches of pattern in string, as a list of strings or tuples... matches are returned in the order found. Empty matches are included in the result...
If there is exactly one group, return a list of strings matching that group.
So the empty matches need to be removed from the result.
res = [x for x in re.findall(pattern, my_str) if x]
And here is the Python demo at tio.run
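For reference, the pieces of this answer can be put together in a self-contained sketch. It assumes the quantifiers in the pattern read `\d+` and `\S+`, and uses the sample string from the question as `my_str`. The intermediate name `matches` is only for illustration:

```python
import re

my_str = "orange lemon 2 apple 3 pear"  # sample string from the question

# Left branch consumes "apple" (optionally with a number) without capturing;
# right branch captures any other run of non-whitespace characters.
pattern = r'\b(?:apple(?: \d+)?|(\S+))\b'

# With one capture group, findall returns that group for every match,
# so a match of the left branch shows up as an empty string.
matches = re.findall(pattern, my_str)
print(matches)  # ['orange', 'lemon', '2', '', 'pear']

# Drop the empty strings left behind by the "apple" matches.
res = [x for x in matches if x]
print(res)  # ['orange', 'lemon', '2', 'pear']
```

Unlike the first answer, this version returns "lemon" and "2" as separate items, because `\S+` stops at whitespace.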