How can I make sure that part of a pattern (a keyword, in this case) is present in the match when it can appear in different places? I want a match only when the keyword occurs at least once.
Regex:
\b(([0-9])(xyz)?([-]([0-9])(xyz)?)?)\b
We only want the value if it contains the keyword xyz.
Examples:
1. 1xyz-2xyz - it's OK
2. 1-2xyz - it's OK
3. 1xyz - it's OK
4. 1-2 - there should be no match, because at least one xyz is required
I tried positive lookahead and lookbehind, but they do not work in this case.
CodePudding user response:
You can make use of a conditional construct:
\b([0-9])(xyz)?(?:-([0-9])(xyz)?)?\b(?(2)|(?(4)|(?!)))
See the regex demo. Details:
\b - word boundary
([0-9]) - Group 1: a digit
(xyz)? - Group 2: an optional xyz string
(?:-([0-9])(xyz)?)? - an optional sequence of a -, a digit (Group 3), and an optional xyz char sequence (Group 4)
\b - word boundary
(?(2)|(?(4)|(?!))) - a conditional: if Group 2 (first (xyz)?) matched, it is fine, return the match; if not, check if Group 4 (second (xyz)?) matched, and return the match if yes; else, fail the match.
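To make the group logic above concrete, here is a minimal sketch that applies the same check outside the regex: it matches with the pattern minus the trailing conditional and then keeps only matches where Group 2 or Group 4 participated. The names BASE and matches_with_keyword are hypothetical, chosen for illustration only, and the sample string is an assumption.

```python
import re

# Same pattern as above, but WITHOUT the trailing conditional.
# The "Group 2 or Group 4 must have matched" test is done in Python instead.
BASE = re.compile(r"\b([0-9])(xyz)?(?:-([0-9])(xyz)?)?\b")

def matches_with_keyword(text):
    """Return matches in which at least one (xyz)? group participated."""
    return [
        m.group()
        for m in BASE.finditer(text)
        # group(n) is None when that optional group did not take part in the match
        if m.group(2) is not None or m.group(4) is not None
    ]

print(matches_with_keyword("1xyz-2xyz 1-2xyz 1xyz 1-2"))
```

The conditional in the regex performs this same test during matching, so no post-filtering is needed when the engine supports conditionals.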
See the Python demo:
import re
text = "1. 1xyz-2xyz - it's OK\n2. 1-2xyz - it's OK\n3. 1xyz - it's OK\n4. 1-2 - there should be no match"
pattern = r"\b([0-9])(xyz)?(?:-([0-9])(xyz)?)?\b(?(2)|(?(4)|(?!)))"
print([x.group() for x in re.finditer(pattern, text)])
Output:
['1xyz-2xyz', '1-2xyz', '1xyz']
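If the regex engine in use has no conditional construct, one possible workaround is to spell out the allowed shapes as alternatives. The sketch below is an assumption about how that could look, not part of the answer above; ALT and find_values are hypothetical names.

```python
import re

text = "1. 1xyz-2xyz - it's OK\n2. 1-2xyz - it's OK\n3. 1xyz - it's OK\n4. 1-2 - there should be no match"

# Alternative 1: the first digit carries xyz; the second part is optional.
# Alternative 2: the first digit has no xyz, so the second digit must carry it.
ALT = re.compile(r"\b[0-9]xyz(?:-[0-9](?:xyz)?)?\b|\b[0-9]-[0-9]xyz\b")

def find_values(s):
    """Return every value that contains the keyword at least once."""
    return [m.group() for m in ALT.finditer(s)]

print(find_values(text))
```

This version has no capturing groups, so it only fits cases where the whole match is what you need.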
CodePudding user response:
Try this: \b(([0-9])?(xyz)+([-]([0-9])+(xyz)+)?)\b
Replace ? with +. Basically, ? means zero or one, and in your case you want to match one or more, which is +.