To all the regex gurus: any idea how to handle this beast?
string = 'Position_Name [+|-|/|*] PrevYear Position_Name'
I'm looking for a regex that matches both occurrences of Position_Name. It is basically a duplicate, but not quite: the first occurrence is followed by a special character and then by itself again with a prefix, here 'PrevYear'. Position_Name is dynamic and could be any word (e.g. Profit, Sales, etc.), but PrevYear will stay constant.
So how could I identify the lines where a position is mentioned twice with a math symbol in the middle (for now), and then capture those three elements? The plus could also be a / (divided by), a minus sign -, or a multiply *, which is what [+|-|/|*] in my example is meant to represent.
PS: I do not mind programming this in two steps ... so first matching and then capturing - but still would need the regex to find these little gems (in hundreds of lines).
Elegantly finding dupes is not the problem, e.g. via \b(\w+) \1\b, but I have come to realize my capabilities are not sufficient for that combo.
Thanks for any hints and support.
CodePudding user response:
You can use
\b(\w+)\b\s*[-+/*]\s*PrevYear\s*\1\b
Details:
\b - a word boundary
(\w+) - Group 1: one or more word chars
\b - a word boundary
\s*[-+/*]\s* - a -, +, / or * enclosed with zero or more whitespace characters
PrevYear - a fixed word
\s* - zero or more whitespace characters
\1 - the same value as captured in Group 1
\b - a word boundary.
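The pattern above only captures the position name. Since the question also asks for the operator, here is a minimal Python sketch that wraps the operator and the repeated name in their own groups. The choice of Python, the function name find_prev_year_pairs, and the sample lines are assumptions for illustration only; the question does not name a language.

```python
import re

# Group 1: position name, group 2: operator, group 3: the repeated name
# (matched through a backreference to group 1).
PATTERN = re.compile(r"\b(\w+)\b\s*([-+/*])\s*PrevYear\s*(\1)\b")

def find_prev_year_pairs(lines):
    """Return (name, operator, repeated_name) for every line that matches."""
    results = []
    for line in lines:
        m = PATTERN.search(line)
        if m:
            results.append(m.groups())
    return results

# Hypothetical sample input, not taken from the original post.
sample_lines = [
    "Profit / PrevYear Profit",
    "Sales - PrevYear Sales",
    "Sales + PrevYear Profit",   # names differ, so this should not match
    "Costs*PrevYear Costs",      # whitespace around the operator is optional
]

matches = find_prev_year_pairs(sample_lines)
for name, op, repeated in matches:
    print(name, op, repeated)
```

Because \s* also allows zero whitespace after PrevYear, a string such as PrevYearProfit would match as well; replace that \s* with \s+ if at least one space is required.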