What I'm trying to achieve
Given the three strings below, I am trying to match the Y, t and f toward the beginning of each string.
15 YstfAWIN25 desired matches -> Y tf
15 YstfMSIN25 desired matches -> Y tf
15 Ystf20IN25 desired matches -> Y tf
The regular expression ([ftY]) meets my objective; however, it is too brittle and would yield erroneous results with minor changes to the string. For example, 15 YstfMYIN25 would result in the match Y tf Y, and I don't want to match that second Y.
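To make the brittleness concrete, here is a minimal sketch in Python. The language is an assumption, since the question does not name a regex flavor, and the helper name `all_hits` is hypothetical. It only shows that a bare character class picks up every occurrence, including the unwanted trailing Y.

```python
import re

def all_hits(text):
    # Hypothetical helper: return every character matched by the bare class [ftY].
    return re.findall(r"([ftY])", text)

# The original sample: only the three wanted letters are present.
print(all_hits("15 YstfAWIN25"))
# The modified sample: the second Y is picked up as well.
print(all_hits("15 YstfMYIN25"))
```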
As a result, I tried using a non-capture group to limit the characters that would be matched.
([ftY])(?:AW|MS|\d )
This regular expression yields the following match when a second Y (15 YstfMYIN25) is included in the full string:
15 YstfMYIN25 actual match -> f
The addition of the non-capture group made the expression skip over the Y and t. I did play around with making the capture group greedy and the non-capture group lazy, but I got the same result. Is there a way to use a non-capture group (or otherwise) to limit the characters that can be captured and still capture all the characters of interest? In this example, Y tf only.
I have some examples below:
https://regex101.com/r/EDPqsl/1 https://regex101.com/r/R1tiXz/1
Answer:
You can use
^.*?([ftY]).*?(?!\1)([ftY]).*?(?!\1|\2)([ftY])
See the regex demo. The three letters will land in three separate capturing groups.
Details:
^ - start of string
.*? - any zero or more chars other than line break chars, as few as possible
([ftY]) - Group 1: f, t or Y
.*? - any zero or more chars other than line break chars, as few as possible
(?!\1)([ftY]) - Group 2: f, t or Y, but not the value captured in Group 1
.*? - any zero or more chars other than line break chars, as few as possible
(?!\1|\2)([ftY]) - Group 3: f, t or Y, but not the value captured in Groups 1 and 2.
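The pattern above can be tried out with a short sketch. Python's `re` module is an assumption here (the question does not state a flavor), and the function name `first_three_distinct` is made up for illustration. The negative lookaheads with backreferences keep a repeated letter, such as the second Y, from being captured again.

```python
import re

# The pattern from the answer, unchanged.
PATTERN = re.compile(r"^.*?([ftY]).*?(?!\1)([ftY]).*?(?!\1|\2)([ftY])")

def first_three_distinct(text):
    # Hypothetical helper: return the three captured letters as a tuple,
    # or None when the string does not contain three distinct letters from [ftY].
    match = PATTERN.search(text)
    return match.groups() if match else None

for sample in ("15 YstfAWIN25", "15 YstfMSIN25", "15 Ystf20IN25", "15 YstfMYIN25"):
    print(sample, "->", first_three_distinct(sample))
```

Each sample, including the one with the extra Y, yields the groups Y, t and f. A string that lacks one of the three letters produces no match at all, so the caller should check for None.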