Regular expression using non-greedy matching -- confusing result

Time:03-24

I thought I understood how the non-greedy modifier works, but am confused by the following result:

  • Regular Expression: (,\S+?)_sys$

  • Test String: abc,def,ghi,jkl_sys

  • Desired result: ,jkl_sys <- last field including comma

  • Actual result: ,def,ghi,jkl_sys

My use case: I have a comma-separated string whose last field will end in "_sys" (e.g. ,sometext_sys). I want to match only the last field, and only if it ends with _sys.

I am using the non-greedy (?) modifier to return the shortest possible match (only the last field including the comma), but it returns all but the first field (i.e. the longest match).

What am I missing?

I used https://regex101.com/ to test, in case you want to see a live example.

CodePudding user response:

You can use

,[^,]+_sys$

The pattern matches:

  • , Match the last comma
  • [^,]+ Match one or more occurrences of any char except ,
  • _sys Match literally
  • $ End of string
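
As a minimal sketch of the breakdown above, here is the pattern applied in Python. Python is an assumption (the question does not name a language), and the helper name `last_sys_field` is made up for illustration.

```python
import re

# Comma, then one or more non-comma characters, then "_sys" at the end of the string.
LAST_SYS_FIELD = re.compile(r",[^,]+_sys$")


def last_sys_field(line):
    """Return the last field (with its leading comma) if it ends in _sys, else None."""
    match = LAST_SYS_FIELD.search(line)
    return match.group(0) if match else None


print(last_sys_field("abc,def,ghi,jkl_sys"))  # ,jkl_sys
print(last_sys_field("abc,def,ghi,jkl"))      # None
```

Because `[^,]` cannot cross a comma, the match can only begin at the last comma, so no lazy quantifier is needed.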

If you don't want to match newlines or other whitespace:

,[^\s,]+_sys$
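
To illustrate the difference between the two variants, here is a small Python comparison (Python and the sample string are assumptions chosen for illustration):

```python
import re

loose = re.compile(r",[^,]+_sys$")     # last field may contain whitespace
strict = re.compile(r",[^\s,]+_sys$")  # last field may not contain whitespace

sample = "abc,def,two words_sys"

loose_match = loose.search(sample)
strict_match = strict.search(sample)

print(loose_match.group(0))  # ,two words_sys
print(strict_match)          # None
```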

CodePudding user response:

It sounds like you're looking for a string that ends with "_sys", sits at the end of the source string, and is preceded by a comma.

,\s*(\w+_sys)$

I added the \s* to allow for optional whitespace after the comma.

No non-greedy modifiers necessary.
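
As for why the lazy quantifier in the question did not help: a regex engine reports the leftmost match first, and laziness only controls how far the match extends from that starting point. The first comma is the earliest position from which the pattern can succeed, and \S also matches commas, so the match starts there. A minimal Python sketch (Python is an assumption, and the question's pattern is taken to be `(,\S+?)_sys$`):

```python
import re

text = "abc,def,ghi,jkl_sys"

# Lazy \S+? still starts at the first comma, because that start position can succeed.
lazy = re.search(r"(,\S+?)_sys$", text)
print(lazy.group(0))  # ,def,ghi,jkl_sys

# Excluding commas from the field forces the match to start at the last comma.
fixed = re.search(r",[^,]+_sys$", text)
print(fixed.group(0))  # ,jkl_sys
```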

The parens are around \w+_sys so you can capture just that string, without the comma and optional whitespace.
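
A short Python sketch of the capture group in use (Python and the sample string with a space after the last comma are assumptions for illustration):

```python
import re

pattern = re.compile(r",\s*(\w+_sys)$")

m = pattern.search("abc,def,ghi, jkl_sys")

print(repr(m.group(0)))  # ', jkl_sys'  (whole match, with comma and space)
print(repr(m.group(1)))  # 'jkl_sys'    (captured field only)
```

Note that \w only covers letters, digits and underscore, so a last field containing other characters (a hyphen, for example) would not match this pattern.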
