Regex matching a group that may or may not exist on string

Time:01-28

I have a text that contains, somewhere in the middle, the following strings:

  1. Sat28 B158 RGX 1100 1200
  2. Sat28 Hoover 0005 RGX B158 RGX 1100 1200

I want to capture only the following groups:

  1. Sat28 B158 RGX 1100 1200
  2. Sat28 B158 RGX 1100 1200

To find the first option I've come up with the following regex:

(\w{3}\d{2})\s([B]\d{3})\s(\w{3})\s(\d{4})\s(\d{4})

Example: https://regexr.com/772aq (Only the first option is included within the text)

However, when trying to find the second option as well, I've added a non-capturing group, which in turn skips the first option:

(\w{3}\d{2})\s(?:hoover\s\d{4}\sRGX\s)(\[B]\d{3})\s(\w{3})\s(\d{4})\s(\d{4})

Example: https://regexr.com/772at (Both options are included within the text, but only match the second option)

Basically, I added a non-capturing group (?:) that looks for (and should ignore) the word hoover, a blank space, 4 digits, a blank space, the word RGX, and a blank space.

Any help is appreciated. Thanks!

Edit: text added to the regexr.com examples for clarification.

User response:

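As far as I can see, you just need to make the non-capturing group optional by adding a ? after it: (\w{3}\d{2})\s(?:hoover\s\d{4}\sRGX\s)?([B]\d{3})\s(\w{3})\s(\d{4})\s(\d{4}). With the ? the group matches zero or one time, so both options are found. Also, I'm not sure why you escaped the [ just before the B, but maybe it was a typo.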
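
To illustrate the difference, here is a minimal sketch in Python's re module (the question itself uses regexr.com, so the language is an assumption). It runs the pattern with the required group and with the optional group against the two sample lines. The re.IGNORECASE flag is also an assumption: the pattern spells "hoover" in lowercase while the text has "Hoover", so a case-insensitive flag was presumably enabled in the regexr examples.

```python
import re

# Sample text from the question.
text = """
1. Sat28 B158 RGX 1100 1200
2. Sat28 Hoover 0005 RGX B158 RGX 1100 1200
"""

# Required non-capturing group: "hoover NNNN RGX " must be present.
# re.IGNORECASE is an assumption (pattern has "hoover", text has "Hoover").
required = re.compile(
    r"(\w{3}\d{2})\s(?:hoover\s\d{4}\sRGX\s)([B]\d{3})\s(\w{3})\s(\d{4})\s(\d{4})",
    re.IGNORECASE,
)

# Optional non-capturing group: the trailing ? lets it match zero or one time.
optional = re.compile(
    r"(\w{3}\d{2})\s(?:hoover\s\d{4}\sRGX\s)?([B]\d{3})\s(\w{3})\s(\d{4})\s(\d{4})",
    re.IGNORECASE,
)

required_matches = [m.groups() for m in required.finditer(text)]
optional_matches = [m.groups() for m in optional.finditer(text)]

print(len(required_matches))  # only the line containing "Hoover 0005 RGX"
for groups in optional_matches:
    print(" ".join(groups))   # the five captured groups of each match
```

Because the group is non-capturing, the "Hoover 0005 RGX" part is consumed but never shows up in the captured groups, so both lines yield the same five groups.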
