Pattern to match everything except a string of 5 digits

Time:07-06

I only have access to a function that can match a pattern and replace it with some text:

Syntax
regexReplace('text', 'pattern', 'new text')

And I need to return only the 5-digit string from text in the following format:

CRITICAL - 192.111.6.4: rta nan, lost 100%
Created Time    Tue, 5 Jul 8:45
Integration Name    CheckMK Integration
Node    192.111.6.4
Metric Name POS1
Metric Value    DOWN
Resource    54871
Alert Tags  54871, POS1

So from this text, I want to replace everything with "" except the "54871".

I have come up with the following:

regexReplace("{{ticket.description}}", "\w*[^\d\W]\w*", "")

Which almost works, but it doesn't match the symbols. How can I change this to match, essentially, any word that includes a letter or symbol?

The pattern I have is very close. I just need it to match words containing special characters as well as letters, whereas currently it only matches words containing letters.

CodePudding user response:

You can match the whole string but capture the 5-digit number into a capturing group and replace with the backreference to the captured group:

regexReplace("{{ticket.description}}", "^(?:[\w\W]*\s)?(\d{5})(?:\s[\w\W]*)?$", "$1")

Details:

  • ^ - start of string
  • (?:[\w\W]*\s)? - an optional substring of any zero or more chars as many as possible and then a whitespace char
  • (\d{5}) - Group 1 ($1 contains the text captured by this group pattern): five digits
  • (?:\s[\w\W]*)? - an optional substring of a whitespace char and then any zero or more chars as many as possible.
  • $ - end of string.
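
As a rough way to try this outside the ticketing tool, here is a minimal Python sketch. It assumes Python's `re.sub` behaves like the question's `regexReplace`, which is not confirmed; `extract_code` is a hypothetical helper name, and Python writes the backreference as `\1` rather than `$1`.

```python
import re

# Sample ticket text from the question.
SAMPLE = (
    "CRITICAL - 192.111.6.4: rta nan, lost 100%\n"
    "Created Time    Tue, 5 Jul 8:45\n"
    "Integration Name    CheckMK Integration\n"
    "Node    192.111.6.4\n"
    "Metric Name POS1\n"
    "Metric Value    DOWN\n"
    "Resource    54871\n"
    "Alert Tags  54871, POS1\n"
)

# Optional prefix ending in whitespace, five digits (group 1),
# optional suffix starting with whitespace, anchored to the whole string.
PATTERN = r"^(?:[\w\W]*\s)?(\d{5})(?:\s[\w\W]*)?$"


def extract_code(text: str) -> str:
    """Replace the whole match with group 1; text is returned as-is if nothing matches."""
    return re.sub(PATTERN, r"\1", text)


result = extract_code(SAMPLE)
print(result)  # 54871
```

Because the pattern requires whitespace on both sides, the "54871," on the last line (followed by a comma) is skipped and the one on the "Resource" line is captured.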

CodePudding user response:

The easiest regex is probably:

^(.*\D)?(\d{5})(\D.*)?$

You can then replace the string with "$2" ("\2" in other languages) to only place the contents of the second capture group (\d{5}) back.

The only issue is that . doesn't match newline characters by default. Normally you can pass a flag to change . to match ALL characters. For most regex variants this is the s (single line) flag (PCRE, Java, C#, Python). Other variants use the m (multi line) flag (Ruby). Check the documentation of the regex variant you are using for verification.
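
To illustrate the newline issue, here is a small sketch in Python, where the "s" flag is spelled `re.DOTALL`. The input strings are made up for illustration, and Python's replacement syntax is `\2` rather than `$2`.

```python
import re

PATTERN = r"^(.*\D)?(\d{5})(\D.*)?$"

single_line = "Resource    54871"
multi_line = "Node    192.111.6.4\nResource    54871\nAlert Tags  POS1"

# On a single line the pattern matches the whole string, so only group 2 remains.
one = re.sub(PATTERN, r"\2", single_line)

# Without a flag, "." stops at newlines, the whole-string match fails,
# and the text comes back unchanged.
unchanged = re.sub(PATTERN, r"\2", multi_line)

# With re.DOTALL, "." also matches newlines, so the match succeeds.
fixed = re.sub(PATTERN, r"\2", multi_line, flags=re.DOTALL)

print(one)                      # 54871
print(unchanged == multi_line)  # True
print(fixed)                    # 54871
```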

However, the question suggests that you're not able to pass flags separately, in which case you could pass them as part of the regex itself.

(?s)^(.*\D)?(\d{5})(\D.*)?$

  • (?s) - Set the s (single line) flag for the remainder of the pattern, which enables . to match newline characters ((?m) for Ruby).
  • ^ - Match the start of the string (\A for Ruby).
  • (.*\D)? - [optional] Match anything followed by a non-digit and store it in capture group 1.
  • (\d{5}) - Match 5 digits and store it in capture group 2.
  • (\D.*)? - [optional] Match a non-digit followed by anything and store it in capture group 3.
  • $ - Match the end of the string (\z for Ruby).
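
To see what each capture group ends up holding, the breakdown above can be checked with a short Python sketch. Python is used here only as a stand-in for whatever engine sits behind `regexReplace`, and it accepts the inline `(?s)` at the start of a pattern.

```python
import re

# Sample ticket text from the question (no trailing newline).
SAMPLE = (
    "CRITICAL - 192.111.6.4: rta nan, lost 100%\n"
    "Created Time    Tue, 5 Jul 8:45\n"
    "Integration Name    CheckMK Integration\n"
    "Node    192.111.6.4\n"
    "Metric Name POS1\n"
    "Metric Value    DOWN\n"
    "Resource    54871\n"
    "Alert Tags  54871, POS1"
)

# The inline (?s) flag makes "." match newlines for the whole pattern.
PATTERN = r"(?s)^(.*\D)?(\d{5})(\D.*)?$"

match = re.match(PATTERN, SAMPLE)
prefix, code, suffix = match.groups()

print(repr(code))    # '54871'
print(repr(suffix))  # ', POS1'

# Replacing the whole match with group 2 leaves only the digits.
replaced = re.sub(PATTERN, r"\2", SAMPLE)
print(replaced)      # 54871
```

The three groups together cover the entire input, which is why replacing the match with group 2 discards everything else.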

This regex stores the last 5-digit number in capture group 2. If you want the first 5-digit number instead, use a lazy quantifier in (.*\D)?, so that it becomes (.*?\D)?.
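
The greedy/lazy difference can be checked with a quick Python sketch. The test strings are invented for illustration, and the `LAZY_GROUP` variant is my own addition rather than part of the answer above.

```python
import re

TEXT = "first 11111\nsecond 22222\nthird 33333"

GREEDY = r"(?s)^(.*\D)?(\d{5})(\D.*)?$"
LAZY = r"(?s)^(.*?\D)?(\d{5})(\D.*)?$"

last = re.sub(GREEDY, r"\2", TEXT)
first = re.sub(LAZY, r"\2", TEXT)
print(last)   # 33333
print(first)  # 11111

# Caveat: the optional group itself is still greedy, so the engine tries to
# use it before skipping it. If the text starts with the digits and another
# 5-digit number follows, the lazy variant returns the later number.
LEADING = "11111 then 22222"
leading_lazy = re.sub(LAZY, r"\2", LEADING)
print(leading_lazy)   # 22222

# Making the group's own "?" lazy ("??") tries skipping the prefix first.
LAZY_GROUP = r"(?s)^(.*?\D)??(\d{5})(\D.*)?$"
leading_fixed = re.sub(LAZY_GROUP, r"\2", LEADING)
print(leading_fixed)  # 11111
```

For the ticket text in the question the caveat does not apply, since the text starts with "CRITICAL" rather than with digits.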

(?s) is supported by most regex variants, but not all. Refer to the regex variant documentation to see if it's available for you.

An example where the inline (?s) flag is not available is JavaScript. In such a scenario you need to replace . with something that matches ALL characters. In JavaScript [^] can be used. For other variants this might not work, and you need to use [\s\S] instead.
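
As a sketch of the flag-free variant, here is the same pattern with every . swapped for [\s\S]. Python is used only for illustration, and the input string is made up.

```python
import re

# No flags at all: [\s\S] matches any character, including newlines.
PATTERN = r"^([\s\S]*\D)?(\d{5})(\D[\s\S]*)?$"

multi_line = "Node    192.111.6.4\nResource    54871\nAlert Tags  POS1"

result = re.sub(PATTERN, r"\2", multi_line)
print(result)  # 54871

# For comparison, the "." version needs the DOTALL flag to get the same result.
with_flag = re.sub(r"^(.*\D)?(\d{5})(\D.*)?$", r"\2", multi_line, flags=re.DOTALL)
print(with_flag == result)  # True
```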

With all this out of the way, and assuming a language that accepts "$2" as the replacement, does not require escaping backslashes, and uses a regex variant that supports the inline (?s) flag, the answer would be:

regexReplace("{{ticket.description}}", "(?s)^(.*\D)?(\d{5})(\D.*)?$", "$2")