I'm working with regex in a PCRE2 environment.
In my switch logs I have to capture a text string, which I name "message",
that is located in a specific position. The key point is that it is always preceded by a set of characters ending with :
but, after them, there may or may not be some additional characters ending with ;
and I must be able to skip them.
Let me explain with my current regex and some log samples.
We can say that I have 3 cases:
1. (s)[18014]:Recorded command information.
2. (l):User logged out.
3. (s)[18014]:CID=0x11aa2222;The user succeeded in logging out of XXX.
My current regex is:
\(\w+\)\[*\d*\]*\:(?<message>[^\[]+?\.)
that works for cases 1 and 2 because:
- it matches the fact that we always have a (, a word character and a ) with
\(\w+\)
- it optionally matches a [, a number and a ] (they are absent in case 2) with
\[*\d*\]*
- in every case the following character is
:
and I match it with \:
- the message is captured, and named, with
(?<message>[^\[]+?\.)
which must avoid capturing if, after the :, I have a [. The capture stops when I reach a .
My problem is: after the :
I can have case 3; it always begins with CID=<hexadecimal expression>;
but it is not limited to this. After it, I can have other expressions, always ended by ;
So we can say that, for case 3, I can have CID=<hex expression><other numeric and literal characters>;
With the current regex, of course, the CID
part is included in the message. I must avoid that; if the CID
part is present, the message capture must start after the ;
that ends it.
So, we can summarize that:
IF after the :
there is no CID word, start capturing; ELSE, skip everything up to the ;
and start capturing after it.
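To make the problem concrete, here is a minimal sketch that runs the current regex against the three samples. It is ported to Python's re module purely for illustration (the question targets PCRE2), so the named group is spelled (?P<message>...) instead of (?<message>...). It also assumes the quantifiers in the pattern are \w+ and [^\[]+?. The names CURRENT, SAMPLES and current_message are hypothetical.

```python
import re

# The question's pattern, ported to Python's re module (an assumption:
# the original targets PCRE2). Python spells named groups (?P<name>...).
CURRENT = re.compile(r"\(\w+\)\[*\d*\]*:(?P<message>[^\[]+?\.)")

# The three log samples from the question.
SAMPLES = [
    "(s)[18014]:Recorded command information.",
    "(l):User logged out.",
    "(s)[18014]:CID=0x11aa2222;The user succeeded in logging out of XXX.",
]


def current_message(line):
    """Return the 'message' group for one log line, or None if no match."""
    m = CURRENT.search(line)
    return m.group("message") if m else None


if __name__ == "__main__":
    for line in SAMPLES:
        # For the third sample the CID=...; prefix is still part of the message.
        print(current_message(line))
```

For the third sample the captured message still starts with CID=0x11aa2222; which is exactly the part that has to be skipped.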
CodePudding user response:
The following pattern will match the right part of your test strings.
We look for either a :
not followed by CID, using the negative lookahead (?!CID),
or a ;
and then capture what follows.
((:(?!CID))|;)(.*)
see https://regex101.com/r/JRB4Rq/1
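As a quick check outside regex101, the pattern can be tried on the three samples from the question. This is a sketch in Python's re module rather than PCRE2 (this particular pattern uses only syntax both engines share), and the names SEPARATOR and message_after_separator are hypothetical. The message ends up in group 3, because the two outer pairs of parentheses are capturing groups as well.

```python
import re

# Either a ':' that is not followed by 'CID', or a ';', then the rest of the line.
SEPARATOR = re.compile(r"((:(?!CID))|;)(.*)")


def message_after_separator(line):
    """Return what follows the first qualifying ':' or ';', or None if no match."""
    m = SEPARATOR.search(line)
    # Groups 1 and 2 hold the separator itself; group 3 is the message text.
    return m.group(3) if m else None


if __name__ == "__main__":
    for line in (
        "(s)[18014]:Recorded command information.",
        "(l):User logged out.",
        "(s)[18014]:CID=0x11aa2222;The user succeeded in logging out of XXX.",
    ):
        print(message_after_separator(line))
```

Note that the pattern does not anchor on the leading (s)[18014] part, so it relies on the first unguarded : or ; in the line being the one that precedes the message.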
CodePudding user response:
You could write the pattern as:
\(\w+\)(?:\[\d+\])?:(?:CID=[^;]*;)?(?<message>[^.]+\.)
Explanation
\(\w+\)
Match 1+ word chars between parentheses
(?:\[\d+\])?
Optionally match 1+ digits between square brackets
:
Match the colon (you don't have to escape it)
(?:CID=[^;]*;)?
Optionally match the CID= part till the first semicolon
(?<message>[^.]+\.)
Group message, match 1+ chars other than . and then match the .
See a regex demo.
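The same pattern can be exercised on the question's samples with a short script. This is only a sketch: it uses Python's re module instead of PCRE2, so the named group is written (?P<message>...), and it assumes the quantifiers are \w+, \d+ and [^.]+. The names LOG_LINE and extract_message are hypothetical.

```python
import re

# Prefix "(x)", optional "[digits]", a colon, an optional "CID=...;" part,
# then the message: everything up to and including the first dot.
LOG_LINE = re.compile(
    r"\(\w+\)(?:\[\d+\])?:(?:CID=[^;]*;)?(?P<message>[^.]+\.)"
)


def extract_message(line):
    """Return the named group 'message' for one log line, or None if no match."""
    m = LOG_LINE.search(line)
    return m.group("message") if m else None


if __name__ == "__main__":
    for line in (
        "(s)[18014]:Recorded command information.",
        "(l):User logged out.",
        "(s)[18014]:CID=0x11aa2222;The user succeeded in logging out of XXX.",
    ):
        print(extract_message(line))
```

Because the CID= part is consumed by an optional non-capturing group, the named group only ever contains the message text, with or without the prefix.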