Regex Capture group skip escaped strings

Time:10-28

I am currently using Splunk to go through some of my logs and I ran into a regex issue. Right now I have logs in a few different formats. When I wrote the regex, I was dumping the output of JsonConvert.Serialize() to my log, which wrote the JSON version of my objects, and that worked well. Now, however, I am also dumping plain text, and I can't get the regex to match my capture groups.

\{\"line\":\"(?<time>.+)\|(?<log_level>.+)\|(?<Controller>.+)\|(?<Message>.+)\}\"

My current regex is above, and it matches the first two log lines below. Note how it ends with a curly brace.

{"line":"18:48:17.990|INFO|PController|Plex event is media.pause}","source":"stdout","tag":"4e263fa2001d"}

{"line":"22:38:47.839|INFO|PController|{\"Id\":\"SMf1bc2466b1\",\"ErrorMessages\":null}","source":"stdout","tag":"b5fcd8b8b5a4"}

{"line":"22:38:47.839|INFO|PController|This is another test","source":"stdout","tag":"b5fcd8b8b5a4"}

{"line":"18:56:37.212|INFO|PController|media.stop","source":"stdout","tag":"4e263fa2001d"}

Basically, the regex should parse the JSON from the log and pull the fields out into the named capture groups. It works well for log #1 and log #2, but it fails on log #3 and log #4 because they don't have the ending curly bracket.

I also tried \{\"line\":\"(?<time>.+)\|(?<log_level>.+)\|(?<Controller>.+)\|(?<Message>.+)\",

but this matches through the end of "source":"stdout" rather than stopping at the end of my "line" value.
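To illustrate why that second attempt overshoots, here is a minimal Python sketch. It assumes the pattern used escaped pipes and .+ quantifiers, and it uses Python's (?P<name>...) spelling for named groups where Splunk would take (?<name>...). The helper name message_of is made up for the illustration.

```python
import re

# Shared prefix for the first three fields (Python named-group syntax).
PREFIX = r'\{"line":"(?P<time>.+)\|(?P<log_level>.+)\|(?P<Controller>.+)\|'

# Greedy Message: .+ backtracks only as far as the LAST '",' on the line.
GREEDY = re.compile(PREFIX + r'(?P<Message>.+)",')
# Lazy Message: .+? stops at the FIRST '",' it can find.
LAZY = re.compile(PREFIX + r'(?P<Message>.+?)",')

# Two of the log lines from the question.
PLAIN = r'{"line":"22:38:47.839|INFO|PController|This is another test","source":"stdout","tag":"b5fcd8b8b5a4"}'
NESTED = r'{"line":"22:38:47.839|INFO|PController|{\"Id\":\"SMf1bc2466b1\",\"ErrorMessages\":null}","source":"stdout","tag":"b5fcd8b8b5a4"}'


def message_of(pattern, line):
    """Return the Message group, or None if the pattern does not match."""
    m = pattern.match(line)
    return m.group("Message") if m else None


if __name__ == "__main__":
    for name, pattern in (("greedy", GREEDY), ("lazy", LAZY)):
        for line in (PLAIN, NESTED):
            print(name, "->", message_of(pattern, line))
```

The greedy version runs past the closing quote of the "line" value into the "source" field. Simply making it lazy fixes the plain-text line but cuts the nested JSON short at the first escaped quote followed by a comma, so neither form is enough on its own.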

Any help would be greatly appreciated. I want the "Message" capture group to accept any characters, including quotes, curly brackets, and basically any special character. I'm just trying to pull the Time, LogLevel, Controller, and Message from the full JSON string.

Thank you!

CodePudding user response:

If I understand correctly, your group "Message" can either contain a curly bracket with additional quotes or no curly bracket.
You then have to check for both possibilities in your regex:

\{\"line\":\"(?<time>.+)\|(?<log_level>.+)\|(?<Controller>.+)\|(?<Message>.+\}|.+?)\"

Note that in the case without a curly bracket we use a ? to make the quantifier lazy instead of greedy, so that the match stops at the next quote.
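As a quick check of the two-alternative idea, here is a minimal Python sketch run against the four log lines from the question. It assumes the quantifiers are .+ and .+?, and it rewrites the named groups as (?P<name>...) because Python's re module does not accept the (?<name>...) spelling that Splunk uses. The ESCAPE_AWARE_PATTERN, the extract helper, and the extra HYPOTHETICAL line are not from the answer; they are illustrative assumptions showing one possible way to handle escaped quotes in plain-text messages.

```python
import re

# The answer's pattern in Python named-group syntax: Message is either
# "anything ending in }" or, failing that, a lazy run up to the next quote.
ANSWER_PATTERN = re.compile(
    r'\{"line":"(?P<time>.+)\|(?P<log_level>.+)\|(?P<Controller>.+)\|'
    r'(?P<Message>.+\}|.+?)"'
)

# Hypothetical alternative (not from the answer): Message is a run of
# characters that are neither a quote nor a backslash, or a backslash
# followed by any character, so it ends at the first UNESCAPED quote.
ESCAPE_AWARE_PATTERN = re.compile(
    r'\{"line":"(?P<time>[^|]+)\|(?P<log_level>[^|]+)\|(?P<Controller>[^|]+)\|'
    r'(?P<Message>(?:[^"\\]|\\.)*)"'
)

# The four log lines from the question.
LOG_LINES = [
    r'{"line":"18:48:17.990|INFO|PController|Plex event is media.pause}","source":"stdout","tag":"4e263fa2001d"}',
    r'{"line":"22:38:47.839|INFO|PController|{\"Id\":\"SMf1bc2466b1\",\"ErrorMessages\":null}","source":"stdout","tag":"b5fcd8b8b5a4"}',
    r'{"line":"22:38:47.839|INFO|PController|This is another test","source":"stdout","tag":"b5fcd8b8b5a4"}',
    r'{"line":"18:56:37.212|INFO|PController|media.stop","source":"stdout","tag":"4e263fa2001d"}',
]

# Made-up line (not in the question): escaped quotes, no closing brace.
HYPOTHETICAL = r'{"line":"10:00:00.000|INFO|PController|He said \"hi\" twice","source":"stdout","tag":"example"}'


def extract(pattern, line):
    """Return the named groups as a dict, or None if there is no match."""
    m = pattern.match(line)
    return m.groupdict() if m else None


if __name__ == "__main__":
    for line in LOG_LINES + [HYPOTHETICAL]:
        print("answer:      ", extract(ANSWER_PATTERN, line))
        print("escape-aware:", extract(ESCAPE_AWARE_PATTERN, line))
```

On the four lines from the question both patterns extract the same fields. They differ only on the made-up line, where the lazy branch stops at the first escaped quote while the escape-aware variant continues to the closing quote of the "line" value.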