I am trying to parse log files in which some entries are single-line and others span multiple lines. The regex I have works fine for single-line entries but not for multiline ones.
^(?<timestamp>\d+-\d+-\d+T\d+:\d+:\d+\.\d+(\+|-)\d+:\d+)\s+\[(?<severity>\w+)\](?<message>.*)$
The match fails on entries like the one below, because the pattern does not capture the text after the newline.
2022-06-27T15:22:35.508+00:00 [Info] New settings received:
{"indexer.settings.compaction.days_of_week":"Sunday,Monday"}
The text after the newline should be included in the "message" group.
I have tried several approaches to get the newline matched, but have not found a solution yet. Both log formats are pasted at this link: https://regex101.com/r/ftJ3UZ/1.
Any help is very much appreciated!
CodePudding user response:
If your regex flavor supports lookaheads, you can add an optionally repeated group inside the message
group that matches a newline and the following line only when that line does not start with a date-like pattern (or, to be stricter, the full timestamp).
^(?<timestamp>\d+-\d+-\d+T\d+:\d+:\d+\.\d+([+-])\d+:\d+)\s+\[(?<severity>\w+)\](?<message>.*(?:\n(?!\d+-\d+-\d+T).*)*)$
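For anyone who wants to try the lookahead approach outside regex101, here is a minimal Python sketch. It assumes the quantifiers in the pattern are `+` and the offset sign is `[+-]`, and it spells the named groups as `(?P<name>...)`, which is the form Python's `re` module accepts. The second entry in `sample` is made up for illustration and is not from the question.

```python
import re

# Assumption: named groups rewritten from (?<name>...) to Python's (?P<name>...).
# re.MULTILINE makes ^ and $ anchor at line boundaries instead of only at the
# start and end of the whole input.
LOG_ENTRY = re.compile(
    r"^(?P<timestamp>\d+-\d+-\d+T\d+:\d+:\d+\.\d+([+-])\d+:\d+)\s+"
    r"\[(?P<severity>\w+)\]"
    # Keep consuming "\n + next line" while the next line does NOT start
    # with a date-like prefix such as 2022-06-27T.
    r"(?P<message>.*(?:\n(?!\d+-\d+-\d+T).*)*)$",
    re.MULTILINE,
)

# First entry is the one from the question; the second is a hypothetical
# single-line entry added to show where the multiline message stops.
sample = (
    "2022-06-27T15:22:35.508+00:00 [Info] New settings received:\n"
    '{"indexer.settings.compaction.days_of_week":"Sunday,Monday"}\n'
    "2022-06-27T15:22:36.001+00:00 [Warn] Single line entry"
)

entries = [m.groupdict() for m in LOG_ENTRY.finditer(sample)]

for entry in entries:
    print(entry["timestamp"], entry["severity"], repr(entry["message"]))
```

Two things to be aware of: the message keeps the space that follows `]`, and if the input ends with a newline the last message will include it, so you may want to call `.strip()` on the captured text.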
CodePudding user response:
It seems this would match:
^(?<timestamp>\d+-\d+-\d+T\d+:\d+:\d+\.\d+(\+|-)\d+:\d+)\s+\[(?<severity>\w+)\](?<message>.*)\n(?:{.*})?
I've removed $ and added \n(?:{.*})? to the end so that the optional part inside {} braces can be matched.
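One thing worth checking with this second pattern is which group the {} line ends up in. The Python sketch below is only an illustration under a few assumptions: the quantifiers are `+`, the named groups are spelled `(?P<name>...)` for Python's `re` module, and the braces are escaped to keep them unambiguous. The `inside_group` variant is hypothetical and not part of the answer; it just moves the optional part into the message group.

```python
import re

# Shared prefix: timestamp and severity (Python named-group syntax).
HEAD = (
    r"^(?P<timestamp>\d+-\d+-\d+T\d+:\d+:\d+\.\d+(\+|-)\d+:\d+)\s+"
    r"\[(?P<severity>\w+)\]"
)

# As posted in the answer: the optional {...} line follows the message group.
as_posted = re.compile(HEAD + r"(?P<message>.*)\n(?:\{.*\})?", re.MULTILINE)

# Hypothetical variant: the optional {...} line moved inside the message group.
inside_group = re.compile(HEAD + r"(?P<message>.*(?:\n\{.*\})?)", re.MULTILINE)

sample = (
    "2022-06-27T15:22:35.508+00:00 [Info] New settings received:\n"
    '{"indexer.settings.compaction.days_of_week":"Sunday,Monday"}\n'
)

m_posted = as_posted.search(sample)
m_inside = inside_group.search(sample)

print(repr(m_posted.group(0)))          # whole match spans both lines
print(repr(m_posted.group("message")))  # only the first line's text
print(repr(m_inside.group("message")))  # first line plus the {...} line
```

With the pattern as posted, the overall match covers both lines, but the "message" group holds only the first one. The pattern also requires a newline after every entry, so a last line without a trailing newline would not match.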