I have been using regex101 to troubleshoot this regex:
This does stop at 2 digits, but drops the next line where the \d{2} starts:
/(?P<time>^(\d{2}(:|.))+)\s?(?P<Token>\[T\:\d+\])?\s?(?P<event_type>\{\w+\:\d+\})?\s(?P<message>(.*)(?=\d{2}))/gm
This gets to the end of the line but never to the next lines without starting with the time:
(?P<time>^(\d{2}(:|.))+)\s?(?P<Token>\[T\:\d+\])?\s?(?P<event_type>\{\w+\:\d+\})?\s(?P<message>((.*\n))(?=\d{2}\:))
This gets me to the end of the line but won't pick up the following lines that do not start with the time:
/(?P<time>^(\d{2}(:|.))+)\s?(?P<Token>\[T\:\d+\])?\s?(?P<event_type>\{\w+\:\d+\})?\s(?P<message>.*)/gm
In the sample below, I put in bold the part I expect to be captured in the group 'message':
15:36:32.448 [T:1401135292433] {ScxmlMetric:3} **METRIC <log sid='~28~01TF8DKFD49SREES00000J' expr='~28~01TF8DKFD49SRE6Q9PE0C2LAES00000J: Inside Interaction Block: ScreenForPriorities' label='' level='2' />**
15:36:32.448 [T:1401135292433] {ScxmlMetric:1} **METRIC <extension sid='~283~01TF8DKFD49SRE6Q9PE0C2LAES0000' name='screen' namespace='http://www.slab.com/modules/classification' />**
15:36:32.448 ==>Connector::EventHandler Port=0 Proto=0 CallBack=<97446>
===>event: event_id=3, id=0 handle=66, datasize=24
15:36:32.448 {TSync:3} HandleThreadData: << 24 bytes <<
15:36:32.448 {ILink:3} **Message 'external_service_request' sent to 'I2P'
attr_ref_id [int] = 8165875
attr_envelope [list, size (unpacked)=369] =
'Version' [str] = "1.0"
'AppType' [int] = 90**
15:36:32.460 {SManager:1} **[IX]: >> GET >> (FMID=000ADaHVQEJC00 / SESSID=~28~01TF8DKFD49SRE6Q9PE0C2LAES0)**
message1
METRIC <log sid='~28~01TF8DKFD49SREES00000J' expr='~28~01TF8DKFD49SRE6Q9PE0C2LAES00000J: Inside Interaction Block: ScreenForPriorities' label='' level='2' />
message2
METRIC <extension sid='~283~01TF8DKFD49SRE6Q9PE0C2LAES0000' name='screen' namespace='http://www.slab.com/modules/classification' />
message3
==>Connector::EventHandler Port=0 Proto=0 CallBack=<97446>
===>event: event_id=3, id=0 handle=66, datasize=24
message4
Message 'external_service_request' sent to 'I2P'
attr_ref_id [int] = 8165875
attr_envelope [list, size (unpacked)=369] =
'Version' [str] = "1.0"
'AppType' [int] = 90
message5
[IX]: >> GET >> (FMID=000ADaHVQEJC00 / SESSID=~28~01TF8DKFD49SRE6Q9PE0C2LAES0)
CodePudding user response:
You could try this pattern:
/(?P<time>\d{2}:\d{2}:\d{2}.\d{3})\s?(?P<token>\[T\:\d+\])?\s?(?P<event_type>\{\w+\:\d+\})?\s?(?P<message>(.|\n)*?)(?=\d{2}:|$)/g
Demo: https://regex101.com/r/zGyuMy/1
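To see how the lazy match behaves outside regex101, here is a small Python sketch. It assumes the `+` quantifiers after `\d` and `\w`, and the sample text and the names `HEAD`, `as_posted`, `anchored` and `messages` are made up for illustration. One thing to watch: the lookahead `(?=\d{2}:|$)` is not tied to the start of a line, so a message that contains something like `10:05` is cut short. The second pattern is a hedged variant, not part of the pattern above, that anchors the lookahead to a line start.

```python
import re

# Shared head of the pattern (the `+` quantifiers are an assumption).
HEAD = (
    r"(?P<time>\d{2}:\d{2}:\d{2}.\d{3})\s?"
    r"(?P<token>\[T:\d+\])?\s?"
    r"(?P<event_type>\{\w+:\d+\})?\s?"
    r"(?P<message>(.|\n)*?)"
)

# Lookahead as posted: stops before any "two digits and a colon".
as_posted = re.compile(HEAD + r"(?=\d{2}:|$)")

# Variant (an assumption): only stop where a new line starts with "dd:",
# or at the very end of the text.
anchored = re.compile(HEAD + r"(?=^\d{2}:|\Z)", re.MULTILINE)

# Hypothetical two-entry log; the first message contains "10:05".
sample = "\n".join([
    "15:36:32.448 {A:1} retry at 10:05 failed",
    "more",
    "15:36:32.460 {B:2} done",
])

def messages(pattern, text):
    """Return the stripped 'message' group of every match."""
    return [m.group("message").strip() for m in pattern.finditer(text)]

posted_result = messages(as_posted, sample)
anchored_result = messages(anchored, sample)
print(posted_result)
print(anchored_result)
```

If your messages never contain two digits followed by a colon, both patterns should give the same result.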
CodePudding user response:
In the example data there seem to be 6 messages that would match the pattern, including:
15:36:32.448 {TSync:3} HandleThreadData: << 24 bytes <<
You might use:
^(?P<time>\d{1,2}:\d{1,2}:\d{1,2}\.\d{3})(?:\s+(?P<token>\[T:\d+]))?(?:\s+(?P<event_type>\{\w+:\d+}))?\s+(?P<message>.*(?:\n(?!\d{1,2}:).*)*)
Explanation
- `^` Start of string
- `(?P<time>\d{1,2}:\d{1,2}:\d{1,2}\.\d{3})` Named group `time`, match a time-like pattern
- `(?:\s+(?P<token>\[T:\d+]))?` Optional non-capture group with named group `token`, match 1+ whitespace chars, `[T:`, 1+ digits and `]`
- `(?:\s+(?P<event_type>\{\w+:\d+}))?` Optional non-capture group with named group `event_type`, match 1+ whitespace chars, `{`, 1+ word chars, `:`, 1+ digits and the `}`
- `\s+` Match 1+ whitespace chars
- `(?P<message>` Named group `message`
- `.*` Match the whole line
- `(?:` Non-capture group to repeat as a whole part
- `\n` Match a newline
- `(?!\d{1,2}:)` Negative lookahead, assert that the line does not start with 1-2 digits and `:`
- `.*` Match the whole line
- `)*` Close the non-capture group and optionally repeat to match all lines
- `)` Close group `message`
See a regex101 demo.
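To try the pattern outside regex101, here is a minimal Python sketch. It assumes the `+` quantifiers (one or more whitespace, digit or word characters) and the `re.MULTILINE` flag, so that `^` matches at the start of every line. The log is an abridged copy of the sample data, and the names `ENTRY`, `log` and `entries` are only illustrative.

```python
import re

# One log entry: time, optional token, optional event type, then the message.
# The message keeps absorbing following lines until one starts with "d:" or "dd:".
ENTRY = re.compile(
    r"^(?P<time>\d{1,2}:\d{1,2}:\d{1,2}\.\d{3})"
    r"(?:\s+(?P<token>\[T:\d+]))?"
    r"(?:\s+(?P<event_type>\{\w+:\d+}))?"
    r"\s+(?P<message>.*(?:\n(?!\d{1,2}:).*)*)",
    re.MULTILINE,
)

# Abridged sample log (shortened from the question for readability).
log = "\n".join([
    "15:36:32.448 [T:1401135292433] {ScxmlMetric:3} METRIC <log level='2' />",
    "15:36:32.448 ==>Connector::EventHandler Port=0",
    "===>event: event_id=3, id=0",
    "15:36:32.448 {ILink:3} Message 'external_service_request' sent to 'I2P'",
    "attr_ref_id [int] = 8165875",
    "15:36:32.460 {SManager:1} [IX]: >> GET >>",
])

# Groups that did not take part in a match come back as None.
entries = [m.groupdict() for m in ENTRY.finditer(log)]

for entry in entries:
    print(entry["time"], entry["token"], entry["event_type"])
    print(entry["message"])
    print("-" * 20)
```

Continuation lines end up inside `message` with their newlines intact, so you can split or join them afterwards as needed.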
CodePudding user response:
I took another look. This works by using a lookahead for the first group, time.
I also added a few optional groups, (\[T\:)?, (\]\s{)? and (\}\s)?, to consume the surrounding brackets and braces and leave just the value in each named group:
(?P<time>^(\d{2}\:\d{2}\:\d{2}.\d{3}))\s(\[T\:)?(?P<token>\d{15})?(\]\s{)?(?P<event_type>.*\:\d)?(\}\s)?(?P<message>[\s\S]*?)(?=(?1))