I would like regex to pick all the lines into the <message> group until the next line that sta

Time:01-30

I have been using regex101 to troubleshoot this statement:

This does stop at 2 digits, but drops the next line where the \d{2} starts:

/(?P<time>^(\d{2}(:|.))+)\s?(?P<Token>\[T\:\d+\])?\s?(?P<event_type>\{\w+\:\d+\})?\s(?P<message>(.*)(?=\d{2}))/gm

This gets to the end of the line but never continues to the following lines that do not start with the time:

(?P<time>^(\d{2}(:|.))+)\s?(?P<Token>\[T\:\d+\])?\s?(?P<event_type>\{\w+\:\d+\})?\s(?P<message>((.*\n))(?=\d{2}\:))

This gets me to the end of the line but won't pick up the following lines that do not start with the time:

/(?P<time>^(\d{2}(:|.))+)\s?(?P<Token>\[T\:\d+\])?\s?(?P<event_type>\{\w+\:\d+\})?\s(?P<message>.*)/gm

I have marked in bold the parts I expect to end up in the group 'message':

15:36:32.448 [T:1401135292433] {ScxmlMetric:3} **METRIC <log sid='~28~01TF8DKFD49SREES00000J' expr='~28~01TF8DKFD49SRE6Q9PE0C2LAES00000J: Inside Interaction Block: ScreenForPriorities' label='' level='2' />**
15:36:32.448 [T:1401135292433] {ScxmlMetric:1} **METRIC <extension sid='~283~01TF8DKFD49SRE6Q9PE0C2LAES0000' name='screen' namespace='http://www.slab.com/modules/classification' />**
15:36:32.448 ==>Connector::EventHandler Port=0 Proto=0 CallBack=<97446>
===>event:   event_id=3, id=0 handle=66, datasize=24
15:36:32.448 {TSync:3} HandleThreadData: << 24 bytes <<
15:36:32.448 {ILink:3} **Message 'external_service_request' sent to 'I2P'
    attr_ref_id [int] = 8165875
    attr_envelope [list, size (unpacked)=369] = 
       'Version' [str] = "1.0"
       'AppType' [int] = 90**
15:36:32.460 {SManager:1} **[IX]: >> GET >> (FMID=000ADaHVQEJC00 / SESSID=~28~01TF8DKFD49SRE6Q9PE0C2LAES0)**

message1

METRIC <log sid='~28~01TF8DKFD49SREES00000J' expr='~28~01TF8DKFD49SRE6Q9PE0C2LAES00000J: Inside Interaction Block: ScreenForPriorities' label='' level='2' />

message2

METRIC <extension sid='~283~01TF8DKFD49SRE6Q9PE0C2LAES0000' name='screen' namespace='http://www.slab.com/modules/classification' />

message3

==>Connector::EventHandler  Port=0 Proto=0 CallBack=<97446>
===>event:   event_id=3, id=0 handle=66, datasize=24

message4

Message 'external_service_request' sent to 'I2P'
    attr_ref_id [int] = 8165875
    attr_envelope [list, size (unpacked)=369] = 
       'Version' [str] = "1.0"
       'AppType' [int] = 90

message5

[IX]: >> GET >> (FMID=000ADaHVQEJC00 / SESSID=~28~01TF8DKFD49SRE6Q9PE0C2LAES0)

CodePudding user response:

You could try this pattern:

/(?P<time>\d{2}:\d{2}:\d{2}.\d{3})\s?(?P<token>\[T\:\d+\])?\s?(?P<event_type>\{\w+\:\d+\})?\s?(?P<message>(.|\n)*?)(?=\d{2}:|$)/g

Demo: https://regex101.com/r/zGyuMy/1

CodePudding user response:

In the example data there seem to be 6 messages that would match the pattern, including:

15:36:32.448 {TSync:3} HandleThreadData: << 24 bytes <<

You might use:

^(?P<time>\d{1,2}:\d{1,2}:\d{1,2}\.\d{3})(?:\s+(?P<token>\[T:\d+]))?(?:\s+(?P<event_type>\{\w+:\d+}))?\s+(?P<message>.*(?:\n(?!\d{1,2}:).*)*)

Explanation

  • ^ Start of string (or of each line when the multiline flag is set)
  • (?P<time>\d{1,2}:\d{1,2}:\d{1,2}\.\d{3}) Named group time, match a time-like pattern
  • (?:\s+(?P<token>\[T:\d+]))? Optional non-capture group with named group token, match 1+ whitespace chars, [T: followed by 1+ digits, and ]
  • (?:\s+(?P<event_type>\{\w+:\d+}))? Optional non-capture group with named group event_type, match 1+ whitespace chars, { followed by 1+ word chars, : and 1+ digits, and }
  • \s+ Match 1+ whitespace chars
  • (?P<message> Named group message
    • .* Match the rest of the line
    • (?: Non-capture group to repeat as a whole part
      • \n Match a newline
      • (?!\d{1,2}:) Negative lookahead, assert that the line does not start with 1-2 digits and :
      • .* Match the whole line
    • )* Close the non-capture group and optionally repeat it to match all following lines
  • ) Close group message

See a regex101 demo.
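
As a minimal sketch of how the pattern above could be applied outside regex101, the following uses Python's `re` module, which accepts the same `(?P<name>...)` syntax. The pattern is written with `+` quantifiers on `\s`, `\d` and `\w`; the names `LOG_ENTRY`, `parse_log` and `SAMPLE` are hypothetical, and the sample is an abbreviated copy of the log lines from the question.

```python
import re

# Pattern from the answer above; re.MULTILINE makes ^ match at every line start.
LOG_ENTRY = re.compile(
    r"^(?P<time>\d{1,2}:\d{1,2}:\d{1,2}\.\d{3})"
    r"(?:\s+(?P<token>\[T:\d+]))?"
    r"(?:\s+(?P<event_type>\{\w+:\d+}))?"
    r"\s+(?P<message>.*(?:\n(?!\d{1,2}:).*)*)",
    re.MULTILINE,
)


def parse_log(text):
    """Return one dict per log entry (hypothetical helper, not from the thread)."""
    entries = []
    for match in LOG_ENTRY.finditer(text):
        entry = match.groupdict()
        # A newline at the very end of the input is absorbed by the last
        # message, because nothing follows it for the lookahead to reject.
        entry["message"] = entry["message"].rstrip("\n")
        entries.append(entry)
    return entries


# Abbreviated sample taken from the question (bold markers removed).
SAMPLE = """\
15:36:32.448 [T:1401135292433] {ScxmlMetric:1} METRIC <extension sid='~283~01TF8DKFD49SRE6Q9PE0C2LAES0000' name='screen' namespace='http://www.slab.com/modules/classification' />
15:36:32.448 ==>Connector::EventHandler Port=0 Proto=0 CallBack=<97446>
===>event:   event_id=3, id=0 handle=66, datasize=24
15:36:32.448 {TSync:3} HandleThreadData: << 24 bytes <<
15:36:32.448 {ILink:3} Message 'external_service_request' sent to 'I2P'
    attr_ref_id [int] = 8165875
15:36:32.460 {SManager:1} [IX]: >> GET >> (FMID=000ADaHVQEJC00 / SESSID=~28~01TF8DKFD49SRE6Q9PE0C2LAES0)
"""

for entry in parse_log(SAMPLE):
    print(entry["time"], entry["token"], entry["event_type"])
    print(entry["message"])
    print("-" * 20)
```

Groups that are absent on a line (for example `token` on the `{TSync:3}` line) come back as `None`, and continuation lines stay inside `message` because the negative lookahead only stops at a line that begins with digits and a colon.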

CodePudding user response:

I took another look; this works by using a lookahead that reuses the first group, time. I also added a few of these, (\[T\:)? (\]\s{)? (\}\s)?, to leave just the value in each group:

(?P<time>^(\d{2}\:\d{2}\:\d{2}.\d{3}))\s(\[T\:)?(?P<token>\d{15})?(\]\s{)?(?P<event_type>.*\:\d)?(\}\s)?(?P<message>[\s\S]*?)(?=(?1))
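
The `(?1)` subroutine call is PCRE syntax and is not available in Python's built-in `re` module. A rough sketch of the same idea in Python, assuming the timestamp sub-pattern is kept in a string constant and reused in the lookahead, might look like this. It is a variation on the pattern above, not an exact translation: the brackets and braces sit outside the named groups so that only the values are captured, and an end-of-input alternative is added so the last entry also matches. `TIME`, `ENTRY`, `split_entries` and `SAMPLE` are hypothetical names.

```python
import re

# Timestamp sub-pattern, reused in the lookahead in place of PCRE's (?1).
TIME = r"\d{2}:\d{2}:\d{2}\.\d{3}"

ENTRY = re.compile(
    rf"^(?P<time>{TIME})"
    r"(?:\s+\[T:(?P<token>\d+)\])?"          # capture only the digits
    r"(?:\s+\{(?P<event_type>\w+:\d+)\})?"   # capture only name:level
    r"\s+(?P<message>[\s\S]*?)"              # lazy, may span several lines
    rf"(?=\n{TIME}|\s*\Z)",                  # stop before next timestamp or at end
    re.MULTILINE,
)


def split_entries(text):
    """Return one dict per log entry (hypothetical helper)."""
    return [match.groupdict() for match in ENTRY.finditer(text)]


# Abbreviated sample taken from the question (bold markers removed).
SAMPLE = """\
15:36:32.448 [T:1401135292433] {ScxmlMetric:1} METRIC <extension sid='~283~01TF8DKFD49SRE6Q9PE0C2LAES0000' name='screen' namespace='http://www.slab.com/modules/classification' />
15:36:32.448 ==>Connector::EventHandler Port=0 Proto=0 CallBack=<97446>
===>event:   event_id=3, id=0 handle=66, datasize=24
15:36:32.460 {SManager:1} [IX]: >> GET >> (FMID=000ADaHVQEJC00 / SESSID=~28~01TF8DKFD49SRE6Q9PE0C2LAES0)
"""

for entry in split_entries(SAMPLE):
    print(entry)
```

Because the message group is lazy, it relies entirely on the lookahead to decide where an entry ends, so a continuation line that itself begins with a full timestamp would be treated as a new entry.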
