I'm trying to figure out a regex pattern that splits the following string:
2022-09-22T03:55:59.433Z,,,,sm100,"sm100.w.gm.net=25 2.7.2 mailto:[email protected] [IId=200023, Hostname=mky.wgm.net] Queued info",,SMTP,HAREDIRECT
into the following values:
2022-09-22T03:55:59.433Z,
,
,
,
sm100,
"sm100.w.gm.net=25 2.7.2 mailto:[email protected] [IId=200023, Hostname=mky.wgm.net] Queued info",
,
SMTP,
HAREDIRECT
I do not want the regex to split the sixth value (the longest one) on its inner comma: even though there is a comma after IId=200023, the whole value should be treated as atomic because it is enclosed in double quotes.
I have tried a lot of patterns inside
It seems to have identified the commas correctly, but I can't find a way to change my regex pattern to find these groups.
CodePudding user response:
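// No regex needed: parse_csv() splits on commas but keeps a double-quoted field as a single value.
print txt = '2022-09-22T03:55:59.433Z,,,,sm100,"sm100.w.gm.net=25 2.7.2 mailto:[email protected] [IId=200023, Hostname=mky.wgm.net] Queued info",,SMTP,HAREDIRECT'
| project parse_csv(txt)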
| txt |
|---|
| ["2022-09-22T03:55:59.4330000Z","","","","sm100","sm100.w.gm.net=25 2.7.2 mailto:[email protected] [IId=200023, Hostname=mky.wgm.net] Queued info","","SMTP","HAREDIRECT"] |
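If a regex is still wanted outside Kusto, the usual trick is to split only on commas that are followed by an even number of double quotes. The following is a minimal sketch in Python rather than KQL, because it relies on a lookahead, which RE2-style regex engines do not support. It assumes the quotes in the line are balanced; the names `split_fields` and `split_fields_csv` are hypothetical and used only for illustration.

```python
import csv
import re

# The sample line from the question, kept verbatim.
TXT = ('2022-09-22T03:55:59.433Z,,,,sm100,'
       '"sm100.w.gm.net=25 2.7.2 mailto:[email protected] '
       '[IId=200023, Hostname=mky.wgm.net] Queued info",,SMTP,HAREDIRECT')

# Match a comma only when an even number of double quotes follows it,
# i.e. when the comma is not inside a quoted field.
# Assumption: double quotes in the line are balanced.
OUTSIDE_QUOTES = re.compile(r',(?=(?:[^"]*"[^"]*")*[^"]*$)')


def split_fields(line: str) -> list[str]:
    """Split on commas outside double quotes; the quotes stay in the field."""
    return OUTSIDE_QUOTES.split(line)


def split_fields_csv(line: str) -> list[str]:
    """Same split using the standard csv module; the quotes are stripped."""
    return next(csv.reader([line]))


if __name__ == "__main__":
    for field in split_fields(TXT):
        print(repr(field))
```

The regex version keeps the surrounding double quotes on the sixth field, as in the expected output of the question, while the `csv` version removes them, which is closer to what `parse_csv` returns.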