I have been trying for two days to write a regex that captures some information from my postmaster digest.
Example:
0.32768:0A006832, 4.33024:DD040000 [Stage: CreateMessage]Final-Recipient: rfc822;[email protected]: failedStatus: 5.2.2Diagnostic-Code: smtp;554 5.2.2 mailbox full;
I want to capture labels like these:
- Final-Recipient:
- Action:
- failedStatus:
- Diagnostic-Code:
- Remote-MTA:
But I don't want to capture:
- Stage:
I wrote a regex that works perfectly fine for capturing them:
([A-Z]{1}[a-z]+\-)?[A-Z]{1,3}[a-z]*\:\ 
But sadly I don't know how to tell my regex NOT to capture labels that start with a "[".
I tried this:
[^\[]([A-Z]{1}[a-z]+\-)?[A-Z]{1,3}[a-z]*\:\ 
This avoids capturing "[Stage:" but also captures one character before each of the other matched labels.
Does anyone know how to capture my postmaster errors?
Thanks in advance.
CodePudding user response:
Add (?<!(\[)) before your first regex. The final result should be what you want.
Complete answer:
(?<!(\[))([A-Z]{1}[a-z]+\-)?[A-Z]{1,3}[a-z]*\:\ 
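Explanation:
You want to prevent a [ from appearing before your phrase; in regex that character is written (\[).
Because it must not appear before the phrase, you need a negative lookbehind. In regex, ?< starts a lookbehind and ! negates it.
Unlike [^\[], a lookbehind only checks the preceding character without consuming it, so that character does not become part of the match.
So what you need is (?<!(\[)).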
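The difference between the consuming character class and the negative lookbehind can be checked with a small Python sketch. It rests on a few assumptions: the sample string is a reconstruction of the digest line with a made-up placeholder address, the first quantifier is taken to be [a-z]+ so that hyphenated labels such as Final-Recipient: match, and the names SAMPLE, LABEL and find_labels are only illustrative.

```python
import re

# Reconstructed digest line; the e-mail address is a made-up placeholder.
SAMPLE = (
    "0.32768:0A006832, 4.33024:DD040000 [Stage: CreateMessage]"
    "Final-Recipient: rfc822;user@example.com"
    "Action: failed"
    "Status: 5.2.2"
    "Diagnostic-Code: smtp;554 5.2.2 mailbox full;"
)

# Label pattern, assuming the optional prefix is "[A-Z][a-z]+-" (e.g. "Final-").
LABEL = r"(?:[A-Z][a-z]+-)?[A-Z]{1,3}[a-z]*: "


def find_labels(pattern, text):
    """Return the full text of every match of pattern in text."""
    return [m.group(0) for m in re.finditer(pattern, text)]


# No guard at all: "Stage: " is matched too.
plain = find_labels(LABEL, SAMPLE)

# Character class guard: skips "Stage: " but swallows one extra character.
consuming = find_labels(r"[^\[]" + LABEL, SAMPLE)

# Negative lookbehind: skips "Stage: " and consumes nothing extra.
lookbehind = find_labels(r"(?<!\[)" + LABEL, SAMPLE)

print(plain)
print(consuming)
print(lookbehind)
# lookbehind -> ['Final-Recipient: ', 'Action: ', 'Status: ', 'Diagnostic-Code: ']
```

Python's re module only accepts fixed-width lookbehinds, which is enough here because the lookbehind checks a single character.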
CodePudding user response:
Using sed, you can use one capture group for the first part, which matches any character except [, and another group for the whole last part, including the optional capture group inside it.
Use those in the replacement with a newline between group 1 and group 2: \1\n\2 (GNU sed interprets \n in the replacement as a newline).
Note that your pattern would not match failedStatus: as it does not start with a capital letter.
Also, you can omit the quantifier {1} as 1 is the default, and you don't have to escape the hyphen, the colon or the space (\-, \: and "\ ").
sed -E 's/([^\[])(([A-Z][a-z]+-)?[A-Z]{1,3}[a-z]*: )/\1\n\2/g' File.eml
Output
0.32768:0A006832, 4.33024:DD040000 [Stage: CreateMessage]
Final-Recipient: rfc822;[email protected]
Action: failed
Status: 5.2.2
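For readers without sed, the same two-group substitution can be sketched in Python. This is only an approximation under stated assumptions: the input is a reconstructed sample line with a made-up placeholder address rather than the real File.eml, the optional prefix is taken to be [A-Z][a-z]+- so that Final-Recipient: matches as a whole, and SAMPLE and SPLIT_BEFORE_LABEL are illustrative names.

```python
import re

# Reconstructed digest line; the e-mail address is a made-up placeholder.
SAMPLE = (
    "0.32768:0A006832, 4.33024:DD040000 [Stage: CreateMessage]"
    "Final-Recipient: rfc822;user@example.com"
    "Action: failed"
    "Status: 5.2.2"
    "Diagnostic-Code: smtp;554 5.2.2 mailbox full;"
)

# Group 1: any single character except "[".
# Group 2: the whole label, with an optional "Xxx-" prefix (non-capturing here).
SPLIT_BEFORE_LABEL = re.compile(r"([^\[])((?:[A-Z][a-z]+-)?[A-Z]{1,3}[a-z]*: )")

# Put group 1 back, then a newline, then the label, as in \1\n\2.
result = SPLIT_BEFORE_LABEL.sub(r"\1\n\2", SAMPLE)

print(result)
```

Because group 1 is written back in the replacement, the character consumed in front of each label is preserved, which is why the consuming character class is harmless in a substitution even though it is a problem when extracting matches.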