I am not sure if regex supports this. I want to extract all the mail addresses from the "To:" line only. This is the given string:
Content-Type: application/ms-tnef; name="winmail.dat"
Content-Transfer-Encoding: binary
From: Max Mustermann <[email protected]>
To: autorouter.test <[email protected]>, Max Mustermann<[email protected]>
CC: Max Mustermann <[email protected]>, Max Mustermann<[email protected]>
Subject: Subject-Foobar
Thread-Topic: Subject-Foobar
Thread-Index: AdiHB4KcplQHHfCjQW+1j4r7qtj8wg==
Date: Thu, 23 Jun 2022 15:51:03 +0200
Message-ID: <[email protected]>
Accept-Language: de-DE, en-US
Content-Language: de-DE
X-MS-Has-Attach:
X-MS-Exchange-Organization-SCL: -1
X-MS-TNEF-Correlator: <[email protected]>
I can select all mail addresses with "<.*>", but not if I try to restrict it to the lines starting with "To:".
This would be the desired output:
- vr.test@foo-gruppe
- [email protected]
Is this possible?
CodePudding user response:
In Java you can use a finite quantifier in a positive lookbehind assertion:
(?<=^To:.{0,1000})<[^@<>]+@[^@<>]+>
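Explanation
- "(?<=" Assert that what is to the left is
- "^" Start of the string (or start of a line when the multiline flag is set, which is needed here because the "To:" line is not the first line)
- "To:" Match literally
- ".{0,1000}" Optionally repeat any character except a newline 0 - 1000 times (change it accordingly)
- ")" Close the lookbehind
- "<" Match the opening <
- "[^@<>]+@[^@<>]+" Match an @ char between 1+ chars to the left and right other than < and >
- ">" Match the closing >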
You might also capture what is in between the angle brackets, and if the @ part is not necessary, only use the negated character class:
(?<=^To:.{0,1000})<([^<>]*)>
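To try this from Java code, something like the following should work. It is only a minimal sketch: the class name, the method name and the example.com addresses are made up for illustration, and it assumes the headers are in one string with "\n" line breaks. Pattern.MULTILINE is set so that ^ can match at the start of the "To:" line, and the address is captured in group 1.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ToLineAddresses {
    // Bounded lookbehind: "<" must be preceded, on the same line,
    // by "To:" at the start of that line (MULTILINE makes ^ match per line).
    private static final Pattern TO_ADDRESS = Pattern.compile(
            "(?<=^To:.{0,1000})<([^@<>]+@[^@<>]+)>", Pattern.MULTILINE);

    // Hypothetical helper: returns every address found in angle brackets on a "To:" line.
    static List<String> extract(String headers) {
        List<String> result = new ArrayList<>();
        Matcher m = TO_ADDRESS.matcher(headers);
        while (m.find()) {
            result.add(m.group(1)); // group 1 = text between < and >
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder addresses, not the ones from the question.
        String headers = String.join("\n",
                "From: Max Mustermann <max@example.com>",
                "To: autorouter.test <first@example.com>, Max Mustermann<second@example.com>",
                "CC: Max Mustermann <third@example.com>",
                "Subject: Subject-Foobar");
        System.out.println(extract(headers)); // prints [first@example.com, second@example.com]
    }
}
```

The From: and CC: addresses are skipped because the dot does not match a newline, so the lookbehind cannot reach back to a "To:" on an earlier line.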
CodePudding user response:
For that, I believe you need to use a positive lookbehind, like this:
(?<=To:.*?)([\w.-]+@[\w.-]+)
Or
(?<=To:.*?)<(.+?)>
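Note that not every regex engine accepts an unbounded quantifier such as .*? inside a lookbehind (.NET does, while Java has historically required a bounded length). As a rough sketch for Java, the first pattern can be adapted by bounding the quantifier. The 1000 limit, the added ^ anchor with Pattern.MULTILINE (so that a header like "Reply-To:" is not picked up), the class name and the example.com addresses are all my own assumptions here:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BareToAddresses {
    // Same idea as above, but the address is matched directly with [\w.-]+@[\w.-]+
    // instead of relying on the surrounding angle brackets.
    private static final Pattern BARE_ADDRESS = Pattern.compile(
            "(?<=^To:.{0,1000})[\\w.-]+@[\\w.-]+", Pattern.MULTILINE);

    // Hypothetical helper: collects every address-like token on a "To:" line.
    static List<String> extract(String headers) {
        List<String> result = new ArrayList<>();
        Matcher m = BARE_ADDRESS.matcher(headers);
        while (m.find()) {
            result.add(m.group()); // whole match is the address
        }
        return result;
    }

    public static void main(String[] args) {
        // Placeholder addresses for illustration only.
        String headers = "From: Max <max@example.com>\n"
                + "To: autorouter.test <first@example.com>, second@example.com\n"
                + "CC: Max <third@example.com>";
        System.out.println(extract(headers));
    }
}
```

Compared with the angle-bracket variant, this one also picks up addresses that are written without < and >, at the cost of a much looser idea of what an address looks like.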