I ACE | AA33cc55BB44 | | | I
I | AAAAAA-BB2CC-4424-1-22 | 11.113 | 10.09.2022 | bCa0111.XAC I
I | | | | I
I BAC | Aa315c5cab44 | | | I
I | 5564aa-BB2CC-44gd-1-22 | 21.334 | 10.09.2022 | Aba0221.XAC I
I | | | | I
I CAC | aacccc54BB44 | | | I
I | AAAAAA-BB2CC-aaaa-1-22 | 61.222 | 10.09.2022 | bCa0232.XAC I
I | | | | I
I DAC | ii2ii2ii2664 | | | I
I | BBBBBB-BB2CC-4424-1-22 | 81.888 | 10.09.2022 | Aba0243.XAC I
I have used this pattern:
\| (.*) \| \d{2}\.\d{3} \| \d{1,2}\.\d{1,2}\.\d{4} \| (.*) \I
Attributes that I want to grab:
Group I:
AA33cc55BB44
AAAAAA-BB2CC-4424-1-22
bCa0111.XAC
Group II:
Aa315c5cab44
5564aa-BB2CC-44gd-1-22
Aba0221.XAC
Group III:
aacccc54BB44
AAAAAA-BB2CC-aaaa-1-22
bCa0232.XAC
Group IV:
ii2ii2ii2664
BBBBBB-BB2CC-4424-1-22
Aba0243.XAC
Can anyone help me extract only these attributes from this text?
CodePudding user response:
You can use
(?m)^[^|\n]*\|[ \t]*([^\s|]+).*\n[^|\n]*\|[ \t]*(\S+)\s*(?:\|[^|\n]*){2}\|[ \t]*(\S+)
See the regex demo. Details:
(?m) - the RegexOptions.Multiline option is on
^ - start of a line
[^|\n]* - zero or more chars other than a newline and |
\| - a | char
[ \t]* - zero or more spaces or TABs (you may use [\p{Zs}\t]* here to match any Unicode horizontal whitespace)
([^\s|]+) - Group 1: one or more chars other than whitespace and |
.* - the rest of the line
\n - a newline char
[^|\n]*\|[ \t]* - zero or more chars other than a newline and |, then a | char and zero or more spaces or TABs
(\S+) - Group 2: one or more non-whitespace chars
\s* - zero or more whitespaces
(?:\|[^|\n]*){2} - two sequences of a | char followed by zero or more chars other than | and a newline
\| - a | char
[ \t]* - zero or more spaces or TABs
(\S+) - Group 3: one or more non-whitespace chars.
In C#:
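using System;
using System.Text.RegularExpressions;

// `text` is assumed to hold the table from the question.
var pattern = @"^[^|\n]*\|[ \t]*([^\s|]+).*\n[^|\n]*\|[ \t]*(\S+)\s*(?:\|[^|\n]*){2}\|[ \t]*(\S+)";
var matches = Regex.Matches(text, pattern, RegexOptions.Multiline);
foreach (Match m in matches)
{
    Console.WriteLine("--- New match ---");
    Console.WriteLine(m.Groups[1].Value); // second column of the first line
    Console.WriteLine(m.Groups[2].Value); // second column of the next line
    Console.WriteLine(m.Groups[3].Value); // last column of that line
}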
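If you want to try this on its own, here is a minimal self-contained sketch that runs the same pattern over the first two records from the question. The class name TableExtractor, the method Extract, and the tuple field names Id, Code, and File are only illustrative names I made up; they are not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class TableExtractor
{
    // Same pattern as in the answer; RegexOptions.Multiline replaces the inline (?m).
    private static readonly Regex RowPair = new Regex(
        @"^[^|\n]*\|[ \t]*([^\s|]+).*\n[^|\n]*\|[ \t]*(\S+)\s*(?:\|[^|\n]*){2}\|[ \t]*(\S+)",
        RegexOptions.Multiline);

    // Returns one tuple per matched pair of lines (hypothetical field names).
    public static List<(string Id, string Code, string File)> Extract(string text)
    {
        var result = new List<(string Id, string Code, string File)>();
        foreach (Match m in RowPair.Matches(text))
        {
            result.Add((m.Groups[1].Value, m.Groups[2].Value, m.Groups[3].Value));
        }
        return result;
    }

    public static void Main()
    {
        // First two records from the question, joined with \n.
        var text = string.Join("\n",
            "I ACE | AA33cc55BB44 | | | I",
            "I | AAAAAA-BB2CC-4424-1-22 | 11.113 | 10.09.2022 | bCa0111.XAC I",
            "I | | | | I",
            "I BAC | Aa315c5cab44 | | | I",
            "I | 5564aa-BB2CC-44gd-1-22 | 21.334 | 10.09.2022 | Aba0221.XAC I");

        foreach (var (id, code, file) in Extract(text))
        {
            Console.WriteLine($"{id} / {code} / {file}");
        }
    }
}
```

The empty separator row (I | | | | I) is skipped because Group 1 needs at least one char that is neither whitespace nor | right after the first |.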