What is the RegEx for finding 3 different numbers in consecutive lines?

Time:09-02

Let's say I have the following type of data:

[577]   {0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00}
[578]   {0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00,0x00}
[579]   {0x05,0x08,0x01,0x00,0x47,0x00,0x61,0x00,0x6c}
[580]   {0x05,0x08,0x01,0x00,0x47,0x00,0x61,0x00,0x6c}
[581]   {0x05,0x08,0x01,0x00,0x47,0x00,0x61,0x00,0x6c}
[582]   {0x04,0x08,0x00,0x00,0x61,0x00,0x78,0x00,0x79}
[583]   {0x04,0x08,0x00,0x00,0x61,0x00,0x78,0x00,0x79}
[584]   {0x04,0x08,0x00,0x00,0x61,0x00,0x78,0x00,0x79}
[585]   {0x04,0x08,0x00,0x00,0x61,0x00,0x78,0x00,0x79}
[586]   {0x04,0x08,0x00,0x00,0x61,0x00,0x78,0x00,0x79}
[587]   {0x04,0x08,0x00,0x00,0x61,0x00,0x78,0x00,0x79}
[588]   {0x03,0x08,0x00,0x00,0x20,0x00,0x53,0x00,0x32}
[589]   {0x03,0x08,0x00,0x00,0x20,0x00,0x53,0x00,0x32}
[590]   {0x03,0x08,0x00,0x00,0x20,0x00,0x53,0x00,0x32}
[591]   {0x03,0x08,0x00,0x00,0x20,0x00,0x53,0x00,0x32}
[592]   {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}
[593]   {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}
[594]   {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}
[595]   {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}
[596]   {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}
[597]   {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}
[598]   {0x01,0x08,0x00,0x00,0x2d,0x00,0x39,0x00,0x33}
[599]   {0x00,0x08,0x00,0x00,0x34,0x00,0x00,0x00,0x00}
[600]   {0x00,0x08,0x00,0x00,0x34,0x00,0x00,0x00,0x00}
[601]   {0x00,0x08,0x00,0x00,0x34,0x00,0x00,0x00,0x00}
[602]   {0x00,0x08,0x00,0x00,0x34,0x00,0x00,0x00,0x00}

The relevant data is between the braces { }.

I want to find where the first column doesn't repeat.

In the data above, that would be the row marked "[598]", because row "[597]" starts with '0x02' and row "[599]" starts with '0x00'. So the '0x01' is unique.

But it could just as well be that the '0x01' is a '0x09'. I mean that the number itself doesn't matter, as long as it's different from the lines above and below it. Only the first column matters, though.
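Stated outside of any regex, the condition can be sketched as follows. This is only an illustration of the requirement, not a VSCode search: `findUniqueFirstColumnRows` and `sample` are hypothetical names, and skipping the first and last rows (which lack one neighbour) is an assumption.

```javascript
// Hypothetical helper: returns the bracketed row labels whose first column
// differs from the first column of both the previous and the next row.
function findUniqueFirstColumnRows(text) {
  // Capture the row label inside [] and the first element inside {}.
  const rowPattern = /^\[(\d+)\]\s+\{([^,}]+)/;
  const rows = text
    .split("\n")
    .map((line) => rowPattern.exec(line))
    .filter((m) => m !== null)
    .map((m) => ({ label: m[1], first: m[2] }));

  const unique = [];
  // Assumption: the first and last rows are skipped, since they have
  // a neighbour on one side only.
  for (let i = 1; i < rows.length - 1; i++) {
    const current = rows[i].first;
    if (current !== rows[i - 1].first && current !== rows[i + 1].first) {
      unique.push(rows[i].label);
    }
  }
  return unique;
}

// Rows 596-600 from the data above.
const sample = [
  "[596]   {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}",
  "[597]   {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}",
  "[598]   {0x01,0x08,0x00,0x00,0x2d,0x00,0x39,0x00,0x33}",
  "[599]   {0x00,0x08,0x00,0x00,0x34,0x00,0x00,0x00,0x00}",
  "[600]   {0x00,0x08,0x00,0x00,0x34,0x00,0x00,0x00,0x00}",
].join("\n");

const uniqueRows = findUniqueFirstColumnRows(sample);
console.log(uniqueRows); // the only label reported for these rows is "598"
```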

I've been trying with lookarounds, but it doesn't work:

(?<!.*\{(\3).*\n)(.*\{(0x\d\d))(?!.*\n.*\{(\3))

Any ideas?

Notes:

  • I'm using VSCode to find.
  • No need to capture it, just would like it to highlight.

CodePudding user response:

Try (regex101):

^\[\d+\]\s+{([^,]+)[^{]+{((?!\1|\3).{4})[^{]+{((?!\1|\2).{4})

CodePudding user response:

I think the following works for what you're after (slightly improved from Andrej's answer, and adapted to support the JavaScript regex flavour, which I believe is what VSCode uses).

Regex101

^\[\d+\]\s+{([^,]+)[^[]+^\[\d+\]\s+{((?!\1)[^,]+)[^[]+^\[\d+\]\s+{((?!\2)[^,]+)[^[]+$

Notes:

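  • JavaScript regex doesn't appear to support the (?!\1|\3) negative lookahead as written: \3 refers to a group that hasn't captured anything yet at that point, and JavaScript treats such a back reference as matching the empty string, so the lookahead always fails. I've therefore swapped this for a single back reference each time: 2-vs-1, and 3-vs-2
    • Due to this, if the first and third lines have the same value in the first element, then it'll still match, which isn't ideal...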
  • Matches full lines and fields, if you need/want to use this for processing too
  • This is operating over three distinct lines:
    1. ^\[\d+\]\s+{([^,]+)[^[]+
      • matches against the numeric component surrounded by [] brackets, and the first element in the {} braces
    2. ^\[\d+\]\s+{((?!\1)[^,]+)[^[]+
      • matches the same again, but instead of "the first value", it explicitly forbids the value used on the first line
      • when compared with Andrej's answer, this will capture the full element due to ((?!\1)[^,]+) vs ((?!\1).{4})
    3. ^\[\d+\]\s+{((?!\2)[^,]+)[^[]+$
      • same again, but explicitly forbids the value used on the second line
      • same again, but explicitly forbids the value used on the second line
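
To see how these patterns behave under JavaScript, here is a minimal sketch that runs them with the `g` and `m` flags against a few rows from the question. The braces are escaped (`\{`) as a precaution, `countMatches` and the `sameOuterRows` sample are hypothetical names and data introduced for illustration, and `allDistinct` is an assumed variation (not taken from either answer) that also forbids the third line from repeating the first, using backward references only.

```javascript
// Rows 596-600 from the question: first columns are 02, 02, 01, 00, 00.
const questionRows = [
  "[596]   {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}",
  "[597]   {0x02,0x08,0x00,0x00,0x32,0x00,0x2b,0x00,0x20}",
  "[598]   {0x01,0x08,0x00,0x00,0x2d,0x00,0x39,0x00,0x33}",
  "[599]   {0x00,0x08,0x00,0x00,0x34,0x00,0x00,0x00,0x00}",
  "[600]   {0x00,0x08,0x00,0x00,0x34,0x00,0x00,0x00,0x00}",
].join("\n");

// Hypothetical rows where the first and third lines share a first column.
const sameOuterRows = [
  "[1]   {0x02,0x08,0x00}",
  "[2]   {0x01,0x08,0x00}",
  "[3]   {0x02,0x08,0x00}",
].join("\n");

// First answer's pattern: \3 is referenced before group 3 has captured anything.
const forwardRef =
  /^\[\d+\]\s+\{([^,]+)[^{]+\{((?!\1|\3).{4})[^{]+\{((?!\1|\2).{4})/gm;

// Second answer's pattern: line 2 must differ from line 1, line 3 from line 2.
const adjacentOnly =
  /^\[\d+\]\s+\{([^,]+)[^[]+^\[\d+\]\s+\{((?!\1)[^,]+)[^[]+^\[\d+\]\s+\{((?!\2)[^,]+)[^[]+$/gm;

// Assumed variation: line 3 must also differ from line 1.
const allDistinct =
  /^\[\d+\]\s+\{([^,]+)[^[]+^\[\d+\]\s+\{((?!\1)[^,]+)[^[]+^\[\d+\]\s+\{((?!\1|\2)[^,]+)[^[]+$/gm;

// Hypothetical helper: number of matches of a global regex in a text.
const countMatches = (re, text) => [...text.matchAll(re)].length;

// Row labels covered by the single match of the second answer's pattern.
const [hit] = [...questionRows.matchAll(adjacentOnly)];
const matchedLabels = hit[0].split("\n").map((line) => line.slice(0, 5));
console.log(matchedLabels);

console.log(countMatches(forwardRef, questionRows));
console.log(countMatches(adjacentOnly, sameOuterRows));
console.log(countMatches(allDistinct, sameOuterRows));
```

On these samples, `forwardRef` finds nothing under JavaScript, `adjacentOnly` matches rows 597 to 599 (with 598 as the middle line), and only `allDistinct` rejects the 02/01/02 sample.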