How to create a regex, not able to get second group of match pattern

Time:02-04

I'm trying to write a regex to parse the data below, but I can't get the second matched pair, 2.2.2.2 testIp2. I don't have much experience with regexes.

Data to be parsed:

show names
names 1.1.1.1 testIp1 2.2.2.2 testIp2
name 192.168.1.1 testIp3
umesh 192.168.1.2 testIp4

The regex I could create:

^(?:name|names)(?:\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+(\S+))

Here is my perl code snippet:

while( $data =~ /^(?:name|names)(?:\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+(\S+))/mg) {
    $LOGGER->debug("IPs : $1 : $2");
}


In the screenshot below, note that the pair 2.2.2.2 testIp2 is not matched in the regex101 tool:

CodePudding user response:

If there can be an arbitrary number of repetitions, it's probably better to extract the tokens and then loop over them using a very simple regex.

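# Capture the whole run of "IP name" pairs on each name/names line,
# then peel the pairs off the front of that run one at a time.
while ($data =~ /^names?((?:\s+\d{1,3}(?:\.\d{1,3}){3}\s+\S+)+)/mg) {
    my $match = $1;
    while ($match =~ s/^\s+(\d{1,3}(?:\.\d{1,3}){3})\s+(\S+)//) {
      $LOGGER->debug("IPs : $1 : $2");
    }
}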
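
For context on why the original pattern stops at the first pair: its group is not repeated, so each match of the line can only take one "IP name" pair, and even with a quantifier a capture group keeps only the text of its last iteration. The same two-step idea can also be sketched with a list-context /g match instead of s///. The names $ip, @pairs, and the sample $data here are only for illustration, and the output goes to print rather than $LOGGER so the snippet runs on its own:

```perl
use strict;
use warnings;

my $data = <<'END';
show names
names 1.1.1.1 testIp1 2.2.2.2 testIp2
name 192.168.1.1 testIp3
umesh 192.168.1.2 testIp4
END

# Same loose IPv4 shape as in the question (it does not range-check octets).
my $ip = qr/\d{1,3}(?:\.\d{1,3}){3}/;
my @pairs;

for my $line (split /\n/, $data) {
    # Only lines that start with "name" or "names" qualify.
    next unless $line =~ /^names?\s+(.*)/;
    my $rest = $1;

    # In list context, /g returns every captured (ip, label) on the line.
    my @fields = $rest =~ /($ip)\s+(\S+)/g;
    while (my ($addr, $label) = splice @fields, 0, 2) {
        push @pairs, [ $addr, $label ];
    }
}

print "IPs : $_->[0] : $_->[1]\n" for @pairs;
```

With the sample data this collects three pairs, including 2.2.2.2 testIp2, and skips the "show names" and "umesh" lines.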

CodePudding user response:

I think I would just parse this. Inside a regex, you can have code in (?{...}) structures. The variable $^N is the latest text you captured, and $^R is the result of the last (?{...}). It's a little mind bending, and maybe you don't want to wish this on anyone, but it's very powerful (and I spend a lot of the first chapter of Mastering Perl on this):

use v5.26;  # indented heredocs (<<~) and say

$_ = <<~'HERE';
    show names
    names 1.1.1.1 testIp1 2.2.2.2 testIp2
    name 192.168.1.1 testIp3
    umesh 192.168.1.2 testIp4
    HERE

my @found;

# list context to get all the matches with /g
() = /
    ^
    names?
    (
        \h+
        ([0-9.]+)(?{ $^N })
        \h+
        (\w+)(?{ push @found, [ $^R, $^N ] })
    )+
    /gxm;

use Data::Dumper;
say Dumper( \@found );

This might make more sense if you see the $^N assigned to temporary variables (although you'd need to declare these outside the regex):

my( $ip, $name );

# list context to get all the matches with /g
() = /
    ^
    names?
    (
        \h+
        ([0-9.]+)(?{ $ip = $^N })
        \h+
        (\w+)(?{ $name = $^N; push @found, [ $ip, $name ] })
    )+
    /gxm;

Here's the output. I made tuples, but you can process that in any way you like:

$VAR1 = [
          [
            '1.1.1.1',
            'testIp1'
          ],
          [
            '2.2.2.2',
            'testIp2'
          ],
          [
            '192.168.1.1',
            'testIp3'
          ]
        ];
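
If the interplay of $^N and $^R is hard to follow in the full pattern, a smaller sketch may help. This one is not part of the parsing problem; the string 'alpha beta' and the variable $joined are made up for illustration:

```perl
use strict;
use warnings;

my $joined;

# (?{ ... }) runs Perl code at that point in the match.
# $^N is the text of the most recently closed capture group;
# $^R is the value returned by the previous (?{ ... }) block.
'alpha beta' =~ /
    (\w+) (?{ uc $^N })
    \s+
    (\w+) (?{ $joined = $^R . '-' . $^N })
/x;

print "$joined\n";   # prints ALPHA-beta
```

The first block leaves the uppercased first word in $^R, and the second block combines it with the second capture, which is the same handoff the IP and name captures use above.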

It's a bit frustrating that this isn't easier in Perl, though. I'd much rather have something like this, where it's closer to a parser. The %- doesn't remember all of the labeled matches in a (...)+, just the latest one.

while(1) {
    if( /\Gshow names\R/gc ) {
        say "Found start";
        }
    elsif( /\Gname (?<ip>\S+) (?<name>\S+)\R/gc ) {
        ...;
        next;
        }
    elsif( /\Gnames (?:(?<ip>\S+) (?<name>\S+)\h*)+\R/gc ) {
        ...; # something with %-, but that doesn't work
        next;
        }
        next;
        }
    else { last }
    }
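
One way to get that parser-like shape today, as a rough sketch, is to let an inner \G loop consume the pairs one at a time, so nothing depends on %- remembering repeated groups. The variable names $text and @found are illustrative, and skipping unrecognized lines (rather than stopping at them) is an assumption about the desired behavior:

```perl
use strict;
use warnings;

my $text = <<'END';
show names
names 1.1.1.1 testIp1 2.2.2.2 testIp2
name 192.168.1.1 testIp3
umesh 192.168.1.2 testIp4
END

my @found;

for ($text) {    # alias $_ to $text so the bare matches below apply to it
    while (1) {
        if (/\Gshow names\R/gc) {
            next;                        # header line, nothing to record
        }
        elsif (/\Gnames?(?=\h)/gc) {
            # Consume "IP name" pairs one by one from the current position.
            while (/\G\h+(?<ip>[0-9.]+)\h+(?<name>\S+)/gc) {
                push @found, [ $+{ip}, $+{name} ];
            }
            /\G\h*\R/gc;                 # eat the rest of the line
            next;
        }
        elsif (/\G\N*\R/gc) {
            next;                        # assumption: skip any other line
        }
        else { last }                    # nothing left to consume
    }
}

print "$_->[0] $_->[1]\n" for @found;
```

The /c flag keeps pos() in place when a branch fails, which is what lets the next branch try again from the same spot.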