How to create a regex, not able to get second group of match pattern

I'm trying to write a regex to parse the data below, but I can't get the second matched pair, 2.2.2.2 testIp2. I don't have much experience with regex.

Data to be parsed:

show names
names 1.1.1.1 testIp1 2.2.2.2 testIp2
name 192.168.1.1 testIp3
umesh 192.168.1.2 testIp4

The regex I could create:

^(?:name|names)(?:\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+(\S+))

Here is my perl code snippet:

while( $data =~ /^(?:name|names)(?:\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+(\S+))/mg) {
    $LOGGER->debug("IPs : $1 : $2");
}

When I test this in the regex101 tool, the pair 2.2.2.2 testIp2 is not matched.

CodePudding user response：

If there can be an arbitrary number of repetitions, it's probably better to extract the tokens and then loop over them using a very simple regex.

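# Capture the whole run of "IP name" pairs on each name/names line ...
while ($data =~ /^names?((?:\h+\d{1,3}(?:\.\d{1,3}){3}\h+\S+)+)/mg) {
    my $match = $1;
    # ... then walk that run one pair at a time with a simpler regex.
    while ($match =~ /(\d{1,3}(?:\.\d{1,3}){3})\h+(\S+)/g) {
      $LOGGER->debug("IPs : $1 : $2");
    }
}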
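
As a rough, self-contained sketch of the same two-step idea, the script below uses the sample data from the question. The names `$ip_re`, `$run`, and `@pairs` are made up for illustration, and `print` stands in for the `$LOGGER` object, which is not shown in the question. It assumes Perl 5.10 or later for `\h`.

```perl
use strict;
use warnings;

# Sample data copied from the question.
my $data = <<'END';
show names
names 1.1.1.1 testIp1 2.2.2.2 testIp2
name 192.168.1.1 testIp3
umesh 192.168.1.2 testIp4
END

# Loose dotted-quad shape only; this does not validate octet ranges.
my $ip_re = qr/\d{1,3}(?:\.\d{1,3}){3}/;

my @pairs;

# Step 1: on each line that starts with name/names, grab the whole run of pairs.
while ($data =~ /^names?((?:\h+$ip_re\h+\S+)+)/mg) {
    my $run = $1;    # copy $1 before the inner match overwrites it

    # Step 2: walk the run with a simple global match.
    while ($run =~ /($ip_re)\h+(\S+)/g) {
        push @pairs, [ $1, $2 ];
    }
}

print "IPs : $_->[0] : $_->[1]\n" for @pairs;
```

Using `\h` (horizontal whitespace) rather than `\s` keeps a run of pairs from spilling across a newline into the next line.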

CodePudding user response：

I think I would just parse this. Inside a regex, you can have code in (?{...}) structures. The variable $^N is the latest text you captured, and $^R is the result of the last (?{...}). It's a little mind bending, and maybe you don't want to wish this on anyone, but it's very powerful (and I spend a lot of the first chapter of Mastering Perl on this):

use v5.26;  # for say and the indented heredoc (<<~)

$_ = <<~'HERE';
    show names
    names 1.1.1.1 testIp1 2.2.2.2 testIp2
    name 192.168.1.1 testIp3
    umesh 192.168.1.2 testIp4
    HERE

my @found;

# list context to get all the matches with the g flag
# (keep slashes out of comments inside the pattern: they would end it early)
() = /
    ^
    names?
    (
        \h+
        ([0-9.]+)(?{ $^N })
        \h+
        (\w+)(?{ push @found, [ $^R, $^N ] })
    )+
    /gxm;

use Data::Dumper;
say Dumper( \@found );

This might make more sense if you see the $^N assigned to temporary variables (although you'd need to declare these outside the regex):

my( $ip, $name );  # declared outside the regex

# list context to get all the matches with the g flag
() = /
    ^
    names?
    (
        \h+
        ([0-9.]+)(?{ $ip = $^N })
        \h+
        (\w+)(?{ $name = $^N; push @found, [ $ip, $name ] })
    )+
    /gxm;

Here's the output. I made tuples, but you can process that in any way you like:

$VAR1 = [
          [
            '1.1.1.1',
            'testIp1'
          ],
          [
            '2.2.2.2',
            'testIp2'
          ],
          [
            '192.168.1.1',
            'testIp3'
          ]
        ];

It's a bit frustrating that this isn't easier in Perl, though. I'd much rather have something like this, where it's closer to a parser. The %- doesn't remember all of the labeled matches in a repeated group (...)+, just the latest one.

while(1) {
    if( /\Gshow names\R/gc ) {
        say "Found start";
        }
    elsif( /\Gname (?<ip>\S+) (?<name>\S+)\R/gc ) {
        ...
        next;
        }
    elsif( /\Gnames (?:(?<ip>\S+) (?<name>\S+)\h*)+\R/gc ) {
        # ... something with %-, but that doesn't work ...
        next;
        }
    else { last }
    }
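
To make the %- limitation concrete, here is a small sketch that runs the repeated named group from the last branch against one line of the sample data. The variable names `$line` and `@ips_seen` are made up for illustration. It assumes Perl 5.10 or later for named captures, `\h`, and `\R`.

```perl
use strict;
use warnings;

my $line = "names 1.1.1.1 testIp1 2.2.2.2 testIp2\n";

my @ips_seen;
if ($line =~ /^names (?:(?<ip>\S+) (?<name>\S+)\h*)+\R/) {
    # %- holds one slot per *group* named "ip" in the pattern,
    # not one per pass through the repetition, so only the value
    # from the last pass survives.
    @ips_seen = @{ $-{ip} };
}

print scalar(@ips_seen), " value(s): @ips_seen\n";   # prints "1 value(s): 2.2.2.2"
```

The pattern matches the whole line, but the first pair (1.1.1.1 testIp1) is gone by the time the match finishes. That is why the earlier examples save each pair from inside the regex with (?{...}).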