I'm trying to write a regex to parse the data below, but I can't get it to match the second pair, 2.2.2.2 testIp2. I don't have much experience with regexes.
Data to be parsed:
show names
names 1.1.1.1 testIp1 2.2.2.2 testIp2
name 192.168.1.1 testIp3
umesh 192.168.1.2 testIp4
The regex I could create:
^(?:name|names)(?:\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+(\S+))
Here is my perl code snippet:
while( $data =~ /^(?:name|names)(?:\s+(\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3})\s+(\S+))/mg) {
    $LOGGER->debug("IPs : $1 : $2");
}
regex101 shows the same result: the pair 2.2.2.2 testIp2 is not matched.
Answer:
If there can be an arbitrary number of repetitions, it's probably better to extract the tokens and then loop over them using a very simple regex.
# Grab everything after the keyword on each name/names line,
# then loop over the IP/label pairs with a very simple regex.
while ($data =~ /^names?\h+(.*)$/mg) {
    my $rest = $1;
    while ($rest =~ /(\d{1,3}(?:\.\d{1,3}){3})\s+(\S+)/g) {
        $LOGGER->debug("IPs : $1 : $2");
    }
}
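The "extract the tokens and then loop over them" idea can also be sketched without any repeated capture group, by splitting each line on whitespace and consuming the tokens two at a time. This is only an illustration under some assumptions: `extract_pairs` is a hypothetical helper name, the sample text is copied from the question, and `print` stands in for the `$LOGGER` object, which is not defined here.

```perl
use strict;
use warnings;

# Sample input copied from the question.
my $data = <<'END';
show names
names 1.1.1.1 testIp1 2.2.2.2 testIp2
name 192.168.1.1 testIp3
umesh 192.168.1.2 testIp4
END

# extract_pairs is a hypothetical helper name, not part of the original code.
sub extract_pairs {
    my ($text) = @_;
    my @pairs;
    for my $line (split /\n/, $text) {
        # Only lines whose first token is "name" or "names" qualify.
        my ($keyword, @tokens) = split ' ', $line;
        next unless defined $keyword && $keyword =~ /\Anames?\z/;

        # Consume the remaining tokens two at a time: IP, then label.
        while (@tokens >= 2) {
            my ($ip, $label) = splice @tokens, 0, 2;
            next unless $ip =~ /\A\d{1,3}(?:\.\d{1,3}){3}\z/;
            push @pairs, [ $ip, $label ];
        }
    }
    return @pairs;
}

# print stands in for $LOGGER->debug here.
printf "IPs : %s : %s\n", @$_ for extract_pairs($data);
```

With the sample data this should report three pairs, including 2.2.2.2 testIp2. The IP check is as loose as the one in the question (it does not reject octets above 255).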
Answer:
I think I would just parse this. Inside a regex, you can have code in (?{...}) structures. The variable $^N is the latest text you captured, and $^R is the result of the last (?{...}). It's a little mind bending, and maybe you don't want to wish this on anyone, but it's very powerful (and I spend a lot of the first chapter of Mastering Perl on this):
use v5.26;    # for say and the indented heredoc (<<~)

$_ = <<~'HERE';
    show names
    names 1.1.1.1 testIp1 2.2.2.2 testIp2
    name 192.168.1.1 testIp3
    umesh 192.168.1.2 testIp4
    HERE

my @found;
# list context to get all the matches with the g flag
# (a slash inside an x-mode comment would end the pattern early)
() = /
    ^
    names?
    (
        \h+
        ([0-9.]+)(?{ $^N })
        \h+
        (\w+)(?{ push @found, [ $^R, $^N ] })
    )+
    /gxm;

use Data::Dumper;
say Dumper( \@found );
This might make more sense if you see the $^N assigned to temporary variables (although you'd need to declare these outside the regex):
my( $ip, $name );    # declared outside the regex
# list context to get all the matches with the g flag
() = /
    ^
    names?
    (
        \h+
        ([0-9.]+)(?{ $ip = $^N })
        \h+
        (\w+)(?{ $name = $^N; push @found, [ $ip, $name ] })
    )+
    /gxm;
Here's the output. I made tuples, but you can process that in any way you like:
$VAR1 = [
          [
            '1.1.1.1',
            'testIp1'
          ],
          [
            '2.2.2.2',
            'testIp2'
          ],
          [
            '192.168.1.1',
            'testIp3'
          ]
        ];
It's a bit frustrating that this isn't easier in Perl, though. I'd much rather have something like this, where it's closer to a parser. The %- hash doesn't remember every iteration of a labeled capture inside a repeated (...) group, just the latest one.
while(1) {
    if( /\Gshow names\R/gc ) {
        say "Found start";
    }
    elsif( /\Gname (?<ip>\S+) (?<name>\S+)\R/gc ) {
        ...
        next;
    }
    elsif( /\Gnames (?:(?<ip>\S+) (?<name>\S+)\h*)+\R/gc ) {
        # ... something with %-, but that doesn't work ...
        next;
    }
    else { last }
}
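One way to get close to that parser style today is to match one pair per regex call instead of repeating a group, so no capture gets overwritten. The following is a rough sketch, not the approach above made to work with %-: `scan_names` is a hypothetical helper name, the sample text is copied from the question, and the scan simply stops at the first line it doesn't recognize, just as the loop above does.

```perl
use strict;
use warnings;

my $text = <<'END';
show names
names 1.1.1.1 testIp1 2.2.2.2 testIp2
name 192.168.1.1 testIp3
umesh 192.168.1.2 testIp4
END

# scan_names is a hypothetical helper; it walks the text with \G and /gc.
sub scan_names {
    my ($input) = @_;
    my @found;
    for ($input) {    # alias $_ so every match shares one pos()
        while (1) {
            if (/\Gshow names\R/gc) {
                next;    # header line, nothing to record
            }
            elsif (/\Gnames?(?=\h)/gc) {
                # One match per pair, so %+ always holds the current pair.
                while (/\G\h+(?<ip>[0-9.]+)\h+(?<name>\w+)/gc) {
                    push @found, [ $+{ip}, $+{name} ];
                }
                /\G\h*(?:\R|\z)/gc or last;    # consume the line ending
            }
            else { last }    # unknown line: stop scanning
        }
    }
    return @found;
}

print join(' => ', @$_), "\n" for scan_names($text);
```

The /c flag keeps pos() in place when a branch fails, which is what lets the next branch try again from the same position.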