While trying to parse the output of monitoring plugins, I ran into a match result that I did not expect.
First consider this debugger session with Perl 5.18.2:
DB<6> x $_
0 'last=0.508798;;;0'
DB<7> x $RE
0 (?^u:^((?^u:\'[^\'=]+\'|[^\'= ]+))=((?^u:\\d+(?:\\.\\d*)?|\\.\\d+))(s|%|[KMT]?B)?(;(?^u:\\d+(?:\\.\\d*)?|\\.\\d+)?){0,4}$)
-> qr/(?^u:^((?^u:'[^'=]+'|[^'= ]+))=((?^u:\d+(?:\.\d*)?|\.\d+))(s|%|[KMT]?B)?(;(?^u:\d+(?:\.\d*)?|\.\d+)?){0,4}$)/
DB<8> @m = /$RE/
DB<9> x @m
0 'last'
1 0.508798
2 undef
3 ';0'
DB<10>
OK, the regex $RE (intended to match "'label'=value[UOM];[warn];[crit];[min];[max]") looks terrifying at first glance, so let me show how it is constructed:
my $RE_label = qr/'[^'=]+'|[^'= ]+/;
my $RE_simple_float = qr/\d+(?:\.\d*)?|\.\d+/;
my $RE_numeric = qr/[-+]?$RE_simple_float(?:[eE][-+]?\d+)?/;
my $RE = qr/^($RE_label)=($RE_simple_float)(s|%|[KMT]?B)?(;$RE_simple_float?){0,4}$/;
The relevant part is (;$RE_simple_float?){0,4}$, intended to match ";[warn];[crit];[min];[max]" (still not perfect), so for ";;;0" I'd expect @m to end with ';', ';', ';0'.
However it seems the matches are lost, except for the last one.
Did I misunderstand something, or is it a Perl bug?
CodePudding user response:
When you use {<number>} (or + or * for that matter) after a capture group, only the last value matched by the capture group is stored. This explains why you only end up with ;0 instead of ;;;0 in your fourth capture group: (;$RE_simple_float?){0,4} sets the fourth capture group to the last element it matches.
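The behaviour described above can be reproduced in isolation. The following is a minimal sketch using a simplified pattern (`;\d*` stands in for the question's `;$RE_simple_float?`, and the variable names are made up for illustration):

```perl
use strict;
use warnings;

my $input = ';;;0';

# Each pattern repeats a single capture group three times over the input.
# In every case the group only remembers its final iteration.
my ($braces) = $input =~ /^(;\d*){0,4}$/;
my ($plus)   = $input =~ /^(;\d*)+$/;
my ($star)   = $input =~ /^(;\d*)*$/;

print "{0,4}: $braces\n";   # ;0
print "+:     $plus\n";     # ;0
print "*:     $star\n";     # ;0
```

The number of capture groups is fixed by the number of opening parentheses in the pattern, not by how often a group is repeated at match time.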
To fix that, I would recommend matching the whole end of the string and splitting it afterwards:
my $RE = qr/...((?:;$RE_simple_float?){0,4})$/;
my @m = /$RE/;
my @end = split /;/, $m[3]; # use /(?<=;)/ to keep the semicolons
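For reference, here is a self-contained sketch of the match-then-split approach. It assumes the building blocks from the question (with the `+` quantifiers written out); `parse_perfdata` and the hash keys are hypothetical names chosen for this example, not part of any library:

```perl
use strict;
use warnings;

# Building blocks as in the question.
my $RE_label        = qr/'[^'=]+'|[^'= ]+/;
my $RE_simple_float = qr/\d+(?:\.\d*)?|\.\d+/;

# Group 4 wraps the whole repetition, so it captures every threshold field.
my $RE = qr/^($RE_label)=($RE_simple_float)(s|%|[KMT]?B)?((?:;$RE_simple_float?){0,4})$/;

# Hypothetical helper: returns a hash reference, or nothing if the text does not match.
sub parse_perfdata {
    my ($text) = @_;
    my ($label, $value, $uom, $tail) = $text =~ $RE
        or return;

    $tail =~ s/^;//;                      # drop the leading separator
    my @fields = split /;/, $tail, -1;    # limit -1 keeps trailing empty fields

    my %record = (label => $label, value => $value, uom => $uom);
    # Empty fields become undef; fields that are absent are undef as well.
    @record{qw(warn crit min max)} = map { length($_) ? $_ : undef } @fields;
    return \%record;
}

my $r = parse_perfdata('last=0.508798;;;0');
printf "%s => %s\n", $_, defined $r->{$_} ? $r->{$_} : 'undef'
    for qw(label value uom warn crit min max);
```

Mapping empty fields to undef is a design choice of this sketch; keep the empty strings if you need to distinguish "field present but empty" from "field absent".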
Another solution is to repeat the capture group: replace (;$RE_simple_float?){0,4}
with
(;$RE_simple_float?)?(;$RE_simple_float?)?(;$RE_simple_float?)?(;$RE_simple_float?)?
The capture groups that do not match will be set to undef. The issue with this approach is that it's a bit verbose, and it only works for {}, but not for + or *.
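A small sketch of the repeated-group variant, reduced to the threshold part of the input. `$RE_field` is a hypothetical name introduced here to avoid writing the sub-pattern four times:

```perl
use strict;
use warnings;

my $RE_simple_float = qr/\d+(?:\.\d*)?|\.\d+/;
my $RE_field        = qr/;$RE_simple_float?/;   # one ";[number]" slot

# Four separate optional groups give four separate captures.
my @m = ';;;0' =~ /^($RE_field)?($RE_field)?($RE_field)?($RE_field)?$/;

# Groups that did not take part in the match are undef.
print join(', ', map { defined $_ ? "<$_>" : 'undef' } @m), "\n";
```

Factoring the sub-pattern into its own qr// object reduces some of the verbosity, but the number of groups still has to be known in advance.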
CodePudding user response:
The following demo code uses split to obtain the data of interest. Check whether it fits as a solution for your problem.
use strict;
use warnings;
use feature 'say';
use Data::Dumper;
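while( <DATA> ) {
    chomp;
    say;
    my $record;
    # Hash slice via postfix dereference; this syntax is stable since Perl 5.24.
    # On older perls, write @{$record}{qw/.../} instead.
    $record->@{qw/label value warn crit min max/} = split(/[=;]/,$_);
    say Dumper($record);
}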
exit 0;
#'label'=value[UOM];[warn];[crit];[min];[max]
__DATA__
'label 1'=0.3345s;0.8s;1.2s;0.2s;3.2s
'label 2'=10%;7%;18%;2%;28%
'label 3'=0.5us;2.3us
Output
'label 1'=0.3345s;0.8s;1.2s;0.2s;3.2s
$VAR1 = {
          'crit' => '1.2s',
          'warn' => '0.8s',
          'value' => '0.3345s',
          'label' => '\'label 1\'',
          'max' => '3.2s',
          'min' => '0.2s'
        };
'label 2'=10%;7%;18%;2%;28%
$VAR1 = {
          'min' => '2%',
          'max' => '28%',
          'label' => '\'label 2\'',
          'value' => '10%',
          'warn' => '7%',
          'crit' => '18%'
        };
'label 3'=0.5us;2.3us
$VAR1 = {
          'min' => undef,
          'max' => undef,
          'label' => '\'label 3\'',
          'warn' => '2.3us',
          'value' => '0.5us',
          'crit' => undef
        };