Is there a better way to save the substitution part of a regular expression in Perl?

Time:10-12

$search_32bit  = '(80 71 C3 (\S{8}) (77 55 66))';
$search_32bit =~ s/\s+//g;
$replace_32bit = 'A0 B0 C0 \2\3';
$replace_32bit =~ s/\s+//g;

        @repls_32 = (
                [ $search_32bit, $replace_32bit],
                );

$hex = "9090908071C312345678775566000000777777";

foreach my $r (@repls_32) {

        $hex_tmp = $hex;
        (my $s_sign, my $mat_pos) = eval "\$hex =~ s/$r->[0]/$r->[1]/i;return (\$1, \$-[0])";
        $len = length($s_sign);
        $replaced_str = substr($hex, $mat_pos, $len);
        print "matched_str: $s_sign\n";
        print "mat_pos: $mat_pos\n";
        print "length: $len\n";
        print "replaced_str: $replaced_str\n";
}

Output:

matched_str: 8071C312345678775566
mat_pos: 6
length: 20
replaced_str: A0B0C012345678775566

My question:

Is there a better way to save the substitution part of the regular expression (i.e. $replaced_str: A0B0C012345678775566)?

CodePudding user response:

One way to acquire the replacement string, presumably built dynamically as the regex runs, is to assign it to a variable inside the replacement part:

my $repl_str;
$str =~ s{$pattern}{ $repl_str = code-to-build-replacement }e;

A concrete example: pad the capture with periods

use feature 'say';

my $str = 'funny';

my $repl_str;
$str =~ s{ (.{3}) }{ $repl_str = '.' . $1 . '.' }ex;

say $repl_str;  #--> .fun.
say $str;       #--> .fun.ny

The point is that the assignment expression inside the replacement part also returns its value, so the replacement string is still available to the substitution; adding the assignment doesn't break anything. There is now another variable floating around, but I don't think there is a built-in regex variable for the replacement string.

Depending on the replacement needed, just adding that assignment may not always work; remember that with /e the replacement side is code. So what would be .$1. in a "normal" replacement (without /e) needs to be rewritten as valid code, where the '.' strings are concatenated to the capture; then we can add the assignment and run that.
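
To illustrate that adjustment, here is a minimal sketch (the variable names are made up for this example) that performs the same padding once as a plain replacement and once as /e code with the assignment added:

```perl
use strict;
use warnings;
use feature 'say';

# Plain replacement: the right-hand side is an interpolated string.
my $plain = 'funny';
$plain =~ s/(.{3})/.$1./;

# With /e the right-hand side is Perl code, so the same text has to be
# written as an expression. The assignment returns the assigned value,
# which is then used as the replacement.
my $with_e = 'funny';
my $saved;
$with_e =~ s/(.{3})/ $saved = '.' . $1 . '.' /e;

say $plain;     # .fun.ny
say $with_e;    # .fun.ny
say $saved;     # .fun.
```

Both substitutions change the string in the same way; only the /e version leaves the replacement text behind in a variable.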

I hope that you can adapt this to that eval, though I have no idea why you'd need it, other than perhaps to run code supplied from outside as a string. Even then I'd expect that only parts of the regex are supplied, so you still write the regex yourself and can do the above.

A clarification of some points in the question would help, but hopefully this is useful as it stands. (In the future, consider building a simple and clear example of what you need, if possible.)


In this specific case there is a complication: those \1 (etc) appear to be meant as references to capture groups. In the replacement they need to be $1 (etc), and it takes a bit more work to use them out of a variable.
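
A minimal sketch of that complication, with a shortened pattern made up for this example: when the replacement text lives in a variable, the variable is interpolated once, and a $1 stored inside it is not evaluated again.

```perl
use strict;
use warnings;
use feature 'say';

# Single quotes: the variable holds a literal dollar sign followed by 1.
my $replace = 'A0B0C0$1';

my $str = '8071C312345678';

# $replace is interpolated into the replacement, but its content is
# inserted as plain text; the $1 inside it is not expanded.
$str =~ s/8071C3(\S{8})/$replace/;

say $str;    # A0B0C0$1
```

This is why the captures have to be substituted into the replacement text by some extra step.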

One way is to build the actual replacement string out of the given $replace, in a sub that receives the captures:

use warnings;
use strict;
use feature 'say';

my $str = shift // '9090908071C312345678558765432166aaaabbbb7700abc' ; 
say $str; 

my $search  = qr/8071C3(\S{8})55(\S{8})66(\S{8})77/; 
my $replace = q(A0B0C0$155$266$377);

my $repl_str;
$str =~ s/$search/$repl_str = build_repl($replace, $1, $2, $3)/e; 

say $str; 
say $repl_str;

sub build_repl {
    my ($r, @captures) = @_; 

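    # Split on the $N placeholders, keeping them; drop empty pieces
    # (test the length so that a literal "0" piece is not discarded).
    my @terms = grep { length } split /(\$\d)/, $r; 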

    my $str = join '', map { /^\$/ ? shift(@captures) : $_ } @terms;
    return $str;
}

This prints

9090908071C312345678558765432166aaaabbbb7700abc
909090A0B0C012345678558765432166aaaabbbb7700abc
A0B0C012345678558765432166aaaabbbb77

I use qr to build a regex pattern, but '' (or q()) works here as well.

There are other ways to organize code to build that replacement string. Please see this post for a similar detailed example with more explanation.

The sub above takes the captures, so it has them on hand and can compose the replacement string. The /e is needed in order to run the sub from inside the substitution, once the captures are available.

The linked answer builds code and uses /ee. Please read the warnings about that carefully and follow the links. I don't see that it is needed here.

CodePudding user response:

First, let's fix your bugs.

use String::Substitution qw( sub_modify );

my $search  = qr/(80 71 C3 (\S{8}) (77 55 66))/xi;
my $replace = 'A0 B0 C0 $2$3' =~ s/\s+//gr;         # \1 should be $1 in replacement.

sub_modify($hex, $search, $replace);                # Fix code injection bugs

This is equivalent to

use String::Substitution qw( sub_modify interpolate_match_vars );

my $search  = qr/(80 71 C3 (\S{8}) (77 55 66))/xi;
my $replace = 'A0 B0 C0 $2$3' =~ s/\s+//gr;

sub_modify($hex, $search, sub { interpolate_match_vars($replace, @_) });

Now, it's just a question of saving the string.

use String::Substitution qw( sub_modify interpolate_match_vars );
 
my $search  = qr/(80 71 C3 (\S{8}) (77 55 66))/xi;
my $replace = 'A0 B0 C0 $2$3' =~ s/\s+//gr;

my $replacement_str;
sub_modify($hex, $search, sub { $replacement_str = interpolate_match_vars($replace, @_) });

The above has a very inside-out feel, though. It can be turned inside out as follows:

use String::Substitution qw( interpolate_match_vars last_match_vars );
 
my $search  = qr/(80 71 C3 (\S{8}) (77 55 66))/xi;
my $replace = 'A0 B0 C0 $2$3' =~ s/\s+//gr;

my $replacement_str;
if ($hex =~ $search) {
   $replacement_str =
      substr($hex, $-[0], $+[0] - $-[0]) =
         interpolate_match_vars($replace, last_match_vars());
}
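
For comparison, the same "match first, then replace" shape can be sketched with core Perl only. This is a minimal sketch under stated assumptions: expand_template is a hypothetical helper made up for this example (it is not part of String::Substitution), and it only understands the placeholders $1 to $9.

```perl
use strict;
use warnings;
use feature 'say';

# Hypothetical helper: expand $1..$9 in a template from a list of captures.
# $captures[0] holds group 1, $captures[1] holds group 2, and so on.
sub expand_template {
    my ($template, @captures) = @_;
    return $template =~ s{\$([1-9])}{ $captures[$1 - 1] // '' }ger;
}

my $hex     = '9090908071C312345678775566000000777777';
my $search  = qr/(80 71 C3 (\S{8}) (77 55 66))/xi;
my $replace = 'A0 B0 C0 $2$3' =~ s/\s+//gr;

my $replacement_str;
if (my @captures = $hex =~ $search) {
    # Copy the match offsets before any other regex runs.
    my ($start, $len) = ($-[0], $+[0] - $-[0]);

    $replacement_str = expand_template($replace, @captures);

    # 4-argument substr replaces the matched span in place.
    substr($hex, $start, $len, $replacement_str);
}

say $replacement_str;    # A0B0C012345678775566
say $hex;                # 909090A0B0C012345678775566000000777777
```

The replacement text is treated purely as data here, so nothing from $replace is ever run as code.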