perl secure interpolation of substitution string containing match variables

Time:11-23

The regex contains a capture group, but the substitution string is not interpolated, so the match variable $1 is not expanded in the following code:

use strict;
use warnings;

my $regex = '([^ ]+)e s';
my $subst = '$1 ';

my $text = 'fine sand';

print $text =~ s/$regex/$subst/r;
print "\n";

The result is

$1 and

The solution to "Perl regular expression variables and matched pattern substitution" suggests using the e modifier and eval in the substitution, and indeed

print $text =~ s/$regex/eval $subst/er;

would give the desired

finand

However, in my situation, the pattern and substitution strings are read from third-party user input, so they cannot be considered safe for eval. Is there a way to interpolate the substitution string that is more secure than executing it as code? All I need is to expand every match variable contained in the substitution string.

The best I can currently think of involves an idiom like

$text =~ /$regex/;
sprintf $subst, $1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, ...

This would require a slight change in syntax for the substitution string, which I consider acceptable. However, the set of imaginable match variables is infinite, and in particular named match variables would not be supported.
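
The sprintf idea can be written without enumerating $1, $2, and so on. The following is only a minimal sketch, assuming Perl 5.26 or later, where @{^CAPTURE} holds all numbered groups of the last match. The helper name subst_sprintf and the %1$s-style replacement syntax are assumptions made for illustration. Named captures are still not covered, and an untrusted format string is not harmless either, since it can request very large field widths.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): treats $format as a
# sprintf template whose arguments are the numbered capture groups.
sub subst_sprintf {
    my ($text, $regex, $format) = @_;
    return $text =~ s{$regex}{
        # All numbered groups of the current match; requires Perl 5.26+.
        # Groups that did not participate become empty strings.
        my @captured = map { $_ // '' } @{^CAPTURE};
        sprintf $format, @captured;
    }er;
}

print subst_sprintf('fine sand', '([^ ]+)e s', '%1$s'), "\n";   # prints "finand"
```

Explicit argument indexes such as %2$s let the replacement refer to any group in any order.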

CodePudding user response:

Here's a solution:

  • use @{^CAPTURE} to pick up all the captured groups
  • replace each $N (N a positive integer) in $subst with the corresponding captured group
  • do the substitution using the interpolated $subst.

use strict;
use warnings;

my $regex = '([^ ]+)e s';
my $subst = '$1 ';

my $text = 'fine sand';

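print $text =~ s{$regex}{
    # Copy the captured groups before the inner substitution overwrites them.
    my @captured = @{^CAPTURE};
    # Replace each $N in $subst with the text of group N.
    $subst =~ s/\$([1-9]\d*)/$captured[$1-1]/rg
}er . "\n";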

The replacement expression is safe because it only matches digits and uses them to index into @captured. You'll need to add bounds checking. @ikegami correctly notes that it doesn't handle an escaped $ either, which isn't hard to address.
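
One way the bounds checking and escape handling might look is sketched below. This is not a definitive implementation: the helper names expand_captures and safe_subst, and the template syntax (a backslash escapes the next character, $N or ${N} refers to group N), are assumptions made for illustration. It requires Perl 5.26 or later for @{^CAPTURE}.

```perl
use strict;
use warnings;

# Tokens recognised in the replacement template (an assumed syntax):
#   \X          -> literal X, so \$ is a dollar sign and \\ a backslash
#   $N or ${N}  -> text of capture group N (N >= 1)
my $TOKEN = qr/\\(.)|\$(?:\{([1-9][0-9]*)\}|([1-9][0-9]*))/s;

# Hypothetical helper: expand the template against a list of captured groups.
sub expand_captures {
    my ($template, @captured) = @_;
    return $template =~ s{$TOKEN}{
        my $n = $2 // $3;
        defined $1        ? $1                          # escaped character
      : $n <= @captured   ? $captured[$n - 1] // ''     # group exists
      :                     '';                         # out of range
    }egr;
}

# Hypothetical wrapper: substitute without evaluating $subst as code.
sub safe_subst {
    my ($text, $regex, $subst) = @_;
    return $text =~ s{$regex}{
        my @captured = @{^CAPTURE};   # copy before another match runs
        expand_captures($subst, @captured);
    }er;
}

print safe_subst('fine sand', '([^ ]+)e s', '$1 '), "\n";    # prints "fin and"
print safe_subst('fine sand', '([^ ]+)e s', '\$1 '), "\n";   # prints "$1 and"
```

References to groups that do not exist, or that did not participate in the match, expand to the empty string here. Rejecting such templates with an error would be an equally reasonable choice.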

Note also that using an untrusted $regex creates a risk of denial-of-service attacks, for example through a pattern that triggers catastrophic backtracking.

CodePudding user response:

use String::Substitution qw( sub_copy );

print sub_copy( $text, $regex, $subst );

Note that while this is safe from accidental/malicious interpolation and accidental/malicious code execution, it's not safe in every sense of the word. Specifically, it's quite easy to craft a regex and string combination that will take longer than the lifespan of the universe to match.
