The regex contains a capture group, but the substitution string is not interpolated, so the match variable $1 is not expanded in
use strict;
use warnings;
my $regex = '([^ ]+)e s';
my $subst = '$1 ';
my $text = 'fine sand';
print $text =~ s/$regex/$subst/r;
print "\n";
The result is
$1 and
The answer to "Perl regular expression variables and matched pattern substitution" suggests using the e modifier and eval in the substitution, and indeed
print $text =~ s/$regex/eval $subst/er;
would give the desired
finand
However, in my situation, the pattern and substitution strings are read from third-party user input, so they cannot be considered safe for eval. Is there a way to interpolate the substitution string that is more secure than executing it as code? All I want is to expand all match variables contained in the substitution string.
The best I can currently think of involves an idiom like
$text =~ /$regex/;
sprintf $subst, $1, $2, $3, $4, $5, $6, $7, $8, $9, $10, $11, $12, $13, ...
This would require a slight change in syntax for the substitution string, which I consider acceptable. However, the set of possible match variables is unbounded, and named match variables in particular would not be supported.
CodePudding user response:
Here's a solution:
- use capture groups to pick up all the groups
- replace $\d in $subst with the entries of the capture groups
- now do the substitution using the interpolated $subst.
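use strict;
use warnings;
my $regex = '([^ ]+)e s';
my $subst = '$1 ';
my $text = 'fine sand';
# @{^CAPTURE} requires Perl 5.26 or later
print $text =~ s{$regex}{
    # copy the captures before the inner substitution overwrites them
    my @captured = @{^CAPTURE};
    # expand $1, $2, ... in $subst from the copied captures
    $subst =~ s/\$([1-9]\d*)/$captured[$1-1]/rg
}er . "\n";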
The inner substitution is safe, since it only matches digits and uses them to index into @captured. You'll need to add bounds checking. @ikegami correctly notes that it doesn't handle an escaped $ either, which isn't hard to address.
Note also that using an untrusted $regex creates a risk of denial-of-service attacks.
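The bounds checking and escape handling mentioned above could be added roughly as follows. This is only a sketch: expand_template and safe_subst are hypothetical helper names, and the conventions chosen here (a backslash makes the next character literal, ${N} is accepted alongside $N, and a missing or unmatched group expands to the empty string) are assumptions rather than anything the question specifies. As before, @{^CAPTURE} needs Perl 5.26 or later.

```perl
use strict;
use warnings;

# Placeholder syntax assumed for this sketch. Note: no sigils in the
# comments below, because variables are interpolated even inside
# regex comments.
my $PLACEHOLDER = qr/
    \\ (.)                     # group 1: backslash-escaped character
  | \$ \{ ([1-9][0-9]*) \}     # group 2: braced group number
  | \$    ([1-9][0-9]*)        # group 3: bare group number
/xs;

# Hypothetical helper: expand placeholders in a template from a list
# of captured strings, without evaluating the template as code.
sub expand_template {
    my ($template, @captured) = @_;
    return $template =~ s{$PLACEHOLDER}{
        my ($escaped, $n) = ($1, $2 // $3);
        # bounds check: unknown or unmatched groups become ''
        defined $escaped ? $escaped
            : ($n <= @captured ? $captured[$n - 1] // '' : '')
    }ger;
}

# Hypothetical wrapper: one substitution with an untrusted template.
sub safe_subst {
    my ($text, $regex, $subst) = @_;
    return $text =~ s{$regex}{
        # copy the captures before expand_template runs its own regex
        my @captured = @{^CAPTURE};
        expand_template($subst, @captured)
    }er;
}

print safe_subst('fine sand', '([^ ]+)e s', '$1 '), "\n";   # fin and
print safe_subst('fine sand', '([^ ]+)e s', '\$1 '), "\n";  # $1 and
print safe_subst('fine sand', '([^ ]+)e s', '$2'), "\n";    # and
```

Named captures could be handled along the same lines by adding another alternative to the placeholder pattern and looking the name up in a copy of %+. None of this addresses the cost of matching an untrusted $regex.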
CodePudding user response:
use String::Substitution qw( sub_copy );
print sub_copy( $text, $regex, $subst );
Note that while this is safe from accidental/malicious interpolation and accidental/malicious code execution, it's not safe in every sense of the word. Specifically, it's quite easy to craft a regex and string combination that will take longer than the lifespan of the universe to match.