Using a backreference as key in a hashmap within a regex-substitution?

Time:06-06

I am learning the Perl language and I stumbled upon the following question: Is it possible to use a backreference as a key in a substitution argument, e.g. something like:

$hm{"Cat"} = "Dog";
while(<>){
s/Cat/$hm{\1}/;
print;
}

That is, I want to tell Perl to look up a key which is contained in a capture argument.

I know that this is a silly example, but I am just curious whether it is possible to use such a key lookup with a backreference in a substitution.

CodePudding user response:

Use $1 instead.

While backrefs like \1 work in the substitution part of s///, they only work there in string context. $hm{KEY} accesses an item in a hash, and the KEY part can be a bareword or an expression. In an expression, \1 would be a “reference to a literal scalar with value 1”, which would stringify as something like SCALAR(0x55776153ecb0), not a backreference as in a string. Instead, we can access the value of captures in the regex with variables like $1.
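
To see what \1 means outside of a string, here is a minimal sketch. The variable name $ref is only for illustration, and the address printed will differ from run to run:

```perl
use strict;
use warnings;

# In an expression, the backslash is the reference operator,
# so \1 is a reference to the constant 1, not a backreference.
my $ref = \1;

print ref($ref), "\n";   # SCALAR
print "$ref\n";          # something like SCALAR(0x55776153ecb0)
print $$ref, "\n";       # 1 (dereferencing gives the constant back)
```

So $hm{\1} looks up a key such as "SCALAR(0x...)", which is almost certainly not in the hash.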

But that requires us to capture a part of the regex. I would write it as:

s/(Cat)/$hm{$1}/;

As a rule of thumb, only use backrefs like \1 within a regex pattern. Everywhere else, use capture variables like $1. If you use warnings, Perl will also tell you "\1 better written as $1", though it wouldn't have detected the particular issue in your case, as the \1 was still valid syntax, albeit with a different meaning.
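
As a rough, self-contained sketch of that rule of thumb: the hash contents and the sample strings below are made up for illustration and are not from the question.

```perl
use strict;
use warnings;

# Hypothetical lookup table; the keys are the strings we expect to capture.
my %hm = (Cat => 'Dog', Mouse => 'Cheese');

my $line = "Cat chases Mouse";

# Capture the key with parentheses, then use $1 as the hash key
# on the replacement side.
(my $out = $line) =~ s/(Cat|Mouse)/$hm{$1}/g;
print "$out\n";   # Dog chases Cheese

# Inside the pattern itself, \1 is the right tool:
# here it matches a repeated word character ("tt" in "letter").
my $doubled = "letter" =~ /(\w)\1/ ? 1 : 0;
print "doubled letter found\n" if $doubled;
```

Note that if the captured text is not a key in the hash, $hm{$1} is undef and Perl warns about an uninitialized value, so the pattern should only capture strings that exist as keys.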

CodePudding user response:

If you are looking at really old code, you'll see people using the \1 form on the replacement side of the substitution. Sometimes you'll see it in really new code; it's a Perl 4 thing that still works, but Perl 5 added a warning. If you have warnings turned on, perl will tell you that (although I don't know when this warning started):

$ perl5.36.0 -wpe 's/cat(dog)/\1/'
\1 better written as $1 at -e line 1.

With diagnostics you get even more information about the warning:

$ perl5.36.0 -Mdiagnostics -wpe 's/cat(dog)/\1/'
\1 better written as $1 at -e line 1 (#1)
    (W syntax) Outside of patterns, backreferences live on as variables.
    The use of backslashes is grandfathered on the right-hand side of a
    substitution, but stylistically it's better to use the variable form
    because other Perl programmers will expect it, and it works better if
    there are more than 9 backreferences.
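
A small sketch of why the variable form is more flexible, as the diagnostic text suggests. The sample strings and variable names are invented for illustration:

```perl
use strict;
use warnings;

# Braces let a capture variable be followed directly by digits.
my $s = "price: 42";
(my $t = $s) =~ s/(\d+)/${1}000/;
print "$t\n";   # price: 42000

# With ten capture groups, $10 refers to the tenth group.
my $u = "abcdefghij";
$u =~ s/^(.)(.)(.)(.)(.)(.)(.)(.)(.)(.)$/$10$1/;
print "$u\n";   # ja
```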

There are many other warnings that Perl uses to show you better ways to do things.

A level above that is perlcritic, which is an opinionated set of policies about what some people find to be good style. It's not a terrible place to start before you develop your own ideas about what works for you or your team.
