Regular expression chaining/mixing in perl

Time:07-09

Consider the following:

my $p1 = "(a|e|o)";
my $p2 = "(x|y|z)";
$text =~ s/($p1)( )[$p2]([0-9])/some action to be done/g;

Is the regular expression pattern equal to the concatenation of its elements in string form? That is, can the above be written as

$text =~ s/((a|e|o))( )[(x|y|z)]([0-9])/some action to be done/g;

As you can see, I am new to Perl and am trying to make sense of regular expressions of the above forms. Thanks for your help.

CodePudding user response:

Yes: a pattern is interpolated like a double-quoted string, so variables in it are expanded before the regex is compiled, and the two expressions you show are equivalent. See the discussion in the perlretut tutorial. (I suggest building sub-patterns with the qr operator instead; perlretut covers that as well.)
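A minimal sketch of both points, assuming a few made-up sample strings (the inputs and the names $vowel, $tail and $combined are illustrative, not from the question). It checks that the interpolated pattern and the hand-expanded one agree, and then builds a similar pattern from qr pieces, using the [xyz] class discussed further down rather than the question's [(x|y|z)]:

```perl
use strict;
use warnings;

# Sub-patterns from the question, stored as plain strings.
my $p1 = '(a|e|o)';
my $p2 = '(x|y|z)';

# Hypothetical inputs, chosen only for illustration.
my @samples = ('e z7', 'o x3', 'u z7');

my @agree;
for my $text (@samples) {
    # Pattern with interpolated variables versus the same pattern typed out.
    my $interpolated = $text =~ /($p1)( )[$p2]([0-9])/         ? 1 : 0;
    my $spelled_out  = $text =~ /((a|e|o))( )[(x|y|z)]([0-9])/ ? 1 : 0;
    push @agree, ($interpolated == $spelled_out ? 1 : 0);
    printf "%-5s interpolated=%d spelled_out=%d\n", $text, $interpolated, $spelled_out;
}

# The same idea with precompiled pieces: qr// objects interpolate too.
my $vowel    = qr/(a|e|o)/;
my $tail     = qr/([xyz])/;
my $combined = qr/$vowel(\s)$tail([0-9])/;

my @captures = 'e z7' =~ $combined;
print join('|', @captures), "\n";
```

With qr each piece is compiled and syntax-checked on its own, and any flags given to it stay attached to that piece when it is embedded in a larger pattern.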

But that pattern clearly isn't right:

  • Why the double parentheses, ((a|e|o))? Either keep the alternation in the variable and capture it in the regex

    my $p1 = 'a|e|o';  # then use in a regex as
    /($p1)/            # gets interpolated into: /(a|e|o)/
    

    or put the capturing parentheses in the variable and drop them in the regex

    my $p1 = '(a|e|o)';  # use as
    /$p1/                # (same as above)
    
  • A pattern of [(x|y|z)] matches any one of the characters (, x, |, y, z, or ). The [...] is a character class, which matches any single character listed inside it (a few characters have a special meaning there). So, again, either put the alternation and the capture in your variable

    my $p2 = '(x|y|z)';  # then use as
    /$p2/
    

    or do it using the character class

    my $p2 = 'xyz';  # and use as
    /([$p2])/        # --> /([xyz])/
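The difference between the class from the question and the two corrected forms can be checked directly. This is a small sketch under assumed test characters (the names $buggy, $class, $altern and %matched are hypothetical), with each pattern anchored so it must match the whole one-character string:

```perl
use strict;
use warnings;

# Three ways of writing the second sub-pattern.
my $buggy  = qr/^[(x|y|z)]$/;   # class listing ( x | y z ) as literal characters
my $class  = qr/^[xyz]$/;       # class listing only x, y, z
my $altern = qr/^(?:x|y|z)$/;   # alternation (non-capturing here)

# Record, per test character, which of the three patterns match it.
my %matched;
for my $char ('x', 'z', '|', '(', 'q') {
    $matched{$char} = [ map { $char =~ $_ ? 1 : 0 } $buggy, $class, $altern ];
    printf "%-3s buggy=%d class=%d alternation=%d\n", $char, @{ $matched{$char} };
}
```

Only the first pattern accepts | and (, which is why the brackets around an alternation are almost never what was intended.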
    

So altogether you'd have something like

use warnings;
use strict;
use feature 'say';

my $text = shift // q(e z7);

my $p1 = 'a|e|o';
my $p2 = 'xyz';

$text =~ s/($p1)(\s)([$p2])([0-9])/replacement/g;

say $_ // 'undef'  for $1, $2, $3, $4;

I added \s instead of a literal single space, and I capture the character-class match with () (the pattern from the question doesn't), since that seems to be wanted.

CodePudding user response:

Neither snippet is valid Perl code. They are therefore equivalent, but only in the sense that neither will compile.


But say you have a valid m//, s/// or qr// operator. Then yes, variables in the pattern would be handled as you describe.

For example,

my $p1 = "(a|e|o)";
my $p2 = "(x|y|z)";
$text =~ /($p1)( )[$p2]([0-9])/g;

is equivalent to

$text =~ /((a|e|o))( )[(x|y|z)]([0-9])/g;

As mentioned in an answer to a previous question of yours, (x|y|z) is surely a bug, and should be xyz.
