Home > OS >  Searching Perl array with regex and return single capturing group only
Searching Perl array with regex and return single capturing group only

Time:10-06

I have a Perl script in which I perform web service calls in a loop. The server returns a multivalued HTTP header that I need to parse after each call with information that I will need to make the next call (if it doesn't return the header, I want to exit the loop).

I only care about one of the values in the header, and I need to get the information out of it with a regular expression. Let's say the header is like this, and I only care about the "foo" value:

X-Header: test-abc12345; blah=foo
X-Header: test-fgasjhgakg; blah=bar

I can get the header values like this: @values = $response->header( 'X-Header' );. But how do I quickly check if

  1. There is a foo value, and
  2. Parse and save the foo value for the next iteration?

Ideally, I'd like to do something like this:

my $value = 'default';

do {
  # (do HTTP request; use $value)
  @values = $response->header( 'X-Header' );
} while( $value = first { /(?:test-)([^;] )(?:; blah=foo)/ } @values );

But grep, first (from List::Util), etc. all return the entire match and not just the single capturing group I want. I want to avoid cluttering up my code by looping over the array and matching/parsing inside the loop body.

Is what I want possible? What would be the most compact way to write it? So far, all I can come up with is using lookarounds and \K to discard the stuff I don't care about, but this isn't super readable and makes the regex engine perform a lot of unnecessary steps.

CodePudding user response:

Following code snippet is looking for foo stored in a variable $find, the found values is stored in variable $found.


my $find = 'foo';
my $found;

while( $response->header( 'X-Header' ) ) {
    if( /X-Header: .*?blah=($find)/ ) {
        $found = $1;
        last;
    }
}

say $found if $found;

Sample demo code

use strict;
use warnings;
use feature 'say';

use Data::Dumper;

my $find = 'foo';
my $found;
my @header = <DATA>;

chomp(@header);

for ( @header ) {
    $found = $1 if /X-Header: .*?blah=($find)/;
    last if $found;
}

say Dumper(\@header);
say "Found: $found" if $found;

__DATA__
X-Header: test-abc12345; blah=foo
X-Header: test-fgasjhgakg; blah=bar

Output

$VAR1 = [
          'X-Header: test-abc12345; blah=foo',
          'X-Header: test-fgasjhgakg; blah=bar'
        ];

Found: foo

CodePudding user response:

So it seems that you want to catch the first element with a certain pattern, but acquire only the pattern. And you want it done nicely. Indeed, first and grep only pass the element itself.

However, List::MoreUtils::first_result does support processing of its match

use List::MoreUtils qw(first_result);

my @w = qw(a bit c dIT);  # get first "it" case-insensitive

my $res = first_result { ( /(it)/i )[0] } @w;

say $res // 'undef';  #--> it

That ( ... )[0] is needed to put the regex in the list context so that it returns the actual capture. Another way would be firstres { my ($r) = /(it)/i; $r }. Pick your choice


With your data

use warnings;
use strict;
use feature 'say';

use List::MoreUtils qw(firstres);

my @data = ( 
    'X-Header: test-abc12345; blah=foo',
    'X-Header: test-fgasjhgakg; blah=bar'
);

if (my $r = firstres { ( /(blah=foo)/ )[0] } @data) {
    say $r
}

Prints blah=foo. Clarify what exactly you'd like to capture out of what for a nicer example :)

  • Related