Is there a fast and idiomatic way to slice a sorted Perl list/array by value?

I have a piece of code that extracts a slice from one of two large arrays of sorted integers, representing stopping points in the program's workflow. I'll include one of the two here.

The basic idea is that I'm trying to slice a work range out of this large array as a starting point for this program to work on. $min is pulled from the task object, representing the current progress of the task. $limit is an optional user override, defaulting to -1 (which is ignored).

Currently, I'm using the firstidx function from the List::MoreUtils CPAN module to retrieve the indices for the start and finish, and then I'm using them to slice the @steps array in the usual way. Is there a faster and/or more idiomatic way of doing this? Particularly, is there a good way to do it by directly using the $min and $limit values (with another codepath for the $limit == -1 case)?

Here's the code:

my @steps = (
    0, 1, 5, 10,
    20, 30, 40, 50, 60, 70, 80, 90, 100,
    200, 300, 400, 500, 600, 700, 800, 900, 1000,
    1200, 1400, 1600, 1800, 2000,
    2500, 3000, 3500, 4000, 4500, 5000,
    6000, 7000, 8000, 9000, 10000,
    11000, 12000, 13000, 14000, 15000, 17500, 20000,
    25000, 30000, 35000, 40000, 45000, 50000,
    60000, 70000, 80000, 90000, 100000
);
my $min_index = firstidx { $_ > $min } @steps;
my $max_index;
if ($limit == -1) {
    $max_index = @steps - 1;
} else {
    $max_index = firstidx { $_ >= $limit } @steps;
}
my @steps_todo = @steps[ $min_index .. $max_index ];

CodePudding user response：

An idiomatic way would be to use grep to select the range. The disadvantage is that grep scans the entire array, which can become a performance concern if the array is large and the grep is executed often.
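A minimal sketch of the grep approach, not taken from the question. The shortened @steps array and the values of $min and $limit are made up for illustration. It assumes $limit is an inclusive upper bound, which matches the question's firstidx version only when $limit is itself one of the steps.

```perl
use strict;
use warnings;

# Shortened, hypothetical version of the question's @steps array.
my @steps = (0, 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100);
my ($min, $limit) = (5, 40);    # sample values, not from the question

# Keep every step above $min and, unless $limit is -1 (no limit),
# at or below $limit. grep visits every element of @steps.
my @steps_todo = grep {
    $_ > $min && ($limit == -1 || $_ <= $limit)
} @steps;

print "@steps_todo\n";
```

With these sample values the selected steps are 10, 20, 30 and 40.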

Since the list is sorted, a faster option (though certainly no shorter or simpler) is to use the binary search functions from List::MoreUtils to find the bounds of the range.
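As a rough illustration of the binary search idea, here is a hand-rolled sketch rather than a call into List::MoreUtils, so that it does not depend on any particular version of that module. The helper name first_index_where, the shortened @steps array, and the sample $min and $limit are all hypothetical. It keeps the question's conventions: start after $min, stop at the first step at or above $limit, and treat -1 as no limit.

```perl
use strict;
use warnings;

# Hypothetical helper: index of the first element for which $pred is true,
# or scalar(@$aref) if there is none. Assumes $pred is false for a prefix
# of the array and true for the rest, as threshold tests on sorted data are.
sub first_index_where {
    my ($aref, $pred) = @_;
    my ($lo, $hi) = (0, scalar @$aref);
    while ($lo < $hi) {
        my $mid = int(($lo + $hi) / 2);
        if ($pred->($aref->[$mid])) {
            $hi = $mid;         # answer is at $mid or to its left
        }
        else {
            $lo = $mid + 1;     # answer is strictly to the right
        }
    }
    return $lo;
}

# Shortened, hypothetical version of the question's @steps array.
my @steps = (0, 1, 5, 10, 20, 30, 40, 50, 60, 70, 80, 90, 100);
my ($min, $limit) = (5, 45);    # sample values, not from the question

my $min_index = first_index_where(\@steps, sub { $_[0] > $min });
my $max_index = $limit == -1
    ? $#steps
    : first_index_where(\@steps, sub { $_[0] >= $limit });
$max_index = $#steps if $max_index > $#steps;    # $limit beyond the last step

my @steps_todo = @steps[ $min_index .. $max_index ];
print "@steps_todo\n";
```

Each lookup takes a logarithmic number of comparisons instead of a linear scan. If no step exceeds $min, the range is empty and so is the slice.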

CodePudding user response：

You can do this with a simple foreach loop. The flow control keywords next and last control when you start and stop consuming the @steps array.

I changed the no-limit case from a -1 sentinel to a defined check (an undefined $max means no limit), since that's more efficient.

This is not the most efficient possible algorithm, since it still walks the array from the start until it passes $max, but it's a simple, idiomatic way of solving it.

use strict;
use warnings;

# Sample bounds; leave $max undefined for "no limit".
my ($min, $max) = (1000, 9999);
my @out;

my @steps = (
    0, 1, 5, 10,
    20, 30, 40, 50, 60, 70, 80, 90, 100,
    200, 300, 400, 500, 600, 700, 800, 900, 1000,
    1200, 1400, 1600, 1800, 2000,
    2500, 3000, 3500, 4000, 4500, 5000,
    6000, 7000, 8000, 9000, 10000,
    11000, 12000, 13000, 14000, 15000, 17500, 20000,
    25000, 30000, 35000, 40000, 45000, 50000,
    60000, 70000, 80000, 90000, 100000
);



foreach (@steps) {
    # Skip values below the range.
    next if $_ < $min;
    # The input is sorted, so stop at the first value past $max.
    last if defined $max and $_ > $max;
    push @out, $_;
}


print "@out\n";


## output
## 1000 1200 1400 1600 1800 2000 2500 3000 3500 4000 4500 5000 6000 7000 8000 9000