What's the best way to `take_last(n)` a non-sized range?-CodePudding

Godbolt:

#include <range/v3/all.hpp>

namespace views3 = ranges::views;

int main()
{
    std::string_view sv = "Hello world! Goodbye world!";
    auto notBang = [](char ch) { return ch != '!'; };
    auto input = std::views::take_while(sv, notBang);
    static_assert(!std::ranges::sized_range<decltype(input)>);
    static_assert(!std::ranges::common_range<decltype(input)>);
    static_assert(std::ranges::contiguous_range<decltype(input)>);

    auto v1 = input | views3::take_last(3);  // ERROR
    std::cout << (v1 | ranges::to<std::string>) << "\n";  // rld

    auto v2 = input | views3::drop_last(3);  // OK
    std::cout << (v2 | ranges::to<std::string>) << "\n";  // Hello wo
}

Here input is a contiguous, non-common, non-sized range. range-v3's take_last(n) doesn't work on such ranges, because its strategy is simply to compute ranges::begin(input) (ranges::size(input) - n) on construction, and non-sized ranges have no ranges::size.

But drop_last(n) does work on non-sized ranges, because its strategy is to keep two iterators n elements apart and advance them in tandem. The range between drop_last's two iterators [EDIT: after it's been iterated] is exactly take_last(n) — but we can't get at it.

What's the simplest (and not-crazy-inefficient) way to get the last n elements of a non-sized range, using C 20, C 23, and/or Range-v3 facilities?

(Looking at the difference in complexity between Range-v3's 66-line take_last and 373-line drop_last, I suspect this might just be a temporary deficiency in Range-v3's take_last. Neither of them has been touched in 3 years, though...)

CodePudding user response：

You can use views::counted to convert the origin range into a sized_range (But in the worst case you need to pay O(n) time complexity to calculate the actual size of the input)

auto v1 = views3::counted(input.begin(), std::ranges::distance(input))
        | views3::take_last(3);

Demo

CodePudding user response：

The range between drop_last's two iterators is exactly take_last(n)

No, it's not. drop_last's two iterators start with begin() and begin() n. take_last's range is from end() - n to end(). These are not the same range.

drop_last works because it doesn't initially have to figure out where end() - n is. It discovers this naturally as you iterate through the range. take_last must start with end() - n. Therefore, it must compute it. And to compute it, it also must determine if end() - begin() is less than n, so that it can cap the starting iterator at begin().

And it can only do that computation in constant time if the range has a size compute-able in constant time. IE: it is sized.

There's no getting around this. If you want take_last to be constant-time, it must be given a range whose size can be computed in constant-time. If you're willing to suffer an O(n) operation (where "n" is the size of the range), you can work around this. But I would not consider that an "efficient" solution.

CodePudding user response：

Looking at the difference in complexity between Range-v3's 66-line take_last and 373-line drop_last, I suspect this might just be a temporary deficiency in Range-v3's take_last. Neither of them has been touched in 3 years, though...

This is because drop_last supports forward-but-not-sized ranges and take_last only supports sized.

You could implement a take_last to work on forward-but-not-sized as well, following the same logic that that the drop_last implementation uses: have another iterator to "probe" where the end is.

Which would basically be something like this:

template <forward_range V> requires view<V>
struct take_last_view {
    V base;
    int n;

    // only necessary if not (sized   random access)
    optional<iterator_t<V>> cached_begin;
    

    auto begin() -> iterator_t<V> {
        if constexpr (random_access_range<V> and sized_range<V>) {
            auto s = ranges::size(base);
            if (s <= n) {
                // if I want to take the last 5 out of 3 elements, that's all 3
                return ranges::begin(base);
            } else {
                return ranges::begin(base)   (s - n);
            }
        } else {
            if (not cached_begin) {
                auto it = ranges::begin(base);
                auto probe = ranges::next(it, n, ranges::end(base));
                while (probe != ranges::end(base)) {
                      it;
                      probe;
                }
                cached_begin.emplace(it);
            }
            return *cached_begin;
        }
    }

    auto end() -> sentinel_t<V> {
        return ranges::end(base);
    }
};

Ideally cached_begin is actually a conditional member, since you only need it in some cases. But otherwise, this is a viable range adaptor.

You can see an example implementation of this here, which looks like it does the right thing. It's not a complete implementation (things missing include: this could support an input sized range, this should be sized if the underlying range is sized, const-iterable if the underlying range is const-iterable, etc.).