Home > Back-end >  Why is there a "erase–remove idiom" for std::remove, std::remove_if with containers?
Why is there a "erase–remove idiom" for std::remove, std::remove_if with containers?

Time:12-21

I was just looking into why the function std::remove_if wasn't working the way I expected, and learned about the C "erase-remove idiom" where I'm supposed to pass the result of remove or remove_if to erase to actually remove the items I want from a container.

This strikes me as quite unintuitive: this means remove and remove_if don't do what they say on the tin. It also makes for more verbose, less clear code.

Is there a justification for this? I figure there's some kind of trade-off with an upside balancing out the downsides I listed in the previous paragraph.

My first thought would be that there's some use-case for using remove or remove_if on their own, but since they leave the remaining items in a collection in an undefined state, I can't think of any possible use case for that.

CodePudding user response:

This is a necessary function of the way the container/iterator/algorithm paradigm works. The basic concept of the model is as follows:

  • Containers contain and manage sequences of values.
  • Algorithms act on sequences of values.
  • Iterators represent moveable positions within a sequence of values.

Therefore, algorithms act on iterators, which represent locations within some sequence of values, usually provided by a container.

The problem is that removal of an item from a container doesn't fit that paradigm. The removal of an element from a container is not an act on a "sequence of values"; it fundamentally changes the nature of the sequence itself.

That is, "removal" ultimately finishes with a container operation, not an iterator operation. If algorithms only act on iterators, no pure algorithm can truly remove elements. Iterators don't know how to do that. Algorithms that only act on iterators can move values around within a sequence, but they cannot change the nature of the sequence such that the "removed" values no longer exist.

But while the removal of elements is a container operation... it's not a value agnostic operation. remove removes only values that compare equal to the given value. remove_if removes only values for which the predicate returns true. These are not container operations; they are algorithms that don't really care about the nature of the container.

Except for when it comes time to actually remove them from the container. From the perspective of the above paradigm, it is inherently two separate operations: an algorithm followed by a container operation.

That all being said, C 20 does give a number of containers non-member std::erase and std::erase_if specializations. These do the full job of erase-remove as a non-member function.

My first thought would be that there's some use-case for using remove or remove_if on their own, but since they leave the remaining items in a collection in an undefined state, I can't think of any possible use case for that.

There are uses for it. Multiple removal being the obvious one. You can perform a series of remove actions, so long as you pass the new end iterator to each subsequent removal (so that no operation examines removed elements). You can do a proper container erase at the end.

It should also be noted that the C 20 std::erase and std::erase_if functions only take containers, not sub-sections of containers. That is, they don't allow you to erase from some range within a container. Only the erase/remove idiom allows for that.

Also, not all containers can erase elements. std::array has a fixed size; truly erasing elements isn't allowed. But you can still std::remove with them, so long as you keep track of the new end iterator.

CodePudding user response:

Many algorithms from the standard library operate on general iterators, which cannot be used to remove elements. erase is a method of the container and has access to more information, so it can be used to directly delete elements.

  • Related