I want to escape a string. That is, I want to copy characters, but where the input has "
or \
, I want to prepend \
on the output. I can write that easily enough, even very generically:
//! Given a range of code units (e.g., bytes in utf-8), transform them to an output range.
//! If a code unit needs escaping (per the given predicate), insert esc before that code unit in the output.
template <typename InRange, typename OutIter, typename NeedsEscapingPred, typename EscCodeUnit>
OutIter transformEscapeCodeUnits(const InRange& in, OutIter out,
NeedsEscapingPred needsEscaping,
EscChar esc) {
for (const auto& codeUnit : in) {
if (needsEscaping(codeUnit)) {
*out = esc;
}
*out = codeUnit;
}
return out;
}
//! Convenience overload for common case:
template <typename InRange, typename OutIter>
OutIter transformEscapeCodeUnits(const InRange& in, OutIter out, char esc = '\') {
return transformEscapeCodeUnits(in, out, [](auto c) { return c == '\' || c == '"'; }, esc);
}
However, in the spirit of "no raw loops", I looked at the algorithm and numeric header in search of a generic algorithm to do this. There's replace_if
and replace_copy_if
and remove_if
, but I'm not seeing any std algorithms that take a sequence and output a potentially-longer sequence. This would be basically insert_copy_if
, or even more generically, something like transform_items
:
//! Like transform, but TransformItem takes an element and an iterator and writes zero or more output elements:
template <typename InRange, typename OutIter, typename TransformItem>
OutIter transform_items(InRange&& inRange, OutIter out, TxFn transformItem) {
for (auto&& x : std::forward<InRange>(inRange)) {
out = transformItem(std::forwrad<decltype(x)>(x), out);
}
return out;
}
Then the escaping case would call transform_items(in, out, [shouldEsc, esc](auto c, auto out) { if (shouldEsc(c)) { *out = esc; } *out = c; })
.
Am I missing something, or is there nothing quite like that in the standard library?
CodePudding user response:
After asking on Twitter, I got some very satisfying answers, although it involves the M-word (monad).
First, @atorstling
suggests that this is flatMap
in JavaScript: https://twitter.com/atorstling/status/1574097704098988033?s=20&t=jzS503R6fMOqCajReygzJg https://dmitripavlutin.com/javascript-array-flatmap/
So translating to C , I think the answer to my question is that this should be called flat_transform
. With pedantic use of forwarding, that's:
//! Like transform, but TransformItem takes an element and an iterator and writes zero or more output elements:
template <typename InRange, typename OutIter, typename TransformOne>
OutIter flat_transform(InRange&& inRange,
OutIter out,
TransformOne transform_one) {
for (auto&& x : std::forward<InRange>(inRange)) {
out = transform_one(std::forwrad<decltype(x)>(x), std::move(out));
}
return out;
}
and yes, this appears to be a "missing algorithm" from the STL.
But additionally, @salociN001 pointed out that "It's basically the monadic bind operation: Transforming a Container of A with a function taking A and returning a container of A into a flat container of A." https://twitter.com/salociN001/status/1573918982557548545?s=20&t=jzS503R6fMOqCajReygzJg
So there we have it:
- It's monadic bind.
- A reasonable C name is
flat_transform
. - It is, in fact, a missing algorithm.