I'm writing a custom multilayer perceptron (MLP) implementation in C++. All but the last layer share one activation function, foo, with the final layer having a separate activation, bar. I'm trying to write my code so that it can handle models of this type with a varying number of layers, like at this Godbolt link, reproduced below. Unfortunately, as written, I've had to hardcode the parameter pack for the activation functions, so the code in the link only compiles for N = 5.
Is there a way to create a custom parameter pack from the two activation functions which is able to "left-extend" the first argument, so that I can compile the code above for any N (after suitably updating the call to computeIndexedLayers in computeMlp)? Specifically, I'm thinking of some utility which can yield parameter packs like:
template <size_t N, typename ActivationMid, typename ActivationLast>
struct makeActivationSequence {}; // What goes here?
makeActivationSequence<0>(foo, bar) -> []
makeActivationSequence<1>(foo, bar) -> [bar]
makeActivationSequence<2>(foo, bar) -> [foo, bar]
makeActivationSequence<3>(foo, bar) -> [foo, foo, bar]
makeActivationSequence<4>(foo, bar) -> [foo, foo, foo, bar]
...
Looking at the details of std::index_sequence I believe something similar might work here, but it's unclear to me how I'd modify that approach to work with the two different types.
Please note also that I'm specifically limited to C++14 here due to some toolchain issues, so solutions that take advantage of e.g. if constexpr (as in the linked std::index_sequence details) won't work.
Code from the above Godbolt link, reproduced below for completeness:
#include <cstddef>
#include <utility>
#include <cstdio>
template <size_t LayerIndex, typename Activation>
void computeIndexedLayer(
const Activation& activation) {
printf("Doing work for layer %zu, activated output %zu\n", LayerIndex, activation(LayerIndex));
}
template <
std::size_t... index,
typename... Activation>
void computeIndexedLayers(
std::index_sequence<index...>, // has to come before Activation..., otherwise it'll get eaten
Activation&&... activation) {
(void)std::initializer_list<int>{
(computeIndexedLayer<index + 1>(
std::forward<Activation>(activation)),
0)...};
}
template <size_t N, typename ActivationMid, typename ActivationLast>
void computeMlp(ActivationMid&& mid, ActivationLast&& last) {
computeIndexedLayers(std::make_index_sequence<N>(),
std::forward<ActivationMid>(mid),
std::forward<ActivationMid>(mid),
std::forward<ActivationMid>(mid),
std::forward<ActivationMid>(mid),
std::forward<ActivationLast>(last)
);
}
int main() {
computeMlp<5>([](const auto& x){ return x + 1;}, [](const auto& x){ return x * 1000;});
// Doesn't compile with any other choice of N due to mismatched pack lengths
// computeMlp<4>([](const auto& x){ return x + 1;}, [](const auto& x){ return x * 1000;});
}
Answer:
You can't return a parameter pack from a function, so makeActivationSequence as you described it is impossible. However, you can pass mid and last directly to computeIndexedLayers and use pack expansion there, pairing them with a midIndex template parameter pack and a lastIndex template parameter respectively, both deduced from two corresponding std::index_sequence arguments. (Here there is exactly one lastIndex, so it is not a parameter pack, but that is easy to generalise if needed.) Like this:
#include <cstddef>
#include <utility>
#include <cstdio>
template <size_t LayerIndex, typename Activation>
void computeIndexedLayer(Activation&& activation) {
printf("Doing work for layer %zu, activated output %zu\n", LayerIndex, activation(LayerIndex));
}
template <std::size_t... midIndex, std::size_t lastIndex,
typename ActivationMid, typename ActivationLast>
void computeIndexedLayers(
std::index_sequence<midIndex...> midIdxs,
std::index_sequence<lastIndex> lastIdxs,
ActivationMid&& mid, ActivationLast&& last) {
(void)std::initializer_list<int>{
(computeIndexedLayer<midIndex + 1>(mid), 0)...,
(computeIndexedLayer<lastIndex>(std::forward<ActivationLast>(last)), 0)};
}
template <size_t N, typename ActivationMid, typename ActivationLast>
void computeMlp(ActivationMid&& mid, ActivationLast&& last) {
computeIndexedLayers(std::make_index_sequence<N - 1>(), std::index_sequence<N>{},
std::forward<ActivationMid>(mid), std::forward<ActivationLast>(last));
}
int main() {
computeMlp<6>([](const auto& x){ return x + 1;}, [](const auto& x){ return x * 1000;});
}
Also note that computeMlp forwards both mid and last, but computeIndexedLayers forwards only last. This avoids potentially moving from mid more than once, which could cause trouble if ActivationMid holds state and is not trivially movable.
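To make the repeated-move concern concrete, here is a minimal C++14 sketch. Everything in it is hypothetical and not part of the code above: StatefulActivation is an invented activation that owns a std::vector, and runLayerByValue is an invented layer function that takes its activation by value, so that forwarding an rvalue really does move from it. (computeIndexedLayer above takes a forwarding reference and does not move anything itself, so the hazard there is only potential.)

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stateful activation: it owns its weights, so moving it
// transfers the vector and leaves the source empty.
struct StatefulActivation {
    std::vector<int> weights;
    std::size_t operator()(std::size_t x) const { return x + weights.size(); }
};

// Hypothetical layer that takes the activation BY VALUE (an assumption made
// for this demo): an rvalue argument is moved into the parameter.
template <typename Activation>
std::size_t runLayerByValue(Activation activation, std::size_t x) {
    return activation(x);
}

// Forwards the same object twice: if it arrived as an rvalue, the first call
// moves from it and the second call sees a moved-from activation.
template <typename Activation>
std::pair<std::size_t, std::size_t> forwardTwice(Activation&& activation) {
    std::size_t first = runLayerByValue(std::forward<Activation>(activation), 0);
    std::size_t second = runLayerByValue(std::forward<Activation>(activation), 0);
    return {first, second};
}

// Passes the object as an lvalue each time: both calls copy it, so the
// original keeps its state.
template <typename Activation>
std::pair<std::size_t, std::size_t> passAsLvalueTwice(Activation&& activation) {
    std::size_t first = runLayerByValue(activation, 0);
    std::size_t second = runLayerByValue(activation, 0);
    return {first, second};
}
```

Calling forwardTwice(StatefulActivation{{1, 2, 3}}) should give different results for the two layers because the weights are gone after the first call, whereas passAsLvalueTwice gives the same result twice at the cost of copies. Passing mid as an lvalue to a function that takes a reference, as the answer does, avoids both the move and the copy.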
C++17
Since C++17 supports fold expressions, the rather ugly std::initializer_list hack in computeIndexedLayers can be replaced:
template <std::size_t... midIndex, std::size_t lastIndex,
typename ActivationMid, typename ActivationLast>
void computeIndexedLayers(
std::index_sequence<midIndex...> midIdxs,
std::index_sequence<lastIndex> lastIdxs,
ActivationMid&& mid, ActivationLast&& last) {
(computeIndexedLayer<midIndex + 1>(mid), ...);
computeIndexedLayer<lastIndex>(std::forward<ActivationLast>(last));
}
C++20
Templated lambdas in C++20 let us get rid of computeIndexedLayers altogether: the template parameters and parameter packs are deduced for a lambda that is defined and immediately invoked within computeMlp:
template <size_t N, typename ActivationMid, typename ActivationLast>
void computeMlp(ActivationMid&& mid, ActivationLast&& last) {
[&]<std::size_t... midIndex, std::size_t lastIndex>(
std::index_sequence<midIndex...> midIdxs,
std::index_sequence<lastIndex> lastIdxs){
(computeIndexedLayer<midIndex + 1>(mid), ...);
computeIndexedLayer<lastIndex>(std::forward<ActivationLast>(last));
}(std::make_index_sequence<N - 1>(), std::index_sequence<N>{});
}
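As a possible C++14 variation on the approach above, a single std::index_sequence can be kept and the activation chosen per layer by tag dispatch instead of splitting the indices in two. This is only a sketch under assumptions: selectActivation, computeMlpImpl and computeMlpOutputs are hypothetical names, and the outputs are collected into a std::vector rather than printed so that they are easy to inspect. Unlike the N - 1 version, it also tolerates N = 0, where nothing is computed.

```cpp
#include <cstddef>
#include <initializer_list>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical tag-dispatch helpers: return `last` for the final layer and
// `mid` for every other layer. The choice is made at compile time.
template <typename Mid, typename Last>
Last& selectActivation(std::true_type /*isLast*/, Mid&, Last& last) { return last; }

template <typename Mid, typename Last>
Mid& selectActivation(std::false_type /*isLast*/, Mid& mid, Last&) { return mid; }

// Layers are numbered 1..N as in the original code; layer `index + 1` is the
// last one exactly when index + 1 == sizeof...(index).
template <std::size_t... index, typename Mid, typename Last>
void computeMlpImpl(std::index_sequence<index...>, Mid& mid, Last& last,
                    std::vector<std::size_t>& out) {
    // Braced initialisation guarantees left-to-right evaluation.
    (void)std::initializer_list<int>{
        (out.push_back(selectActivation(
             std::integral_constant<bool, (index + 1 == sizeof...(index))>{},
             mid, last)(index + 1)),
         0)...};
}

// Both activations are passed on as lvalues, so nothing is moved from twice.
template <std::size_t N, typename Mid, typename Last>
std::vector<std::size_t> computeMlpOutputs(Mid&& mid, Last&& last) {
    std::vector<std::size_t> out;
    computeMlpImpl(std::make_index_sequence<N>(), mid, last, out);
    return out;
}
```

With the two lambdas from the question, computeMlpOutputs<4>(mid, last) applies mid to layers 1 to 3 and last to layer 4. The trade-off is one extra pair of helper overloads in exchange for not having to construct a second index sequence by hand.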