Can static polymorphism (templates) be used despite type erasure?

Time:07-23

Having returned relatively recently to C++ after decades of Java, I am currently struggling with a template-based approach to data conversion for instances where type erasure has been applied. Please bear with me; my nomenclature may still be off for C++ natives.

This is what I am trying to achieve:

  • Implement dynamic variables which are able to hold essentially any value type
  • Access the content of those variables using various other representations (string, ints, binary, ...)
  • Be able to hold variable instances in containers, independent of their value type
  • Convert between variable value and representation using conversion functions
  • Be able to introduce new representations just by providing new conversion functions
  • Constraints: use only C++11 features if possible, no use of libraries like boost::any etc.

A rough sketch of this might look like this:

#include <iostream>
#include <string>
#include <vector>

void convert(const std::string &f, std::string &t) { t = f; }
void convert(const int &f, std::string &t) { t = std::to_string(f); }
void convert(const std::string &f, int &t) { t = std::stoi(f); }
void convert(const int &f, int &t) { t = f; }

struct Variable {
  virtual ~Variable() = default; // polymorphic base: keep destruction safe
  virtual void get(int &i) = 0;
  virtual void get(std::string &s) = 0;
};
template <typename T> struct VariableImpl : Variable {
  T value;
  VariableImpl(const T &v) : value{v} {}
  void get(int &i) override { convert(value, i); }
  void get(std::string &s) override { convert(value, s); }
};

int main() {
  VariableImpl<int> v1{42};
  VariableImpl<std::string> v2{"1234"};

  std::vector<Variable *> vars{&v1, &v2};

  for (auto &v : vars) {
    int i;
    v->get(i);
    std::string s;
    v->get(s);

    std::cout << "int representation: " << i <<
        ", string representation: " << s << std::endl;
  }

  return 0;
}

The code does what it is supposed to do, but obviously I would like to get rid of the Variable::get(int/std::string/...) overloads and template them instead, because otherwise every new representation requires a declaration and an implementation, and each implementation is exactly the same as all the others.

I've played with various approaches so far, such as virtual templated methods, applying the CRTP with an intermediate type, and various forms of wrappers, yet in all of them I get bitten by the erased value type of VariableImpl. On the one hand, I think there might not be a solution, because after type erasure the compiler cannot possibly know which templated getters and converter calls it must generate. On the other hand, I think I might be missing something really essential here, and there should be a solution despite the constraints mentioned above.

CodePudding user response:

  • Implement dynamic variables which are able to hold essentially any value type
  • Be able to hold variable instances in containers, independent of their value type

These two requirements are quite challenging on their own. Class templates don't really encourage inheritance, and you already did the right thing to get what you asked for: you introduced a common base class for the class template, so that pointers to that base type can be stored in a collection.

  • Access the content of those variables using various other representations (string, ints, binary, ...)
  • Be able to introduce new representations just by providing new conversion functions

This is where it breaks. Function templates assume a common implementation for different types, while inheritance assumes different implementations behind the same interface type.

Your goal is to introduce different implementations for different types, and to make your requirements viable you have to commit to one of those two options (or put up with one function per case, which is what you have already written yourself).

Edit:

One strategy for making the inheritance approach work is to generalise the arguments to the extent that they can be used interchangeably by the abstract interface. E.g. you may wrap the conversion targets inside a union like this:

struct Variable {
    
    struct converter_type {
        enum { INT, STRING } type;
        
        union {
            int* m_int;
            std::string* m_string;
        };
        
    };
    
    virtual void get(converter_type& var) = 0;
    virtual ~Variable() = default;
    
};

And then take whatever part of it inside of the implementation:

void get(converter_type& var) override {
    switch (var.type) {
        case converter_type::INT:
            // the union holds pointers, convert() takes references
            convert(value, *var.m_int);
            break;
        case converter_type::STRING:
            convert(value, *var.m_string);
            break;
    }
}

To be honest, I don't think this is less verbose than just having a function for each type combination, but I think you get the idea: you can wrap your arguments somehow to keep the abstract class interface fixed.
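For reference, here is a minimal self-contained sketch of the wrapping idea in C++11. It reuses the four `convert` overloads from the question; the `Target` struct, its constructors, the non-virtual `as<R>()` helper and `taggedExample()` are hypothetical names added for illustration only, not part of the original code.

```cpp
#include <cassert>
#include <string>

// Conversion overloads as in the question.
void convert(const std::string &f, std::string &t) { t = f; }
void convert(const int &f, std::string &t) { t = std::to_string(f); }
void convert(const std::string &f, int &t) { t = std::stoi(f); }
void convert(const int &f, int &t) { t = f; }

// Hypothetical tagged pointer: records which representation is requested.
struct Target {
  enum Kind { INT, STRING } kind;
  union {
    int *m_int;
    std::string *m_string;
  };
  explicit Target(int &i) : kind(INT), m_int(&i) {}
  explicit Target(std::string &s) : kind(STRING), m_string(&s) {}
};

struct Variable {
  virtual ~Variable() = default;
  // Single virtual entry point; the tag selects the representation.
  virtual void get(Target target) const = 0;
  // Non-virtual convenience wrapper (an addition for illustration).
  // Only compiles for R that Target has a constructor for.
  template <typename R> R as() const {
    R result{};
    get(Target(result));
    return result;
  }
};

template <typename T> struct VariableImpl : Variable {
  T value;
  explicit VariableImpl(const T &v) : value(v) {}
  void get(Target target) const override {
    switch (target.kind) {
    case Target::INT:
      convert(value, *target.m_int); // dereference: convert takes references
      break;
    case Target::STRING:
      convert(value, *target.m_string);
      break;
    }
  }
};

// Usage sketch: access both variables through the erased base class.
inline void taggedExample() {
  VariableImpl<int> v1(42);
  VariableImpl<std::string> v2(std::string("1234"));
  const Variable &a = v1;
  const Variable &b = v2;
  assert(a.as<std::string>() == "42");
  assert(b.as<int>() == 1234);
}
```

Note that a new representation still means touching `Target` and the `switch`, so this mainly moves the repetition rather than removing it.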

CodePudding user response:

This is a classical double dispatch problem. The usual solution to this problem is to have some kind of dispatcher class with multiple implementations of the function you want to dispatch (get in your case). This is called the visitor pattern. The well-known drawback of it is the dependency cycle it creates (each class in the hierarchy depends on all other classes in the hierarchy). Thus there's a need to revisit it each time a new type is added. No amount of template wizardry eliminates it.

You don't have a specialised Visitor class, your Variable serves as a Visitor of itself, but this is a minor detail.
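A minimal sketch of what a dedicated visitor could look like in C++11, assuming the set of value types is closed (here just `int` and `std::string`) and reusing the `convert` overloads from the question. The names `ValueVisitor`, `ConvertTo`, `accept` and `visitorExample` are hypothetical and chosen for illustration.

```cpp
#include <cassert>
#include <string>

// Conversion overloads as in the question.
void convert(const std::string &f, std::string &t) { t = f; }
void convert(const int &f, std::string &t) { t = std::to_string(f); }
void convert(const std::string &f, int &t) { t = std::stoi(f); }
void convert(const int &f, int &t) { t = f; }

// Visitor interface: one overload per supported value type (a closed set).
struct ValueVisitor {
  virtual ~ValueVisitor() = default;
  virtual void visit(const int &v) = 0;
  virtual void visit(const std::string &v) = 0;
};

// One visitor template covers every target representation R
// for which matching convert overloads exist.
template <typename R> struct ConvertTo : ValueVisitor {
  R result{};
  void visit(const int &v) override { convert(v, result); }
  void visit(const std::string &v) override { convert(v, result); }
};

struct Variable {
  virtual ~Variable() = default;
  // First dispatch: virtual call selects the stored value type.
  virtual void accept(ValueVisitor &visitor) const = 0;
  // Non-virtual template: the representation is chosen at compile time.
  template <typename R> R get() const {
    ConvertTo<R> visitor;
    accept(visitor);
    return visitor.result;
  }
};

template <typename T> struct VariableImpl : Variable {
  T value;
  explicit VariableImpl(const T &v) : value(v) {}
  // Second dispatch: overload resolution on the visitor picks visit(T).
  void accept(ValueVisitor &visitor) const override { visitor.visit(value); }
};

// Usage sketch: both conversions go through the erased base class.
inline void visitorExample() {
  VariableImpl<int> v1(42);
  VariableImpl<std::string> v2(std::string("1234"));
  const Variable &a = v1;
  const Variable &b = v2;
  assert(a.get<std::string>() == "42");
  assert(b.get<int>() == 1234);
}
```

In this sketch a new representation only needs new `convert` overloads, but a new value type still requires another `visit` overload in `ValueVisitor` and `ConvertTo`, which is the dependency cycle described above.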

Since you don't like this solution, there is another one. It uses a registry of functions populated at run time and keyed on type identification of their arguments. This is sometimes called "Acyclic Visitor".

Here's a half-baked C++11-friendly implementation for your case.

#include <map>
#include <vector>
#include <typeinfo>
#include <typeindex>
#include <utility>
#include <functional>
#include <string>
#include <stdexcept>

struct Variable
{
    virtual void convertValue(Variable& to) const = 0;
    virtual ~Variable() {};

    virtual std::type_index getTypeIdx() const = 0;

    template <typename K> K get() const;

    static std::map<std::pair<std::type_index, std::type_index>,
         std::function<void(const Variable&, Variable&)>>
             conversionMap;

    template <typename T, typename K>
    static void registerConversion(K (*fn)(const T&));
};


template <typename T>
struct VariableImpl : Variable
{
    T value;

    VariableImpl(const T &v) : value{v} {}
    VariableImpl() : value{} {} // this is needed for the declaration of
                                // `VariableImpl<K> vk` in Variable::get below.
                                // It can be avoided, but that is
                                // a story for another day

    void convertValue(Variable& to) const override
    {
        auto typeIdxFrom = getTypeIdx();
        auto typeIdxTo = to.getTypeIdx();

        if (typeIdxFrom == typeIdxTo) // no conversion needed
        {
            dynamic_cast<VariableImpl<T>&>(to).value = value;
        }
        else
        {
            auto fcnIter = conversionMap.find({getTypeIdx(), to.getTypeIdx()});
            if (fcnIter != conversionMap.end())
            {
                fcnIter->second(*this, to);
            }
            else
                throw std::logic_error("no conversion");
        }
    }

    std::type_index getTypeIdx() const override
    {
        return std::type_index(typeid(T));
    }
};

template <typename K> K Variable::get() const
{
    VariableImpl<K> vk;
    convertValue(vk);
    return vk.value;
}

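template <typename T, typename K>
void Variable::registerConversion(K (*fn)(const T&))
{
    // add a mutex if you ever spread this over multiple threads
    conversionMap[{std::type_index(typeid(T)), std::type_index(typeid(K))}] = 
        [fn](const Variable& from, Variable& to) {
            dynamic_cast<VariableImpl<K>&>(to).value = 
              fn(dynamic_cast<const VariableImpl<T>&>(from).value);
        };
}

// The static data member declared in Variable needs exactly one definition
// (in a single translation unit); without it the program fails to link.
std::map<std::pair<std::type_index, std::type_index>,
     std::function<void(const Variable&, Variable&)>>
         Variable::conversionMap;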

Now of course you need to call registerConversion e.g. at the beginning of main and pass it each conversion function.

Variable::registerConversion(int_to_string);
Variable::registerConversion(string_to_int);

This is not ideal, but hardly anything is ever ideal.
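The names `int_to_string` and `string_to_int` are not defined in the snippet above. As a small sketch of what they might look like, and of how the source type `T` and destination type `K` are deduced from a function pointer of the form `K (*)(const T&)`, consider the following. The helper `conversionKey` and `keyExample` are hypothetical and exist only to make the deduced registry key visible.

```cpp
#include <cassert>
#include <string>
#include <typeindex>
#include <typeinfo>
#include <utility>

// Hypothetical conversion functions; the signature must match
// K (*)(const T&) exactly for template argument deduction to succeed.
std::string int_to_string(const int &i) { return std::to_string(i); }
int string_to_int(const std::string &s) { return std::stoi(s); }

// Returns the (from, to) key a type-indexed registry would use for fn.
// T and K are deduced from the function pointer type alone.
template <typename T, typename K>
std::pair<std::type_index, std::type_index> conversionKey(K (*)(const T &)) {
  return {std::type_index(typeid(T)), std::type_index(typeid(K))};
}

// Usage sketch: the two functions produce distinct, mirrored keys.
inline void keyExample() {
  auto k1 = conversionKey(int_to_string);
  auto k2 = conversionKey(string_to_int);
  assert(k1.first == std::type_index(typeid(int)));
  assert(k1.second == std::type_index(typeid(std::string)));
  assert(k2.first == k1.second && k2.second == k1.first);
}
```

A function taking its argument by value, or a lambda passed directly, would not match this parameter pattern, so conversions need to be plain functions (or function pointers) with a `const T&` parameter.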


Having said all that, I would recommend you revisit your design. Do you really need all these conversions? Why not pick one representation and stick with it?

CodePudding user response:

  1. Implement std::any. It is similar to boost::any.

  2. Create a conversion dispatcher based off typeids. Store your any alongside the conversion dispatcher.

  3. "new conversion functions" have to be passed to the dispatcher.

  4. When asked to convert to a type, pass that typeid to the dispatcher.

So we start with these type aliases and a per-source-type table lookup:

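#include <any>          // std::any requires C++17
#include <functional>
#include <iostream>
#include <map>
#include <string>
#include <type_traits>
#include <typeindex>
#include <utility>
#include <vector>

using any = std::any; // under C++11: implement this, or borrow an implementation
using converter = std::function<any(any const&)>;
using convert_table = std::map<std::type_index, converter>;
using convert_lookup = convert_table(*)();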

template<class T>
convert_table& lookup_convert_table() {
  static convert_table t;
  return t;
}

struct converter_any: any {
  template<class T,
    typename std::enable_if<
      !std::is_same<typename std::decay<T>::type, converter_any>::value, bool
    >::type = true
  >
  converter_any( T&& t ):
    any(std::forward<T>(t)),
    table(&lookup_convert_table<typename std::decay<T>::type>())
  {}
  converter_any(converter_any const&)=default;
  converter_any(converter_any &&)=default;
  converter_any& operator=(converter_any const&)=default;
  converter_any& operator=(converter_any&&)=default;
  ~converter_any()=default;
  converter_any()=default;

  convert_table const* table = nullptr;

  template<class U>
  U convert_to() const {
    if (!table)
      throw 1; // make a better exception than int
    auto it = table->find(typeid(U));
    if (it == table->end())
      throw 2; // make a better exception than int
    any const& self = *this;

    // qualify the call: an unqualified any_cast<U> is not found before C++20
    return std::any_cast<U>((it->second)(self));
  }
};

template<class Dest, class Src>
bool add_converter_to_table( Dest(*f)(Src const&) ) {
  lookup_convert_table<Src>()[typeid(Dest)] = [f](any const& s)->any {
    Src src = std::any_cast<Src>(s);
    auto r = f(src);
    return r;
  };
  return true;
}

Now your code looks like:

// The unary + converts each captureless lambda to a plain function pointer,
// so that Dest and Src can be deduced by add_converter_to_table.
const bool bStringRegistered = 
  add_converter_to_table(+[](std::string const& f)->std::string{ return f; })
  && add_converter_to_table(+[](std::string const& f)->int{ return std::stoi(f); });

const bool bIntRegistered = 
  add_converter_to_table(+[](int const& i)->int{ return i; })
  && add_converter_to_table(+[](int const& i)->std::string{ return std::to_string(i); });


int main() {
  converter_any v1{42};
  converter_any v2{std::string("1234")};
  std::vector<converter_any> vars{v1, v2}; // copies!

  for (auto &v : vars) {
    int i = v.convert_to<int>();
    std::string s = v.convert_to<std::string>();

    std::cout << "int representation: " << i <<
    ", string representation: " << s << std::endl;
  }

}


...

Ok, what did I do?

I used any to be a smart void* that can store anything. Rewriting this is a bad idea, use someone else's implementation.

Then, I augmented it with a manually written virtual function table. Which table I add is determined by the constructor of my converter_any; here, I know the type stored, so I can store the right table.

Typically when using this technique, I'd know what functions are in there. For your implementation we do not; so the table is a map from the type id of the destination, to a conversion function.
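To illustrate the "typical" case where the set of functions is known up front, here is a minimal C++11 sketch of a hand-written function table. It is not the `converter_any` code from this answer: `Printable`, `PrintableVTable`, `vtable_for`, `describe` and `vtableExample` are hypothetical names, and raw `new`/`delete` stand in for the storage that `any` would otherwise manage.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hand-written "vtable": plain function pointers taking the erased object.
struct PrintableVTable {
  std::string (*to_string)(const void *self);
  void *(*clone)(const void *self);
  void (*destroy)(void *self);
};

// Per-type behaviour, selected by ordinary overload resolution.
inline std::string describe(int v) { return std::to_string(v); }
inline std::string describe(const std::string &v) { return v; }

// One table per stored type T, built once where T is still known.
template <typename T> const PrintableVTable *vtable_for() {
  static const PrintableVTable table = {
      [](const void *self) -> std::string {
        return describe(*static_cast<const T *>(self));
      },
      [](const void *self) -> void * {
        return new T(*static_cast<const T *>(self));
      },
      [](void *self) { delete static_cast<T *>(self); }};
  return &table;
}

// Polymorphic value type: copyable, no inheritance required of T.
class Printable {
  void *object_;
  const PrintableVTable *vtable_;

public:
  // The constructor is the only place that knows T; it picks the table.
  template <typename T>
  explicit Printable(const T &value)
      : object_(new T(value)), vtable_(vtable_for<T>()) {}
  Printable(const Printable &other)
      : object_(other.vtable_->clone(other.object_)), vtable_(other.vtable_) {}
  Printable &operator=(Printable other) { // copy-and-swap
    std::swap(object_, other.object_);
    std::swap(vtable_, other.vtable_);
    return *this;
  }
  ~Printable() { vtable_->destroy(object_); }

  std::string to_string() const { return vtable_->to_string(object_); }
};

// Usage sketch: copies keep their own value and their own table.
inline void vtableExample() {
  Printable a(42);
  Printable b(std::string("hello"));
  Printable c = a;
  assert(c.to_string() == "42");
  c = b;
  assert(c.to_string() == "hello");
  assert(a.to_string() == "42");
}
```

The difference from `converter_any` is that this table has a fixed layout known at compile time, whereas the conversion table has to be a map because the set of destination types is open.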

The conversion function takes anys and returns anys -- again, don't repeat this work. And now it has a fixed signature.

To add support for a type, you independently register conversion functions. Here, my conversion function registration helper deduces the from type (to determine which table to register it in) and the destination type (to determine which entry in the table), and then automatically writes the any boxing/unboxing code for you.

...

At a higher level, what I'm doing is writing my own type erasure and object model. C++ has enough power that you can write your own object models, and when you want features that the default object model doesn't provide, well, roll a new object model.

Second, I'm using value types. A Java programmer isn't used to value types having polymorphic behavior, but much of C++ works much better if you write your code using value types.

So my converter_any is a polymorphic value type. You can store copies of them in vectors etc, and it just works.
