How to implement zero-overhead Inversion of Control

Time:03-29

Almost every OOP programmer has been exposed to the concept of Inversion of Control. In C++, we can implement that principle with dynamic callbacks (i.e. callables such as lambdas and function pointers). But if we know at compile time which procedure we are going to inject into the driver, I believe there should be a way to eliminate the overhead of passing and invoking functions by composing the callbacks and the driver/signal/whatever function into an "unrolled procedure". Here is an example.

For a GUI program, we have logic for window 1) setup, 2) the loop, and 3) termination. We can inject code 1) after window setup, 2) in each iteration of the render loop, and 3) before termination. A procedural approach is to write it in this manner:

// Snippet 1:
init_window();
init_input_handler();
init_canvas();
init_socket();
while (!window_should_close()) {
  update_window();
  handle_input();
  draw_on_canvas();
  send_through_socket();
}
drop_input_handler();
drop_canvas();
drop_socket();
terminate_window();

As OOP programmers, we pride ourselves on decoupling and proper abstraction, so instead we write this:

// Snippet 2:
init_window();
on_window_init_signal.send();
while (!window_should_close()) {
  update_window();
  on_render_signal.send();
}
on_exit_signal.send();
terminate_window();

But this brings the unwanted overhead mentioned above. My question is: how can we use C++ metaprogramming mechanisms to achieve zero-overhead inversion of control, so that code in a form similar to snippet 2 can be transformed into snippet 1 statically (i.e. at compile time)?

EDIT: I can think of loop optimizations widely found in optimizers. Maybe this is a generalized version of that issue.

CodePudding user response:

Zero overhead is possible in the case you describe ("if we know at compile time what procedure we are to inject into the driver").

You can use a class template to pass in the functions to call, like this:

#include <iostream>

struct SomeInjects
{
    static void AtInit() { std::cout << "AtInit from SomeInjects" << std::endl; }
    static void AtHandleInput() { std::cout << "AtHandleInput from SomeInjects" << std::endl; }
    static void AtDraw() { std::cout << "AtDraw from SomeInjects" << std::endl; }
};

struct OtherInject
{
    static void AtInit() { std::cout << "AtInit from OtherInject" << std::endl; }
    static void AtHandleInput() { std::cout << "AtHandleInput from OtherInject" << std::endl; }
    static void AtDraw() { std::cout << "AtDraw from OtherInject" << std::endl; }
};

template < typename Mixin >
struct Win
{
    void Init()
    {    
        Mixin::AtInit();
    }    

    void HandleInput()
    {    
        Mixin::AtHandleInput();
    }    

    void Draw()
    {    
        Mixin::AtDraw();
    }    
};

int main()
{
    Win<SomeInjects> wsi; 
    wsi.Init();
    wsi.HandleInput();
    wsi.Draw();

    Win<OtherInject> wso;
    wso.Init();
    wso.HandleInput();
    wso.Draw();
}

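The drawback is that this requires the injected functions to be static.

A more elaborate attempt, which inherits from the injected type so that non-static member functions work as well: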

#include <iostream>

struct SomeInjects
{
    void AtInit() { std::cout << "AtInit from SomeInjects" << std::endl; }
    void AtHandleInput() { std::cout << "AtHandleInput from SomeInjects" << std::endl; }
    void AtDraw() { std::cout << "AtDraw from SomeInjects" << std::endl; }
};

struct OtherInject
{
    void AtInit() { std::cout << "AtInit from OtherInject" << std::endl; }
    void AtHandleInput() { std::cout << "AtHandleInput from OtherInject" << std::endl; }
    void AtDraw() { std::cout << "AtDraw from OtherInject" << std::endl; }
};

template < typename Mixin >
struct Win: Mixin
{
    void Init()
    {    
        this->AtInit();
    }    

    void HandleInput()
    {    
        this->AtHandleInput();
    }    

    void Draw()
    {    
        this->AtDraw();
    }    
};

int main()
{
    Win<SomeInjects> wsi; 
    wsi.Init();
    wsi.HandleInput();
    wsi.Draw();

    Win<OtherInject> wso; 
    wso.Init();
    wso.HandleInput();
    wso.Draw();
}

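The last technique is usually called a mixin (or policy-based design): the class template inherits from its template parameter. It is related to, but not the same as, CRTP, where the base class is the template and takes the derived class as its parameter.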
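For comparison, here is a minimal sketch of the CRTP form, where the inheritance runs the other way round: the driver is the base class template and the injected code lives in the derived class. The names WinBase, EditorWin and the returned strings are hypothetical and chosen only for illustration.

```cpp
#include <string>

// CRTP: the base class template receives the derived class as its parameter
// and calls into it through a static_cast, so no virtual dispatch is involved.
template <typename Derived>
struct WinBase
{
    std::string Init()
    {
        // Driver logic first, then the hook supplied by the derived class.
        return "window setup; " + self().AtInit();
    }

private:
    Derived& self() { return static_cast<Derived&>(*this); }
};

// The derived class supplies the hook as an ordinary non-static member.
struct EditorWin : WinBase<EditorWin>
{
    std::string AtInit() { return "AtInit from EditorWin"; }
};
```

Calling Init() on an EditorWin object runs the driver code and then EditorWin::AtInit(), with the target of the call fixed at compile time.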

Whether your compiler inlines everything depends on many things. But typically all calls are inlined if the called functions are not too big.
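To get closer to the `signal.send()` form of snippet 2, one option is a signal whose handlers are part of its type. The following is only a sketch under assumptions: StaticSignal and demo_render_loop are hypothetical names, and it needs C++17 (class template argument deduction, std::apply, fold expressions). Because every handler type is known statically, the compiler is able to inline the calls, although it is not obliged to.

```cpp
#include <string>
#include <tuple>
#include <utility>

// Hypothetical compile-time signal: the handler types are template
// parameters, so send() contains direct calls rather than indirect ones.
template <typename... Handlers>
class StaticSignal
{
public:
    explicit StaticSignal(Handlers... handlers)
        : handlers_(std::move(handlers)...) {}

    // Invokes every handler in registration order (fold over the comma operator).
    template <typename... Args>
    void send(const Args&... args)
    {
        std::apply([&](auto&... h) { (h(args...), ...); }, handlers_);
    }

private:
    std::tuple<Handlers...> handlers_;
};

// Illustrative driver loop: records the order in which handlers ran.
inline std::string demo_render_loop(int frames)
{
    std::string trace;
    StaticSignal on_render{
        [&trace] { trace += "input;"; },
        [&trace] { trace += "draw;"; }};

    for (int i = 0; i < frames; ++i) {
        on_render.send();
    }
    return trace;
}
```

The price of this design is that the set of handlers is fixed when the signal object is created; handlers cannot be connected or disconnected later.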

But if you need callbacks that can be changed at runtime, you have to use some kind of callable representation, such as function pointers or std::function. The latter almost always adds some minor overhead.
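A runtime-changeable signal could look roughly like the sketch below. RuntimeSignal, connect and demo_runtime are hypothetical names used for illustration; only std::function and std::vector come from the standard library.

```cpp
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical runtime signal: handlers are type-erased, so they can be
// added while the program runs, at the cost of an indirect call each.
class RuntimeSignal
{
public:
    void connect(std::function<void()> handler)
    {
        handlers_.push_back(std::move(handler));
    }

    // Calls the handlers in the order they were connected.
    void send() const
    {
        for (const auto& handler : handlers_) {
            handler();
        }
    }

private:
    std::vector<std::function<void()>> handlers_;
};

// Illustrative use: which handlers run is decided by a runtime flag.
inline std::string demo_runtime(bool with_network)
{
    std::string trace;
    RuntimeSignal on_render;
    on_render.connect([&trace] { trace += "draw;"; });
    if (with_network) {
        on_render.connect([&trace] { trace += "send;"; });
    }
    on_render.send();
    return trace;
}
```

Here the list of handlers is not known until the program runs, which is exactly the flexibility the template-based versions give up.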

But remember: dereferencing a pointer is typically not the speed problem in itself. More important is that in such cases constants often cannot be propagated and the code cannot be inlined, so the compiler can no longer optimize the caller and the callback together. If runtime flexibility is needed, it will have some cost. As always: measure before you optimize!
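As a starting point for measuring, a rough timing sketch might look like this. The names time_calls_ms, Comparison and compare_call_styles are hypothetical, and the numbers depend heavily on compiler, flags and hardware; a dedicated benchmarking library is the safer choice for serious measurements.

```cpp
#include <chrono>
#include <functional>

// Times `iterations` calls of `callable` and returns elapsed milliseconds.
template <typename Callable>
double time_calls_ms(Callable&& callable, long iterations)
{
    const auto start = std::chrono::steady_clock::now();
    for (long i = 0; i < iterations; ++i) {
        callable(i);
    }
    const auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}

struct Comparison
{
    double direct_ms;  // lambda called through its own type
    double erased_ms;  // same lambda called through std::function
    long checksum;     // accumulated work from both runs
};

inline Comparison compare_call_styles(long iterations)
{
    // volatile keeps the optimizer from deleting the work entirely.
    volatile long sink = 0;
    auto work = [&sink](long i) { sink = sink + i; };
    std::function<void(long)> erased = work;

    Comparison result{};
    result.direct_ms = time_calls_ms(work, iterations);
    result.erased_ms = time_calls_ms(erased, iterations);
    result.checksum = sink;
    return result;
}
```

Build it with the same optimization flags as the real program and compare direct_ms with erased_ms over several runs before deciding whether the indirection matters.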
