Home > OS >  C map fast 1,2,3 integers to hardcoded chars?
C map fast 1,2,3 integers to hardcoded chars?

Time:03-31

I need to map int values 1,2,3 to chars 'C', 'M', 'A'

Whats the fastest way (this will be called 100s times per sec 24/7)?

a macro or an inline function and bunch of ?: operators or ifs or switch? or an array?

CodePudding user response:

A lookup-table seems the most obvious approach as it is also branch-free:

inline constexpr char map(std::size_t i) {
  constexpr char table[] = "0CMA";
  // if in doubt add bounds checking for i, but it will cost performance
  return table[i];
}

Observe that with optimisation, the lookup table boils down to an integer constant.

Edit: You can shave of an additional instruction if you are less lazy than me and specify the lookup table thusly:

  constexpr char table[] = {0, 'M', 'C', 'A'};

CodePudding user response:

My proposal: (char)(0x414D4300 >> (i * 8))

Instead of 0x414D4300 you could write (('C' << 8) | ('M' << 16) | ('A' << 24)).

Only actual testing will tell you whether this is faster than bitmask's answer.

If this runs only 100 times per second, you're chasing a red herring, anyway. Even if you write this section in the dumbest way possible, it seems to be a million times faster than the rest of your program.

CodePudding user response:

You're optimizing your code prematurely. Let me prove it. Here's the naivest, slowest approach that is still reasonable:

#include <map>

char map(int i) {
    std::map<int, char> table;
    table[1] = 'C';
    table[2] = 'M';
    table[3] = 'A';

    return table[i];
}

It generates hundreds of lines of assembly instructions.

And let's time it for a million elements:

#include <iostream>
#include <map>
#include <chrono>
#include <cstdlib>
#include <vector>

char map(int i) {
    std::map<int, char> table;
    table[1] = 'C';
    table[2] = 'M';
    table[3] = 'A';

    return table[i];
}

int main() {
    int const count = 1'000'000;
    std::vector<int> elms(count);
    
    std::srand(0); // I know that rand is no good, but it's OK for this example
    for (int i = 0; i < count;   i) {
        elms[i] = std::rand() % 3   1;
    }

    auto beg = std::chrono::high_resolution_clock::now();
    volatile int do_not_optimize;
    for (int i = 0; i < count;   i) {
        do_not_optimize = map(elms[i]);
    }
    auto end = std::chrono::high_resolution_clock::now();
    auto duration = std::chrono::duration_cast<std::chrono::milliseconds>(end - beg);
    std::cout << "Duration: " << duration.count() << " ms" << std::endl;
}

Here's the output on my computer with lots of other programs running alongside:

Duration: 246 ms

This terrible implementation under terrible conditions still meets your needs 40'650x over. I bet you have way more to worry about elsewhere in your codebase. Remember to make it right, then make it fast only if it's not already fast enough.

  • Related