Merge two bitmask with conflict resolving, with some required distance between any two set bits-CodePudding

I have two integer values: d_a = 6 and d_b = 3, so-called distance between set bits. Masks created with appropriate distance look like below:

uint64_t a = 0x1041041041041041; // 0001 0000 0100 0001 0000 0100 0001 0000
                                // 0100 0001 0000 0100 0001 0000 0100 0001

uint64_t b = 0x9249249249249249; // 1001 0010 0100 1001 0010 0100 1001 0010
                                 // 0100 1001 0010 0100 1001 0010 0100 1001

The goal is to have a target mask, which has bits set with d_b, but simultaneously takes into account bits set in the a mask (e.g. first set bit is shifted).

The second thing is that the distance in the target mask is not constant i.e. number of zeros between set bits in the target mask shall be equal to d_b or increased whenever between them is set bit in a

uint64_t target = 0x4488912224488912; // 0100 0100 1000 1000 1001 0001 0010 0010
                                      // 0010 0100 0100 1000 1000 1001 0001 0010

The picture to visualize the problem:
target mask

The blue bar is a, yellow is b. I would rather use bit manipulation intrinsics than bit-by-bit operations.

edit: Actually, I have the following code, but I am looking for a solution with a fewer number of instructions.

void set_target_mask(int32_t d_a, int32_t d_b, int32_t n_bits_to_set, uint8_t* target)
{
    constexpr int32_t n_bit_byte = std::numeric_limits<uint8_t>::digits;

    int32_t null_cnt = -1;

    int32_t n_set_bit = 0;
    int32_t pos = 0;
    while(n_set_bit != n_bits_to_set)
    {
        int32_t byte_idx = pos / n_bit_byte;
        int32_t bit_idx = pos % n_bit_byte;

        if(pos % d_a == 0)
        {
            pos  ;
            continue;
        }
        null_cnt  ;

        if(null_cnt % d_b == 0)
        {
            target[byte_idx] |= 1 << bit_idx;
            n_set_bit  ;
        }
        pos  ;
    }
}

CodePudding user response：

If target is uint64_t, possible d_a and d_b can be converted into bit masks via look-up table. Like lut[6] == 0x2604D5C99A01041 from your question.

Look up tables can be initialized once per program run during initlalization, or in compile time using macro or constant expressions (constexpr).

To make d_b spread, skipping d_a bits, you can use pdep with inverted d_a:

 uint64_t tmp = _pdep_u64(d_b_bits, ~d_a_bits);

Then you can convert n_bits_to_set to contiguous bits mask:

 uint64_t n_bits = (1 << n_bits_to_set) - 1;

And spread them using pdep again:

 uint64_t tmp = _pdep_u64(n_bits, tmp);

(See Intrinsic Guide about pdep. Note that pdep is slow on AMD before Zen3. It's fast on Intel CPUs and Zen3, but not Bulldozer-family or Zen1/Zen2)