Count 1's between the intersection of two binary arrays-CodePudding

Given two binary arrays, how can one quickly find the number of 1's occurring between intersecting bits for both arrays?

For example, consider a pair of bitsets (i1, i2) and their intersection (i12):

i1  = [0001001000100100];
i2  = [0101000100010110];
i12 = [0001000000000100]; // (i1 & i2)

The intersecting bits are in position 3 and 13, so count the number of 1's in positions 0-2 and 4-12:

x1 = [0, 2] // results i1
x2 = [1, 2] // results for i2

Now generalize this idea to 32-bit integers. Toy example:

int i1[2] = {2719269390, 302235938};
int i2[2] = {2315436042, 570885266};

Toy example in binary:

i1      = [10100010 00010100 11000010 00001110, 00010010 00000011 11000001 00100010]
i2      = [10001010 00000010 11000000 00001010, 00100010 00000111 00000100 10010010]
i1 & i2 = [10000010 00000000 11000000 00001010, 00000010 00000011 00000000 00000010]

Expected results:

x1 = [2, 2, 0, 1, 1, 1, 0, 0, 4];
x2 = [2, 1, 0, 0, 0, 1, 1, 0, 3];

I can see a "brute-force" approach using __builtin_clz() to determine leading zeros in i1 & i2, shifting i1 and i2 right by that number, doing __builtin_popcount() for the shifted i1 and i2 results, and repeating the procedure until no intersections remain. My instinct suggests there may be a more elegant approach, as this involves a few temporaries, many instructions, and at least two logical branches.

C/C implementations are welcome, but I would be satisfied enough with conceptual perspectives. Suffice it to say this approach intends to remove a critical bottleneck in a popular program.

CodePudding user response：

std::popcount returns the number of active bits in a value. It is able to call intrinsic functions, if available.

constexpr auto get_popcount(auto list) {
    decltype(list) result;
    std::transform(list.begin(), list.end(), result.begin(), [](auto x) {
        return std::popcount(x);
    });
    return result;
}

int main() {
    constexpr auto i12 = std::array{
        0b10000010u, 0b00000000u, 0b11000000u, 0b00001010u, 
        0b00000010u, 0b00000011u, 0b00000000u, 0b00000010u};

    static_assert(get_popcount(i12) == std::array{ 2u, 0u, 2u, 2u, 1u, 2u, 0u, 1u });
}

CodePudding user response：

Can't test this right now but (I1 XOR I2) AND I1 should leave only the bits that you want to count, so:

constexpr auto get_popcount(auto i1, auto i2) {
    std::vector<std::pair(int,int)> res;
    std::transform (i1.begin(), i1.end(), i2.begin(), res.begin(), []
        (auto x, auto y){
            return std::tie(std::popcount((x^y)&x),std::popcount((x^y)&y));
    });
    return res;
}

Where the first item of the touple is the number of bits on I1 and the second on I2.