Home > Enterprise >  Find of N numbers K numbers xor equalling 0
Find of N numbers K numbers xor equalling 0

Time:02-19

My Problem is to find k Numbers in e.g. an Vector/Array with n Numbers, which equal calculated with xor 0. I tried already to brutforce all combinations without repetition, but with n=100 and k=10 the Time Complexity is to high. (about 1.7*10^13 possibilities)

Does anyone have any ideas, to speed up the process of finding the k Numbers in an Array with n Numbers?

I think my approach is not the smartest with such a high number of possibilities, but I really do not know how to make my program faster.

I would appreciate suggestions for a solution.

My program is in C .

SOLUTION comb(int N, int K, const std::vector<KEY>& bit_codes)
{
    std::vector<KEY> chosen_keys;
    std::vector<KEY> failed{KEY(100000, "Error")};
    std::string bitmask(K, 1);
    bitmask.resize(N, 0); // N-K trailing 0's

    std::vector<int> comb;
    // print integers and permute bitmask
    do {
        comb.clear();
        for (int i = 0; i < N;   i) // [0..N-1] integers
        {
            if (bitmask[i]){
                comb.emplace_back(i);
            }
        }
        //do all code till next permutation

        //reserve the size
        for (auto & i : comb) {
            chosen_keys.emplace_back(bit_codes[i]);
        }

        std::string string = chosen_keys[0].bits;
        std::string helper;
        for (size_t i = 1; i < chosen_keys.size();   i) {
            helper = to_string(to_bitset(string) ^ to_bitset(chosen_keys[i].bits));
            string = helper;
        }
        for(auto & item: bit_codes){
            if (item.bits == string){
                return {chosen_keys, item};
            }
        }

        chosen_keys.clear();

    } while (std::prev_permutation(bitmask.begin(), bitmask.end()));

    return {failed, KEY(100000, "Error")};
}

bitCodes = These are the numbers in a Vector in which I search for the k Numbers, where the xor of them equals 0 (KEY is a class in which bits are saved)

I use these binary Numbers: All bitCodes:

1.  001001101100101001000100101  
2.  101111001100011110101010000  
3.  001111010101110001101001100  
4.  111111100010110100010000001  
5.  000100000011001010001110000  
6.  110101111110101111011011111  
7.  110111011001000010010100110  
8.  111000111001000110101001011  
9.  111001100011011111000110010  
10. 101011001111110110101000111  
11. 100000110010100011010111011  
12. 101110000110011100001010101  
13. 011011110010010111010001101  
14. 100011101111000110100100001  
15. 100001101110001110010111011  
16. 011001011101011000010011110  
17. 101100000110110001011110100  
18. 000111111010001100010001111  
19. 001111101111011100010000010  
20. 001001000111111111011011001  

There are k=20 and I search n=5 numbers.
The output is:

3.  001111010101110001101001100  
4.  111111100010110100010000001  
6.  110101111110101111011011111  
10. 101011001111110110101000111 
12. 101110000110011100001010101  

...because they equal 0.

The code finished executing after about 100ms for 15504 Possibilities.

Now you can calculate yourself how long it will need to do n=100 and k=10.

CodePudding user response:

For k = 10, n = 100, a brute force approach isn't feasible, since there are over 17 trillion combinations:

comb(100,10) = 17310309456440

Since the xor of two equal values is zero, my first thought was to reduce the problem to k = 5, n = 100, a bit over 75 million combinations:

comb(100,5) = 75287520

storing each combination's xor sum, an index count (5), and 5 index values into an array of 75287520 structures. For odd k, such as k = 9, store the 3921225 combinations for k = 4, n = 100 into the array (index count = 4), followed by the 75287520 combinations for k = 5, n = 100 (index count = 5) into the array. To save space, store the number of indexes and the index values as 8 bit unsigned integers.

After creating the array of structures, sort it according to the xor sums. Scan the sorted array looking for equal xor sums (since xor of two equal values is zero), and no duplicates between the two sets of index values, which will be detected when merging the two sets of index values to create a single vector of k index values. If the goal is to find any single set that works, stop here. If the goal is to find all unique sets, use std::map using vector of k index values as the key and an 8 bit integer as the value (the value isn't used, but required by std::map). Note - std::map could consume a lot of space and time if there are a lot of unique sets that xor to zero.

Using a pseudo random number generator to create the array of 100 integers, for k = 10, n = 100, my test code found 4099 unique sets that xor to zero. On my laptop (Intel Core i7-10510U, Win 10 Pro 64 bit, Visual Studio 2019), with radix sort, it takes about 1.5 seconds total time: generate ~= 0.37 seconds, sort ~=0.88 seconds, scan ~= 0.25 seconds, using 1.7 GB of memory. With quick sort, it take about 6.5 seconds, sort ~= 5.88 seconds, using 0.9 GB of memory


#include <algorithm>
#include <ctime>
#include <iostream>
#include <iomanip>
#include <map>
#include <vector>

typedef unsigned char      uint8_t;
typedef unsigned short     unit16_t;
typedef unsigned int       uint32_t;
typedef unsigned long long uint64_t;

// N numbers xored K at a time

#define K 10
#define N 100

#define K0 (K/2)
#define K1 (K - K0)

typedef struct _SUMIDX {
    uint32_t sum;
    uint8_t cnt;
    uint8_t idx[K1];
}SUMIDX;

clock_t ctTimeBeg;                      // clock values
clock_t ctTimeMid;
clock_t ctTimeEnd;

//----------------------------------------------------------------------//
//      Comb(n,k)                                                       //
//----------------------------------------------------------------------//
static int Comb(int n, int k)
{
uint64_t dvnd = 1;
uint64_t dvsr = 1;
uint64_t quot;
    for(int i = n-k 1; i <= n; i  )
        dvnd *= i;
    for(int i = 1; i <= k; i  )
        dvsr *= i;
    quot = dvnd/dvsr;
    return (int) quot;
}

//----------------------------------------------------------------------//
//      InitCombination - init combination                              //
//----------------------------------------------------------------------//
void InitCombination(std::vector<uint8_t> &x, uint8_t n, uint8_t k) {
    for(uint8_t i = 0; i < k; i  )
        x[i] = i;
    --x[k-1];
}

//----------------------------------------------------------------------//
//      NextCombination - generate next combination                     //
//----------------------------------------------------------------------//
int NextCombination(std::vector<uint8_t> &x, uint8_t n, uint8_t k) {
uint8_t j = k - 1;
    while (j != (uint8_t)(-1) && x[j] == n - k   j)
        --j;
    if (j == (uint8_t)(-1))
        return 0;
      x[j];
    for (uint8_t i = j   1; i < k;   i)
        x[i] = x[j]   i - j;
    return 1;
}

//----------------------------------------------------------------------//
//      RadixSort                                                       //
//----------------------------------------------------------------------//
// sort a bin by 3 least significant bytes
void RadixSort3(std::vector<SUMIDX> &sumidx, std::vector<SUMIDX> &rdxsrt, 
                uint32_t beg, uint32_t end)
{
uint32_t mIndex[3][256] = {0};          // count / index matrix
uint32_t i,j,m,n;
SUMIDX u;
    std::swap(sumidx, rdxsrt);          // swap vectors
    for(i = beg; i < end; i  ){         // generate histograms
        u.sum = sumidx[i].sum;
        for(j = 0; j < 3; j  ){
            mIndex[j][(size_t)(u.sum & 0xff)]  ;
            u.sum >>= 8;
        }
    }
    for(j = 0; j < 3; j  ){             // convert to indices
        m = beg;
        for(i = 0; i < 256; i  ){
            n = mIndex[j][i];
            mIndex[j][i] = m;
            m  = n;
        }
    }
    for(j = 0; j < 3; j  ){             // radix sort
        for(i = beg; i < end; i  ){     //  sort by current lsb
            u = sumidx[i];
            m = (u.sum>>(j<<3))&0xff;
            rdxsrt[mIndex[j][m]  ] = u;
        }
        std::swap(sumidx, rdxsrt);      //  swap vectors
    }
}

// split vector into 256 bins according to most significant byte
void RadixSort(std::vector<SUMIDX> &sumidx, std::vector<SUMIDX> &rdxsrt)
{
uint32_t aIndex[260] = {0};             // count / array
uint32_t cnt = (uint32_t)sumidx.size();
uint32_t i;
    for(i = 0; i < cnt; i  )            // generate histogram
        aIndex[1 (sumidx[i].sum >> 24)]  ;
    for(i = 2; i < 257; i  )            // convert to indices
        aIndex[i]  = aIndex[i-1];
    for(i = 0; i < cnt; i  )            // sort by msb
        rdxsrt[aIndex[sumidx[i].sum>>24]  ] = sumidx[i];
    for(i = 256; i; i--)                // restore aIndex
        aIndex[i] = aIndex[i-1];
    aIndex[0] = 0;
    for(i = 0; i < 256; i  )            // radix sort the 256 bins
        RadixSort3(sumidx, rdxsrt, aIndex[i], aIndex[i 1]);
}

//----------------------------------------------------------------------//
//      QuickSort                                                       //
//----------------------------------------------------------------------//
#define ISZ (24)                // use insertion sort for <= ISZ elements

void InsertionSort(std::vector<SUMIDX> &sumidx, int lo, int hi)
{
    if(lo >= hi)
        return;
    SUMIDX t;
    int i, j;
    lo--;
    for (j = lo   2; j <= hi; j  ) {
        t = sumidx[j];
        i = j-1;
        while(i != lo && sumidx[i].sum > t.sum){
            sumidx[i 1] = sumidx[i];
            i--;
        }
        sumidx[i 1] = t;
    }
}

void QuickSort(std::vector<SUMIDX> &sumidx, int lo, int hi)
{
    while (hi - lo >= ISZ){
        uint32_t p = sumidx[lo   (hi - lo) / 2].sum;
        int i = lo - 1;
        int j = hi   1;
        while (1){
            while (sumidx[  i].sum < p);
            while (sumidx[--j].sum > p);
            if (i >= j)
                break;
            std::swap(sumidx[i], sumidx[j]);
        }
        if(j - lo < hi - j){
            QuickSort(sumidx, lo, j);
            lo = j 1;
        } else {
            QuickSort(sumidx, j 1, hi);
            hi = j;
        }
    }
    InsertionSort(sumidx, lo, hi);
}

void QuickSort(std::vector<SUMIDX> &sumidx)
{
    QuickSort(sumidx, 0, (int)sumidx.size());
}

//----------------------------------------------------------------------//
//      Rnd32 - return 32 bit random number                             //
//----------------------------------------------------------------------//
int Rnd32()
{
static unsigned int r = 0;
    r = r*1664525   1013904223;
    return r;
}

#define RDXSRT 1

int main(int argc, char**argv)
{
    int c, c0, c1;                          // combinations
    int sxi = 0;                            // index to sumidx
    c0 = Comb(N, K0);                       // generate # of combinations
    c1 = Comb(N, K1);
    c = c0;
    if (K0 != K1)
        c = c0   c1;
    std::vector<SUMIDX> sumidx(c);          // vector of sums and indexes
#if RDXSRT
    std::vector<SUMIDX> rdxsrt(c);          // vector for radix sort
#endif
    std::vector<int> v(N);                  // vector of numbers
    std::vector<uint8_t> x(K);              // vector of indexes
    std::map<std::vector<uint8_t>, uint8_t> m;            // map of sets of K indexes
    std::map<std::vector<uint8_t>, uint8_t>::iterator mi; // iterator for m
    for(int i = 0; i < N; i  )
        v[i] = Rnd32();
    ctTimeBeg = clock();
    // generate vector of sums and indexes
    InitCombination(x, N, K0);
    while (NextCombination(x, N, K0)) {
        int sum = 0;
        for (int i = 0; i < K0; i  )
            sum ^= v[x[i]];
        sumidx[sxi].sum = sum;
        sumidx[sxi].cnt = K0;
        for (int i = 0; i < K0; i  )
            sumidx[sxi].idx[i] = x[i];
        sxi  ;
        }
#if (K0 != K1)
    InitCombination(x, N, K1);
    while (NextCombination(x, N, K1)) {
        int sum = 0;
        for (int i = 0; i < K1; i  )
            sum ^= v[x[i]];
        sumidx[sxi].sum = sum;
        sumidx[sxi].cnt = K1;
        for (int i = 0; i < K1; i  )
            sumidx[sxi].idx[i] = x[i];
        sxi  ;
    }
#endif
    ctTimeMid = clock();
    std::cout << "# of ticks " << ctTimeMid - ctTimeBeg << std::endl;
    // sort the vector according to the sums
#if RDXSRT
    RadixSort(sumidx, rdxsrt);
#else
    QuickSort(sumidx);
#endif
#if 0
    std::sort(sumidx.begin(), sumidx.end(),
        [](SUMIDX s0, SUMIDX s1)
        {return s0.sum < s1.sum;});
#endif
    ctTimeEnd = clock();
    std::cout << "# of ticks " << ctTimeEnd - ctTimeMid << std::endl;
    ctTimeMid = ctTimeEnd;
    // scan the sorted vector for equal sums
    for (int i = 0; i < c - 1; i  ) {
        for (int j = i   1; j < c; j  ) {
            if (sumidx[i].sum != sumidx[j].sum)
                break;
#if (K0 != K1)
            if (K != (sumidx[i].cnt   sumidx[j].cnt))
                continue;
#endif
            //  merge indexes
            int ii = 0, jj = 0, im = 0;
            while (1) {
                // if duplicate indexes skip
                if (sumidx[i].idx[ii] == sumidx[j].idx[jj])
                    break;
                if (sumidx[i].idx[ii] < sumidx[j].idx[jj]) {
                    x[im  ] = sumidx[i].idx[ii  ];
                    if (ii >= sumidx[i].cnt) {
                        do
                            x[im  ] = sumidx[j].idx[jj  ];
                        while (jj < sumidx[j].cnt);
                        break;
                    }
                }
                else {
                    x[im  ] = sumidx[j].idx[jj  ];
                    if (jj >= sumidx[j].cnt) {
                        do
                            x[im  ] = sumidx[i].idx[ii  ];
                        while (ii < sumidx[i].cnt);
                        break;
                    }
                }
            }
            if (im != K)                // if duplicate indexes skip
                continue;
            m[x];                       // insert if unique set
        }
    }
    ctTimeEnd = clock();
    std::cout << "# of ticks " << ctTimeEnd - ctTimeMid << std::endl;
    std::cout << "# of ticks " << ctTimeEnd - ctTimeBeg << std::endl;
    std::cout << "number of combinations found " << m.size() << std::endl;
#if 1                                   // show and check what was found
    for (mi = m.begin(); mi != m.end(); mi  ) {
        x = mi->first;
        for (int i = 0; i < K; i  )
            std::cout << std::setw(3) << (uint32_t) x[i] << " ";
        std::cout << std::endl;
        int sum = 0;
        for (int i = 0; i < K; i  )
           sum ^= v[x[i]];
        if (sum != 0)
            std::cout << "error" << std::endl;
    }
#endif
    return(0);
}
  • Related