Home > Back-end >  Is there a binary search algorithm that takes a unary predicate rather than a search value?
Is there a binary search algorithm that takes a unary predicate rather than a search value?

Time:09-26

I have this exercise:

Given an array of integers, find the first missing positive integer in linear time and constant space. In other words, find the lowest positive integer that does not exist in the array. The array can contain duplicates and negative numbers as well.
For example, input [3, 4, -1, 1] should give 2 and input [1, 2, 0] should give 3.
You can modify the input array in-place.

 
My implementation:

template <typename In_It>
int missingPositiveInt(In_It first, In_It last){
    first = std::find_if( first, last, [](int x){return x > 0;} );
    if(first == last || *first > 1)
        return 1;
    for( auto next = first; (  next != last) && ( !(*next - *first > 1) ); )
          first;

    return *first   1;
}

int main(){

    std::vector<int> v{5, 2, -1, 7, 0};
    std::sort(v.begin(), v.end());
    std::cout << missingPositiveInt(v.cbegin(), v.cend()) << '\n';

    v = {2, -1, 1, 0};
    std::sort(v.begin(), v.end());
    std::cout << missingPositiveInt(v.cbegin(), v.cend()) << '\n';

    v = {5, 2, -1, 7, 0};
    std::sort(v.begin(), v.end());
    std::cout << missingPositiveInt(v.cbegin(), v.cend()) << '\n';

    v = {3, 4, -1, 1};
    std::sort(v.begin(), v.end());
    std::cout << missingPositiveInt(v.cbegin(), v.cend()) << '\n';

    v = {1, 2, 0};
    std::sort(v.begin(), v.end());
    std::cout << missingPositiveInt(v.cbegin(), v.cend()) << '\n';

    std::cout << '\n';
}

The output:

1
3
1
2
3

The program works just fine but I use the algorithm std::find_if to find the first positive value in the sequence (sorted sequence) and that algorithm does a linear search.

  • As long as the input sequence is already sorted I want to use some binary search algorithm to speed the process.

  • I tried using std::binary_search but it requires an argument to be searched for. What I need is to get a version that takes a unary predicate and applies a binary search or any other faster algorithm to find the lowest positive value in the sequence so I can write:

    auto it = binary_search(first, last, [](int x){ return x > 0; });
    

Is it possible? Is my code fine or I need to modify it. So any suggestion, hint is highly appreciated.

CodePudding user response:

Partial solution based on @numzero's answer. This doesn't handle negative numbers or zero in the array but you can handle that by linearly preprocessing the array to remove them beforehand. It just marks each index as "found" by negating it, then later looks for the first non negative value and thats the one. Even though its a partial solution it shows the core algorithm.

#include <iostream>
using namespace std;

int main() {
    
    int arr[] = {1, 4, 6, 7, 2, 7, 7, 8, 3};
    
    int arrSize = sizeof(arr)/sizeof(int);
    
    for(int i=0; i<arrSize;   i)
    {
        int val = abs(arr[i]);
        if(val > 0 && val-1 < arrSize)
        {
            if (arr[val-1]>0)
            {
                arr[val-1] = -arr[val-1];
            }
        }
    }
    
    for(int i=0; i<arrSize;   i)
    {
        if(arr[i] > 0)
        {
            std::cout << "Smallest is " << (i 1) << std::endl;
            return 0;
        }
    }
    
    std::cout << "Nothing found!" << std::endl;
    
    // your code goes here
    return 0;
}

CodePudding user response:

Yes, std::partition_point does exactly what you want.

CodePudding user response:

Reuse the source array as array of flags so that arr[k] stores whether k was in the original array. The MSB (a.k.a. sign bit) of each array entry may be a good storage.

I suggest a three-pass algorithm:

  1. Eliminate (zero-out, for example) all uninteresting (<=0, optionally also those >array size) elements.
  2. Store the flags (along the lines of arr[arr[k]] |= 1u<<31).
  3. Scan the flags, looking for first gap.

An important point here is that if some array entry is greater than array size, it can’t contribute to the answer as there is definitely some smaller positive integer missing.

  • Related