Smallest Binary String not Contained in Another String

The problem statement is fairly simple. Here is an example:

input: 10100011
output: 110

I have tried using BFS, but I don't think it is efficient enough. Maybe some sort of bitmap sliding-window solution would work?

string IntToString(int a)
{
    ostringstream temp;
    temp << a;
    return temp.str();
}

bool is_subsequence(string& s, string& sub) {
    if(sub.length() > s.length()) return false;
    int pos = 0;
    for(char c : sub)
    {
        pos = s.find(c, pos);
        if(pos == string::npos) return false;
        ++pos;
    }
    return true;
}

string shortestNotSubsequence(string& s) {
    Queue q(16777216);
    q.push(0);
    q.push(1);
    while(!q.empty())
    {
        string str;
        int num = q.front; q.pop();
        str = IntToString(num);
        if(!is_subsequence(s, str)) return str;
        string z = str + '0';
        string o = str + '1';
        q.push(stoi(str+'0'));
        q.push(stoi(str+'1'));
    }
    return "";
}

int main() {
    string N;
    cin >> N;
    cout << shortestNotSubsequence(N) << endl;
    return 0;
}

CodePudding user response:

You can do this pretty easily in O(N) time.

Let W = ceiling(log₂(N+1)), where N is the length of the input string S.
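As a quick illustration of W = ceiling(log₂(N+1)), the value can be computed without floating point by doubling a power of two until it reaches N+1. The helper name `guaranteedMissingLength` is made up for this sketch and is not part of the implementation further down.

```cpp
#include <cstddef>

// Hypothetical helper: smallest w with 2^w >= n + 1, i.e. ceil(log2(n + 1)).
// Assumes n is small enough that the shift below cannot overflow size_t.
unsigned guaranteedMissingLength(std::size_t n) {
    unsigned w = 0;
    std::size_t capacity = 1;      // invariant: capacity == 2^w
    while (capacity < n + 1) {
        capacity <<= 1;
        ++w;
    }
    return w;
}
```

For the 8-character example input 10100011 this gives W = 4, because 2^3 = 8 is less than 9 while 2^4 = 16 is not.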

There are 2^W possible strings of length W. S contains at most N − W + 1 substrings of length W, which is at most N and therefore less than 2^W (because 2^W ≥ N + 1), so at least one string of length W must not be present in S.

W is also less than the number of bits in a size_t, and it only takes O(N) space to store a mask of all possible strings of length W. Initialize such a mask to 0s, and then iterate through S using the lowest W bits in a size_t as a sliding window of the substrings you encounter. Set the mask bit for each substring you encounter to 1.

When you're done, scan the mask to find the first 0, and that will be a string of length W that's missing.
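The sliding-window pass and the scan for the first 0 could look roughly like the sketch below. The function names `markWindows` and `firstMissing` are assumptions chosen for illustration, and the code assumes the input contains only '0' and '1' characters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Marks every length-w substring of s in a table of 2^w flags.
// Assumes 1 <= w < number of bits in size_t.
std::vector<bool> markWindows(const std::string& s, unsigned w) {
    const std::size_t windowMask = (std::size_t(1) << w) - 1;
    std::vector<bool> seen(windowMask + 1, false);
    std::size_t window = 0;
    for (std::size_t i = 0; i < s.size(); ++i) {
        // Shift in the next character and drop bits older than w positions.
        window = ((window << 1) | (s[i] == '1' ? 1u : 0u)) & windowMask;
        if (i + 1 >= w) {          // a full window is available
            seen[window] = true;
        }
    }
    return seen;
}

// Index of the first unset flag, or seen.size() if every flag is set.
std::size_t firstMissing(const std::vector<bool>& seen) {
    std::size_t i = 0;
    while (i < seen.size() && seen[i]) {
        ++i;
    }
    return i;
}
```

For the example input 10100011 with w = 3, the flags for 110 and 111 stay unset, so `firstMissing` returns 6, which is 110 in binary.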

There may also be shorter missing strings, though, so merge the mask bits in pairs to make a mask for the strings of length W-1, and then also set the mask bit for the last W-1 bits in S, since those might not be included in any W-length string. Then scan the mask for 0s to see if you can find a shorter missing string.
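A minimal sketch of that merge step, assuming the mask is held in a `std::vector<bool>` whose size is a power of two (at least 2). The name `shrinkMask` and the `lastWindow` parameter are hypothetical. Unlike the in-place version in the implementation further down, this one returns a new vector for clarity.

```cpp
#include <cstddef>
#include <vector>

// Given flags for all strings of length w (seen.size() == 2^w), build the
// flags for length w-1. lastWindow holds the final w bits of the input.
std::vector<bool> shrinkMask(const std::vector<bool>& seen, std::size_t lastWindow) {
    const std::size_t half = seen.size() / 2;
    std::vector<bool> shorter(half, false);
    for (std::size_t i = 0; i < half; ++i) {
        // i is the (w-1)-bit prefix shared by the w-bit strings 2i and 2i+1.
        shorter[i] = seen[2 * i] || seen[2 * i + 1];
    }
    // The last w-1 characters are not the prefix of any w-length window,
    // so they have to be marked separately.
    shorter[lastWindow & (half - 1)] = true;
    return shorter;
}
```

The example input shows why the extra marking matters: in 10100011 the string 11 occurs only at the very end, so merging the length-3 flags alone would wrongly report 11 as missing.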

As long as you keep finding shorter strings, keep merging the mask for smaller strings until you get to length 1. Since each such operation halves the mask size, the total work for all the merges is bounded by a geometric series, so it doesn't affect the overall O(N) time for the whole algorithm.

Here's an implementation in C++:

#include <string>
#include <vector>
#include <algorithm>

std::string shortestMissingBinaryString(const std::string& instr) {
    const size_t len = instr.size();
    if (len < 2) {
        if (!len || instr[0] != '0') {
            return std::string("0");
        }
        return std::string("1");
    }
    // Find a string size guaranteed to be missing
    size_t W_mask = 0x3;
    unsigned W = 2;
    while(W_mask < len) {
        W_mask |= W_mask<<1;
        W += 1;
    }

    // Make a mask of all the W-length substrings that are present
    std::vector<bool> mask(W_mask+1, false);
    size_t lastSubstr=0;
    for (size_t i=0; i<len; ++i) {
        lastSubstr = (lastSubstr<<1) & W_mask;
        if (instr[i] != '0') {
            lastSubstr |= 1;
        }
        if (i+1 >= W) {
            mask[lastSubstr] = true;
        }
    }

    //Find missing substring of length W
    size_t found = std::find(mask.begin(), mask.end(), false) - mask.begin();

    // try to find a shorter missing substring
    while(W > 1) {
        unsigned testW = W - 1;
        W_mask >>= 1;
        // calculate masks for length testW 
        for (size_t i=0; i<=W_mask; i++) {
            mask[i] = mask[i*2] || mask[i*2+1];
        }
        mask.resize(W_mask+1);
        // don't forget the missing substring at the end
        mask[lastSubstr & W_mask] = true;

        size_t newFound = std::find(mask.begin(), mask.end(), false) - mask.begin();
        if (newFound > W_mask) {
            // no shorter string
            break;
        }
        W = testW;
        found = newFound;
    }

    // build the output string
    std::string ret;
    for (size_t bit = ((size_t)1) << (W-1); bit; bit>>=1) {
        ret.push_back((found & bit) ? '1': '0');
    }
    return ret;
}
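
One way to sanity-check an implementation like the one above is to compare it with a brute-force search on short inputs. The function below is a hypothetical reference written for that purpose, not part of the original answer. It tries candidates in order of length and then numeric value, so it returns the same kind of result (the shortest missing string, smallest first) but is much slower than the mask approach.

```cpp
#include <cstddef>
#include <string>

// Hypothetical reference: returns the first binary string, ordered by length
// and then by numeric value, that does not occur in s as a substring.
// Intended only for cross-checking on short inputs.
std::string bruteForceShortestMissing(const std::string& s) {
    for (unsigned len = 1; ; ++len) {
        for (std::size_t value = 0; value < (std::size_t(1) << len); ++value) {
            // Write value as a len-character binary string, high bit first.
            std::string candidate;
            for (unsigned bit = len; bit-- > 0; ) {
                candidate.push_back(((value >> bit) & 1) ? '1' : '0');
            }
            if (s.find(candidate) == std::string::npos) {
                return candidate;
            }
        }
    }
}
```

The outer loop always terminates, because the counting argument guarantees a missing string of length at most ceiling(log₂(N+1)).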