I want to write a program that takes 2 strings as input, s1 and s2, and determines which characters of s1 couldn't be part of a non contigous substring that is 2. So after inputting
123625421454
as s1, and 254
as s2, the program would output
0 1 0 0 1 1 1 1 0 1 1 1
, where 1 means that a character can be a part of the substring, and 0 where it cannot.
This problem is really irritating me, since I couldn't find any algorithm that would find the non-contigous sequence without extremely high execution time. Is there even a good solution for this, with less or equal O than O(N)? I have tried to use hashes and dynamic programming, but no algorithms that I used were good for this circumstance.
EDIT: To clarify, the idea is that you can remove x elements of s1 (x can be 0) and get s2. Elements, which cannot be part of s2 under any circumstances, should be marked as 0, while those that can, should be marked as 1. Hope this helps.
CodePudding user response:
A key observation is, if you reach a sub-substring prefix of length k up to a position, then you can have a sub-substring of any length less than k up to that position, simply by skipping some of the tail elements. Same holds for postfix. It might sound trivial, but it leads to the solution.
So you'd like to maximize the prefix from front and postfix from tail. Basically, this means that you visit s1
once from front and once the reverse (with reverse s2
). In both cases, you'd like to maximize the substring at each point, so you simply advance in s2
whenever you can. This gives you two array of size_t
values: longest possible prefix and postfix at the given point. If the sum of those >= the length of s2
, that means that these can be joined and the result is 1
; otherwise these cannot be joined and the result is 0
.