Given a number, find the sequence to reach it-CodePudding

I am working on this challenge:

A bot can do either of the 2 instructions:

[F]orward: move ahead by 1 and double its speed

[B]ack: Stop, change direction and reset its speed.

The initial speed is 1.

Given a distance, return a sequence that will satisfy it. The sequence can only have a maximum of 3 'B's. If a solution is not possible with 3 'B's then return empty string.

For example,
findSeq(2) -> FFBF (1 2-1)
findSeq(3) -> FF (1 2)
findSeq(4) -> FFFBFF (1 2 4 - (1 2))

Here is what I have so far:

def findSeq(dist):
    curr = 0
    seq = ""
    speed = 1
    mul = 1
    
    while (curr != dist):
        seq  = 'F'
        curr  = (speed*mul)
        speed *= 2
        
        if curr > dist and mul != -1:
            seq  = 'B'
            mul = -1
            speed = 1

        elif curr < dist:
            mul = 1
        
        if curr == dist:
            return seq

        
        if seq.count('B') == 3 and curr != dist:
            return ''
    
    return seq

This works for findSeq(2), findSeq(3) and findSeq(4). However, it starts breaking for findSeq(5) -> (it gives FFFBFFFBFF)

I'm not sure what I'm missing.

CodePudding user response：

A couple of observations: A sequence of n F commands, moves the train 2^n - 1 in the current direction. Without loss of generality, all solutions are of the form F^k1 B F^k2 B F^k3 B F^k4, with k1, k2, k3, k4 >= 0 because if we don't to move the train, we can simply set some of the k's to 0.

This gives a restatement of the problem: given n, find non-negative k1, k2, k3, k4 such that (2^k1 - 1) - (2^k2 - 1) (2^k3 - 1) - (2^k4 - 1) = n.

The 1's cancel in the sum, so you need to find 2^k1 2^k3 - 2^k2 - 2^k4 = n.

If n has N bits, then without loss of generality, each of the four numbers can be at most N 1.

I'm sure there's a cleverer approach, but just doing a search of the (N 2)^4 possibilities yields a O((log n)^4) solution.

import itertools

def findSeq(n):
    N = max(1, n.bit_length())
    for k1, k2, k3, k4 in itertools.product(tuple(range(N 2)), repeat=4):
        if (1<<k1) - (1<<k2)   (1<<k3) - (1<<k4) == n:
            return 'B'.join('F' * x for x in (k1, k2, k3, k4))
    return ''

for i in range(100):
    print(i, findSeq(i))

@MattTimmermans suggests an O((logn)^2) improvement: enumerating all numbers of the form q = 2^k1 2^k3, and then seeing if q-n is of the same form.

def findSeq2(n):
    N = max(1, n.bit_length())
    b2 = {(1<<k1) (1<<k2): (k1, k2) for k1, k2 in itertools.product(tuple(range(N 2)), repeat=2)}
    for k1, k3 in b2.values():
        r = (1 << k1)   (1 << k3) - n
        if r in b2:
            k2, k4 = b2[r]
            return 'B'.join('F' * x for x in (k1, k2, k3, k4))      
    return ''

CodePudding user response：

The problem with your code was the second turn. Notice that when curr < dist you change direction but not resetting the speed and adding 'B' to the sequence. Changing

elif curr < dist:
    mul = 1

elif curr < dist and mul == -1:
    seq  = 'B'
    mul = 1
    speed = 1

Should solve your problem. In addition, you stated that the path can have 3 'B' at most, thus you need to change

if seq.count('B') == 3 and curr != dist:
    return ''

if seq.count('B') == 4 and curr != dist:
    return ''

Algorithm analysis

A few insight as to what distances this function can reach (limited by only three turns). When dist can be written as 2**n - 1 for any integer bigger than 0, this function will reach dist without any turns and the path will be 'F' * n. This because the binary representation of 2**n - 1 is a sequence of n '1'. Examples:

7 = 2**3-1 = 0b111 --> findSeq(0b111) = 'FFF'
15 = 2**4-1 = 0b1111 --> findSeq(0b1111) = 'FFFF'
31 = 2**5-1 = 0b11111 --> findSeq(0b11111) = 'FFFFF'

etc

To reach 5, which has a binary representation of 0b101 the function will do the following:

reach 7
turn around, return to 4 (which is a power of 2, 4 = 0b100)
turn around, advance to 5

print(findSeq(5))  # 'FFFBFFBF'

This means that the way your algorithm works is when starting from the Most Significant Bit (MSB) which is the leftmost-bit, for any change from 1 to 0, you will need to use 1 turn, and for another change from 0 to 1 you will need to use another turn. This happens because by the rules of the movement, a change from 1 to 0 in the binary representation, the function resets to the largest number which is smaller than dist and has a binary representation of s sequence of ones, followed by a sequence of zeros (e.g. 4 = 0b100 has 1 '1' followed by 2 '0').

This means that using your algorithm you can find a path to very large numbers, but for some small reachable numbers a path will not be found. Taking the very large number of 895, represented by 0b1101111111. You can see that there is only one change from '1' to '0' (bits 8 and 7 when the LSB is in location 0), the function will be able to find a path by:

reaching 1023 (0b1111111111)
return to 768 (0b1100000000), has 2 '1' followed by 8 '0'
advance to 895 (0b1101111111)

print(findSeq(0b1101111111))  # 'FFFFFFFFFFBFFFFFFFFBFFFFFFF'
print(findSeq(895))  # 'FFFFFFFFFFBFFFFFFFFBFFFFFFF'

Changing the dist to 892 will reults in an additional turn:

return to 892 (0b1101111100)

print(findSeq(0b1101111100))  # 'FFFFFFFFFFBFFFFFFFFBFFFFFFFBFF'
print(findSeq(892))  # 'FFFFFFFFFFBFFFFFFFFBFFFFFFFBFF'

But for relatively small numbers like 21 which is represented by 0b10101, the function will not find a path, although a valid path is given ('FFFFBFBFFF'):

Advance 4 steps to 15
Turn around, go back 1 step to 14
Turn around again, take 3 steps to 14 7 = 21

Solution suggestion

This is a nice example of graph search problem. In a graph search problem you first need to define the states. The states in this problem are a tuples where each tuple consists of the sequence and the current location, i.e. (seq, current_loc). After defining the states, any graph search algorithm can solve your problem, where the two most prominent solutions will be the BFS and DFS. Since I wanted to get the path with the shortest amount of steps, I implemented BFS. Please find more information here.

The BFS solution is as follows:

import math
def findSeq(dist):
    state_queue = []  # Pending states which have not been explored yet
    upper_lim   = 2**math.ceil(math.log2(dist))
    state = ['', 0]  # Starting state ; state is made out of the path at location [0], the 'B' count at location [1] and the distance at location [2]
    found = True
    while state[1] != dist:
        # adding states to the queue according to conditions:
        steps   = len(state[0].split('B')[-1])
        b_count = state[0].count('B')
        if 0 <= state[1] < upper_lim:  # advancing makes sense
            state_1 = [state[0]   'F', state[1]   ((-1)**b_count) * 2**steps]
            state_queue.append(state_1)
        if b_count < 3 and state[1] >= 0:
            state_2 = [state[0]   'B', state[1]]
            state_queue.append(state_2)
        # popping next state from the queue
        if len(state_queue) > 0:
            state = state_queue.pop(0)
        else:
            found = False
            break

    if found:
        return state[0]
    else:
        return ''

Please note that this is far from optimal since the states in the graph can be defined more efficiently and the queue can be implemented more efficiently, but this solves for all cases.

CodePudding user response：

I suggest an algorithm that derives the solution directly from the given number, without a search.

Observations

As others have also noted this comes down to finding terms of the form (2^a - 1) - (2^b - 1) (2^c - 1) - (2^d - 1).

Also, observe that when you have a solution, you can always append some "B" at the end, which do not change the represented distance. So we can say that exactly 3 "B" letters are needed, and all four terms (given above) are needed. This also means we can eliminate the -1 in each term, as they cancel each other.

When we look at the binary representation of those terms, we should have something like this:

0b1000000 - 0b100 0b10000 - 0b10

...where the number of trailing zeroes in those numbers is variable.

When playing with those terms it becomes clear that it is not possible to generate a value that -- in its binary representation -- has 4 groups of adjacent 1-digits (or more). For example: 0b1010101 cannot be encoded with "F" and three "B" instructions.

Case 1: the binary representation has one group of 1-bits

When there is only one adjacent group of 1-digits, it is easy.

For instance 0b11100 can be produced with FFFFFBFF. The first group of "F" is as long as there are digits, and the second group of "F" is as long as there are trailing zeroes.

Case 2: the binary representation has two groups of 1-bits

When there are two groups of 1-digits, it is also quite easy:

For instance 0b11001110 can be produced with FFFFFFFFBFFFFFFBFFFFBF. Note how each group of "F" is shorter than the previous one, and represents the number of binary digits that remain when a group of same-digits is removed from the left side. So:

11001110 translates to 8xF,
001110 to 6xF,
1110 to 4xF and
0 to 1xF.

Case 3: the binary representation has three groups of 1-bits

The case where there are three groups of 1-digits, is the most complex one. By analysing this in more depth it turns out that at least one separating group of 0 (that splits two groups of 1s) must have a length of 1 (i.e. just one separating zero). So for instance, there is no solution for 0b1001001, because both groups of 0s have size 2. Secondly, the remaining group of 1-digits that is not adjacent to this single 0, must also be single.

So: 0b1010011 has no solution either, because the right-side group (which is not adjacent to the single enclosed 0) has two 1s. But 0b11011001 has a solution, because there is an isolated 0 (between 1s) and the other remaining group of 1s is single too (at the right side).

Implementation

With these rules, it is possible to have a fast algorithm, as it does not really search for a solution, but derives it from the given number's binary representation:

import re

def encode(position):
    binary = bin(position)[2:]
    # Split in sizes of same digits. First group will consist of 1s
    sizes = list(map(len, re.findall(r"1 |0 ", binary)))
    if len(sizes) > 6:
        return ""  # Not possible
    # Create sequences of F, corresponding 
    #    to the number of binary digits that follow
    digits = ["F" * sizes[-1]]
    for size in reversed(sizes[:-1]):
        digits.append(digits[-1]   "F" * size)
    digits.reverse()
    # Simple cases:
    if len(sizes) <= 4:
        return "B".join(digits)
    # The case where there are 3 groups of 1s in the binary representation:
    digits.append("")  # Make sure that digits[5] is defined
    if sizes[0] == 1 and sizes[3] == 1:  # The isolated single zero is at group 3
        return "B".join((digits[1], digits[4], digits[2], digits[5]))
    if sizes[4] == 1 and sizes[1] == 1:  # The isolated single zero is at group 1
        return "B".join((digits[0], digits[2], digits[5], digits[3]))
    return ""  # Not possible.

Some driver code:

# Helper function to verify a solution
def decode(code):
    # Naive implementation according to rules
    position = 0
    direction = 1
    speed = 1
    for ch in code:
        if ch == "B":
            direction = -direction
            speed = 1
        else:
            position  = speed * direction
            speed *= 2
    return position


for i in range(1, 73):
    code = encode(i)
    decoded = decode(code)
    print(i, code, decoded)

The first natural number for which there is no solution, is 73, which has binary representation 0b1001001.