How can I count sequences that meet these constraints?-CodePudding

I am trying to count permutations of a sequence of I and O symbols, representing e.g. people entering (I for "in") and leaving (O for "out") a room. For a given n many I symbols, there should be exactly as many O symbols, giving a total length of 2*n for the sequence. Also, at any point in a valid permutation, the number of O symbols must be less than or equal to the number of I symbols (since it is not possible for someone to leave the room when it is empty).

Additionally, I have some initial prefix of I and O symbols, representing people who previously entered or left the room. The output should only count sequences starting with that prefix.

For example, for n=1 and an initial state of '', the result should be 1 since the only valid sequence is IO; for n=3 and an initial state of II, the possible permutations are

IIIOOO
IIOIOO
IIOOIO

for a result of 3. (There are five ways for three people to enter and leave the room, but the other two involve the first person leaving immediately.)

I'm guessing the simplest way to solve this is using itertools.permutations. This is my code so far:

n=int(input())  ##actual length will be 2*n
string=input()
I_COUNT=string.count("I")
O_COUNT=string.count("O")
if string[0]!="I":
 sys.exit()
if O_COUNT>I_COUNT:
 sys.exit()
perms = [''.join(p) for p in permutations(string)]
print(perms)

the goal is to get the permutation for whatever is left out of the string and append it to the user's input, so how can I append user's input to the remaining length of the string and get the count for permutation?

CodePudding user response：

@cache
def count_permutations(ins: int, outs: int):
    # ins and outs are the remaining number of ins and outs to process
    assert outs >= ins
    if ins == 0 :
        # Can do nothing but output "outs"
        return 1
    elif outs == ins:
        # Your next output needs to be an I else you become unbalanced
        return count_permutations(ins - 1, outs)
    else:
        # Your. next output can either be an I or an O
        return count_permutations(ins - 1, outs)   count_permutations(ins, outs - 1)

If, say you have a total of 5 Is and 5 Os, and you've already output one I, then you want: count_permutations(4, 5).

CodePudding user response：

I'm guessing the simplest way to solve this is using itertools.permutations

Sadly, this will not be very helpful. The problem is that itertools.permutations does not care about the value of the elements it's permuting; it treats them as all distinct regardless. So if you have 6 input elements, and ask for length-6 permutations, you will get 720 results, even if all the inputs are the same.

itertools.combinations has the opposite issue; it doesn't distinguish any elements. When it selects some elements, it only puts those elements in the order they initially appeared. So if you have 6 input elements and ask for length-6 combinations, you will get 1 result - the original sequence.

Presumably what you wanted to do is generate all the distinct ways of arranging the Is and Os, then take out the invalid ones, then count what remains. This is possible, and the itertools library can help with the first step, but it is not straightforward.

It will be simpler to use a recursive algorithm directly. The general approach is as follows:

At any given time, we care about how many people are in the room and how many people must still enter. To handle the prefix, we simply count how many people are in the room right now, and subtract that from the total number of people in order to determine how many must still enter. I leave the input handling as an exercise.
To determine that count, we count up the ways that involve the next action being I (someone comes in), plus the ways that involve the next action being O (someone leaves).
If everyone has entered, there is only one way forward: everyone must leave, one at a time. This is a base case.
Otherwise, it is definitely possible for someone to come in. We recursively count the ways for everyone else to enter after that; in the recursive call, there is one more person in the room, and one fewer person who must still enter.
If there are still people who have to enter, and there is also someone in the room right now, then it is also possible for someone to leave first. We recursively count the ways for others to enter after that; in the recursive call, there is one fewer person in the room, and the same number who must still enter.

This translates into code fairly directly:

def ways_to_enter(currently_in, waiting):
    if waiting == 0:
        return 1
    result = ways_to_enter(currently_in   1, waiting - 1)
    if currently_in > 0:
        result  = ways_to_enter(currently_in - 1, waiting)
    return result

Some testing:

>>> ways_to_enter(0, 1) # n = 1, prefix = ''
1
>>> ways_to_enter(2, 1) # n = 3, prefix = 'II'; OR e.g. n = 4, prefix = 'IIOI'
3
>>> ways_to_enter(0, 3) # n = 3, prefix = ''
5
>>> ways_to_enter(0, 14) # takes less than a second on my machine
2674440

We can improve the performance for larger values by decorating the function with functools.cache (lru_cache prior to 3.9), which will memoize results of the previous recursive calls. The more purpose-built approach is to use dynamic programming techniques: in this case, we would initialize 2-dimensional storage for the results of ways_to_enter(x, y), and compute those values one at a time, in such a way that the values needed for the "recursive calls" have already been done earlier in the process.

That direct approach would look something like:

def ways_to_enter(currently_in, waiting):
    # initialize storage
    results = [[0] * currently_in for _ in waiting]
    # We will iterate with `waiting` as the major axis.
    for w, row in enumerate(results):
        for c, column in enumerate(currently_in):
            if w == 0:
                value = 1
            else:
                value = results[w - 1][c   1]
                if c > 0:
                    value  = results[w][c - 1]
            results[w][c] = value
    return results[-1][-1]

CodePudding user response：

This is a dynamic programming problem.

Given the number of in and out operations, we can do the following:

If we're out of either ins or outs, we can only use operations of the other type. There is only one possible assignment.
If we have an equal number of ins or outs, we must use an in operation according to the constraints of the problem.
Finally, if we have more ins than outs, we can perform either operation. The answer, then, is the sum of the number of sequences if we choose to use an in operation plus the number of sequences if we choose to use an out operation.

This runs in O(n^2) time, although in practice the following code snippet can be made faster using a 2D-list rather than the cache annotation (I've used @cache in this case to make the recurrence easier to understand).

from functools import cache

@cache
def find_permutation_count(in_remaining, out_remaining):
    if in_remaining == 0 or out_remaining == 0:
        return 1
    elif in_remaining == out_remaining:
        return find_permutation_count(in_remaining - 1, out_remaining)
    else:
        return find_permutation_count(in_remaining - 1, out_remaining)   find_permutation_count(in_remaining, out_remaining - 1)
    
print(find_permutation_count(3, 3)) # prints 5

CodePudding user response：

The number of such permutations of length 2n corresponds to the n'th Catalan number, since valid permutations are in one-to-one correspondence with binary trees. This is easy to spot if you replace "I" with "(" and "O" with ")". For example, IIOIOO becomes (()()), a parenthesis tree with two leaf nodes. You validity condition ensures that every open parenthesis has a matching closing parenthesis after it, so we always get a well-formed binary tree.

Wikipedia gives a formula for Catalan numbers in terms of central binomial coefficients:

from math import comb

def count_permutations(n):
  return comb(2*n,n) // (n 1)

for i in range(1,10):
  print(i, count_permutations(i))

# 1 1
# 2 2
# 3 5
# 4 14
# 5 42
# 6 132
# 7 429
# 8 1430
# 9 4862