Home > database >  Minimum Cost to Create String of 'A's by deleting from the side or the middle
Minimum Cost to Create String of 'A's by deleting from the side or the middle

Time:09-26

Suppose I have a string of A's and B's. I can either remove a character from either end of the string for a cost of 1, or I can remove any character from the middle for a cost of 2. What's the minimum cost to create a string of only A's?

For example, if I have the string "BBABAA", then the minimum cost to remove is 4, since I would remove the B's from the left for a cost of 2, then remove the singular B from the right for a cost of 2, resulting in a combined cost of 4.

I attempted a top-down solution with caching below:

def MRC_helper(s, count, cache):
    if count == 0:
        cache[s] = 0
        cache[s[::-1]] = 0
        return 0
    if s in cache:
        return cache[s]

    min_cost = len(s)
    for i in range(len(s)):
        new_count = count - int(s[i] == 'B')
        new_s = s[:i] s[i 1:]
        if i == 0 or i == len(s)-1:
            min_cost = min(min_cost, 1   MRC_helper(new_s, new_count, cache))
        elif s[i] == 'B':
            min_cost = min(min_cost, 2   MRC_helper(new_s, new_count, cache))

    cache[s] = min_cost
    cache[s[::-1]] = min_cost
    return min_cost

def minRemovalCost(s):
    min_cost = MRC_helper(s, s.count('B'), {})
    return min_cost

The idea is pretty simple: for each character that you can remove, try to remove it and calculate the cost to transform the resulting substring. Then calculate the minimum cost move and cache it. I also cache the string forwards and backwards since it's the same thing. My friend said that it could be done greedily, but I disagree. Can anybody shed some light on a better solution than mine?

CodePudding user response:

All Bs must be removed. If we remove one B from the middle, we split the string into two sections. For the section on the right, none of the characters could be deleted from the left since otherwise we would have deleted the B from the left at a lower cost for that deletion. Mirror for the section on the left. A section that cannot have deletions from one side can be solved by iterating from the side that allows deletion and at each point comparing the costs of that deletion vs a middle deletion. Precalculate the latter and meet in the optimal middle.

Example 1:

            BBABAA
              x
cost_l: --> 122444
cost_r:     642200 <--
               x
               
optimal:     22    = 4
              22   = 4
               40  = 4
                40 = 4
                 4 = 4

Example 2:

            ABBA
              x
cost_l: --> 0233
cost_r:     3320 <--
               x
               
optimal:       3 = 3
              30 = 3
            03   = 3
            3    = 3

In case it's unclear, the one-sided cost calculation is:

cost_l(0) ->
  0 if str[0] == 'A' else 1

cost_l(i) ->
  if str[i] == 'B':
    min(2   cost_l(i-1), i   1)
  else:
    cost_l(i - 1)

(Mirror for cost_r.)

CodePudding user response:

Let C be an array where C[i] Is the optimal cost for the substring s[0:i 1](assuming right sise Is exclusive].

Consider another game, exactly the same, EXCEPT eating from the right costs 2 instead of 1. Let D be the associated cost array of this new game, this cost is helpful to keep track of situations where we restrict ourselves to not eating the last element of a substring.

We compute C and D in a single pass loop as follows: Values for first character are:

C[0] = int(s1=='B' )# (1 if It Is B, 0 otherwise)
D[0] = int( s1=='B')

For i 1, if we eat the last character, the cost is 1 C[i] (we must eat last character if It is 'B') If we don't eat the last character,the cost equals D[i] since we can't eat from the right. So

C[i 1]=C[i] 1 if s[i 1]=='B' else min(C[i] 1,D[i])

and similarly for the second game, if we eat the last character, then either we pay 2 from eating it from the right, or we have to eat everything to the left of it as well. So

D[i 1]=min(D[i] 2, i 1) if s[i 1]=='B' else D[i].

So we iterate over i, and output C[len(s)]. This Is O(len(s)) in time and memory, although if we only keep the costs from the previous iteration It becomes constant in memory(unrecommended since we can no longer recover the optimal policy by backtracking).

  • Related