Find Minimum Score Possible

Problem statement:

We are given three arrays A1, A2, A3 of lengths n1, n2, n3. Each array contains zero or more natural numbers (i.e., > 0). These numbers denote program execution times.
At each step we choose any non-empty array, execute the program at its front (its first element), and remove that element from the array. This repeats until every array is empty.

For example:

if A1=[3,2] (n1=2),  
   A2=[7] (n2=1),  
   A3=[1] (n3=1)

then we can execute programs in various orders like [1,7,3,2] or [7,1,3,2] or [3,7,1,2] or [3,1,7,2] or [3,2,1,7] etc.

Now if we take S=[1,3,2,7] as the order of execution, the waiting time of each program would be:
for S[0] waiting time = 0, since it is executed immediately,
for S[1] waiting time = 0 + 1 = 1, taking the previous time into account, and similarly,
for S[2] waiting time = 0 + 1 + 3 = 4
for S[3] waiting time = 0 + 1 + 3 + 2 = 6

Now the score of an order is defined as the sum of all waiting times = 0 + 1 + 4 + 6 = 11. This is the minimum score we can get from any order of execution.
Our task is to find this minimum score.
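
To make the scoring rule concrete, here is a minimal sketch that computes the score of one given execution order. The name schedule_score is only for illustration and does not come from the problem statement.

```python
from itertools import accumulate
from typing import List


def schedule_score(order: List[int]) -> int:
    """Sum of waiting times when programs run in the given order."""
    # The waiting time of job k is the total duration of jobs 0..k-1,
    # so drop the last running total and prepend 0 for the first job.
    finish_times = list(accumulate(order))
    waits = [0] + finish_times[:-1]
    return sum(waits)


print(schedule_score([1, 3, 2, 7]))  # 0 + 1 + 4 + 6 = 11
```

This only evaluates a single order; finding the best order is the actual problem.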

How can we solve this problem? I tried picking the minimum of the three front elements each time, but that is not correct: it gets stuck when two or three of the front elements are equal.

One more example:
if A1=[23,10,18,43], A2=[7], A3=[13,42] minimum score would be 307.

CodePudding user response：

The simplest way to solve this is with dynamic programming, which runs in cubic time for three arrays (one state per combination of suffix lengths).

For each array A: Suppose you take the first element from array A, i.e. A[0], as the next process. Your total cost is the wait-time contribution of A[0] (i.e., A[0] * (total_remaining_elements - 1)), plus the minimal wait time sum from A[1:] and the rest of the arrays.

Take the minimum cost over each possible first array A, and you'll get the minimum score.
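
For the three-array case, the idea above can be written as a recurrence. The notation is my own, not the question's: f(i, j, k) is the minimum score for the suffixes A1[i:], A2[j:], A3[k:], and r is the number of programs still remaining in that state.

```latex
f(n_1, n_2, n_3) = 0, \qquad
f(i, j, k) = \min
\begin{cases}
A_1[i]\,(r - 1) + f(i + 1, j, k) & \text{if } i < n_1 \\
A_2[j]\,(r - 1) + f(i, j + 1, k) & \text{if } j < n_2 \\
A_3[k]\,(r - 1) + f(i, j, k + 1) & \text{if } k < n_3
\end{cases}
\qquad r = (n_1 - i) + (n_2 - j) + (n_3 - k)
```

The answer is f(0, 0, 0). There are (n1 + 1)(n2 + 1)(n3 + 1) states with constant work each, which is where the cubic bound comes from.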

Here's a Python implementation of that idea. It works with any number of arrays, not just three.

import functools
from typing import List, Tuple


def dp_solve(arrays: List[List[int]]) -> int:
    """Given a list of arrays of dependent processing times, return the
    smallest sum of waiting times over all valid job orders."""

    arrays = [x for x in arrays if len(x) > 0]  # Remove empty

    @functools.lru_cache(100000)
    def dp(remaining_elements: Tuple[int, ...],
           total_remaining: int) -> int:
        """Returns minimum wait time sum when suffixes of each array
        have lengths in 'remaining_elements' """
        if total_remaining == 0:
            return 0
        rem_elements_copy = list(remaining_elements)
        best = 10 ** 20

        for i, x in enumerate(remaining_elements):
            if x == 0:
                continue
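            # Every job still waiting (all but this one) is delayed by this job
            cost_here = arrays[i][-x] * (total_remaining - 1)
            if cost_here >= best:
                continue  # Cannot improve, since dp() is never negative
            rem_elements_copy[i] -= 1
            best = min(best,
                       dp(tuple(rem_elements_copy), total_remaining - 1)
                       + cost_here)
            rem_elements_copy[i] += 1  # Restore for the next candidate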

        return best

    return dp(tuple(map(len, arrays)), sum(map(len, arrays)))
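
If you want to sanity-check any solver on small inputs, a brute-force search over every valid interleaving is a handy reference. This is only a sketch for testing (the name brute_force_min_score is mine), and it takes exponential time, so keep the inputs tiny.

```python
from typing import List, Sequence


def brute_force_min_score(arrays: Sequence[Sequence[int]]) -> int:
    """Try every valid interleaving; exponential, for tiny inputs only."""
    best = None

    def explore(positions: List[int], elapsed: int, score: int) -> None:
        nonlocal best
        if all(p == len(a) for p, a in zip(positions, arrays)):
            # Every array is exhausted: record the score of this order.
            if best is None or score < best:
                best = score
            return
        for i, arr in enumerate(arrays):
            if positions[i] < len(arr):
                job = arr[positions[i]]
                positions[i] += 1
                # The job starts after `elapsed`, which is its waiting time.
                explore(positions, elapsed + job, score + elapsed)
                positions[i] -= 1

    explore([0] * len(arrays), 0, 0)
    return best


print(brute_force_min_score([[3, 2], [7], [1]]))  # 11
```

It reproduces the two scores quoted in the question (11 and 307), so it can serve as a baseline when comparing faster approaches.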

Better solutions

The naive greedy strategy of 'smallest first element' doesn't work, because it can be worthwhile to run a long job first in order to unlock much shorter jobs behind it in the same list, as the example of

A1 = [100, 1, 2, 3], A2 = [38], A3 = [34],
best solution = [100, 1, 2, 3, 34, 38]

by user3386109 in the comments demonstrates.
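
To see the gap on that example, here is a small sketch of the naive strategy (always run the smallest available front job). The function name is hypothetical and only used for this illustration.

```python
from typing import List


def naive_greedy_score(arrays: List[List[int]]) -> int:
    """Score obtained by always running the smallest available front job."""
    fronts = [0] * len(arrays)  # index of the next job in each array
    elapsed = 0
    score = 0
    while True:
        candidates = [i for i, a in enumerate(arrays) if fronts[i] < len(a)]
        if not candidates:
            return score
        i = min(candidates, key=lambda c: arrays[c][fronts[c]])
        score += elapsed  # waiting time of the chosen job
        elapsed += arrays[i][fronts[i]]
        fronts[i] += 1


# Naive order is [34, 38, 100, 1, 2, 3] with score 626;
# the best order [100, 1, 2, 3, 34, 38] scores 550.
print(naive_greedy_score([[100, 1, 2, 3], [38], [34]]))  # 626
```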

A more refined greedy strategy does work. Instead of looking only at the first element, consider every possible prefix of each array. We pick the array whose best prefix has the smallest average process time, and perform all the processes in that prefix in order.

A1              = [    100,           1,               2,                   3]

Prefix averages = [(100)/1, (100 + 1)/2, (100 + 1 + 2)/3, (100 + 1 + 2 + 3)/4]
                = [  100.0,        50.5,          34.333,                26.5]

A2=[38]

A3=[34]

Smallest prefix average in any array is 26.5, so pick
the prefix [100, 1, 2, 3] to complete first.

Then [34] is the next prefix, and [38] is the final prefix.
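
The prefix comparison in this walk-through can be sketched as a small helper. The name min_avg_prefix is my own, and the helper assumes a non-empty array.

```python
from itertools import accumulate
from typing import List, Tuple


def min_avg_prefix(arr: List[int]) -> Tuple[float, int]:
    """Return (average, length) of the non-empty prefix with the smallest average.

    Assumes arr is non-empty; ties go to the shortest prefix.
    """
    averages = [total / length
                for length, total in enumerate(accumulate(arr), 1)]
    best_len = min(range(1, len(arr) + 1), key=lambda n: averages[n - 1])
    return averages[best_len - 1], best_len


for arr in ([100, 1, 2, 3], [38], [34]):
    print(arr, min_avg_prefix(arr))
# [100, 1, 2, 3] (26.5, 4)
# [38] (38.0, 1)
# [34] (34.0, 1)
```

Picking the array with the smallest returned average, running that many jobs from its front, and repeating gives the order described above.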

And here's a rough Python implementation of the greedy algorithm. It computes subarray averages in a completely naive, brute-force way, so the algorithm is still quadratic (but an improvement over the dynamic programming method). Also, for ease of coding, it builds the schedule from the back by repeatedly removing the 'maximum-average suffix' instead of the 'minimum-average prefix', but the two strategies are equivalent.

import math
from typing import List, Tuple


def greedy_solve(arrays: List[List[int]]) -> int:
    """Given a list of arrays of dependent processing times, return the
    smallest sum of waiting times over all valid job orders."""

    def max_suffix_avg(arr: List[int]) -> Tuple[float, int]:
        """Given arr, return value and length of max-average suffix"""
        if len(arr) == 0:
            return (-math.inf, 0)
        best_len = 1
        best = -math.inf
        curr_sum = 0.0
        for i, x in enumerate(reversed(arr), 1):
            curr_sum += x
            new_avg = curr_sum / i
            if new_avg >= best:
                best = new_avg
                best_len = i

        return (best, best_len)

    # Copy the non-empty arrays: they are consumed with pop() below
    arrays = [list(x) for x in arrays if len(x) > 0]
    if not arrays:
        return 0
    total_time_sum = sum(sum(x) for x in arrays)

    my_averages = [max_suffix_avg(arr) for arr in arrays]
    total_cost = 0

    while True:
        largest_avg_idx = max(range(len(arrays)),
                              key=lambda y: my_averages[y][0])

        _, n_to_remove = my_averages[largest_avg_idx]
        if n_to_remove == 0:
            break

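        # Jobs are removed from the back, so each popped job runs last among
        # those remaining and waits for everything still in the arrays.
        for _ in range(n_to_remove):
            total_time_sum -= arrays[largest_avg_idx].pop()
            total_cost += total_time_sum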

        # Recompute the changed array's avg
        my_averages[largest_avg_idx] = max_suffix_avg(arrays[largest_avg_idx])

    return total_cost