Home > OS >  Find all possible subarrays of an array and sum them
Find all possible subarrays of an array and sum them

Time:02-11

Let's suppose the array is A={1,2,3} , now get all subarrays for this array. For each sub-array find the minimum in that sub-array, also find the sum of items in that sub-array. Finally add all these values. The input cannot be sorted as I want all possible subarrays.

Example:

Possible sub-arrays are:

{1} - min = 1, sum = 1 => min* sum = 1
{1,2} - min = 1, sum = 3 => min* sum = 3
{1,2,3} - min = 1, sum = 6 => min* sum = 6
{2} - min = 2, sum = 2 => min* sum = 4
{2,3} - min = 2, sum = 5 => min* sum = 10
{3} - min = 3, sum = 3 => min* sum = 9

Finally add all these values to get the result = 1 3 6 4 10 9 = 33.

constraints: array elements can range from 1 to 1000_000_000. Array size from 1 to 100_000. Return output as module 7 1000_000_000.

Here is my program with O(n^2). I want a better algorithm with lesser time complexity.

public int program(int[] A, int n) {
    int M = 7   1000_000_000;
    long total = 0;
    for (int i = 0; i < n; i  ) {
        long sum = 0;
        int min = A[i];
        for (int j = i; j < n; j  ) {
            int a = A[j];
            sum= (sum   a) % M;
            min = Math.min(min, a);
            total = (total   (min * sum) % M) % M;
        }
    }
    return (int) total;
}

CodePudding user response:

This problem can be solved in O(n) time and space.

The technique is to use the algorithm for all nearest smaller values to find the largest array whose minimum element is x, for all elements x. This first step has been asked before as a separate post; the method and explanations here are partially borrowed from my answer there.

To avoid problems with duplicates, we'll say that x is the minimum element of a subarray when it's the leftmost minimum, i.e. there is no previous element as small as x nor any future element strictly smaller than x within its subarray.

Using the algorithm for all nearest smaller values, for each index i with 0 <= i < n, we compute the index of the previous smaller or equal value to A[i] (or -1 if none exists), and the index of the next smaller value to A[i] (or n if none exists). This is equivalent to assuming that we've prepended -infinity and appended infinity to A, so I'll assume we've done so (without actually modifying A).


Now, we have to compute each element's individual contribution to the total sum. We will break this into two pieces: the contribution of A[i] from subarrays whose minimum is located at or left of i, and the contribution of A[i] from subarrays whose minimum is located strictly after i.

 1. For each position i in the array A, 
    let M = (m_0, m_1, ... m_r) be the indices of the 
    ordered sequence of (weak) minima we see, starting
    with m_0 = -1 and ending at m_r = i.
For example, if 
A = (5, 3, 9, 4, 6, 8, 10, 1, 2, 7)

then the sequence for A[5]==8 would be:
M = -1, 1, 3, 4, 5

corresponding to the minima 
-infinity, 3, 4, 6, 8

seen walking left from A[5]==8.
 2. Let C(m_j, i), 1 <= j <= r, be the count of 
    subarrays of A that have m_j as their minimum
    and contain A[i]. We can also let D(m_j, i) be
    this count of arrays times the shared minimum
    of the arrays: D(m_j, i) = A[m_j]*C(m_j, i).
 3. The contribution of A[i] to the total sum 
    from subarrays whose minimum is located 
    at or before i is the 
    sum over j, 1 <= j <= r, of D(m_j, i)*A[i].
    Note that this sum excludes the 
    fake minimum of -infinity

If the indices of the next_smaller element and previous_smaller_or_equal element to i have been computed, then

C(i, i) = (next_smaller[i] - i) 
        * (i - previous_smaller_or_equal[i]).

Now, what I've described so far is still an O(n^2) algorithm. The trick is that we can compute C(m_j, i 1) from C(m_j, i), and that we don't need to actually compute the individual values of C for previous minima; only the sum mentioned at the end of (3) in an above code block.

C(m_j, i 1) = C(m_j, i) - (m_j-m_{j-1}) 
                     if A[i 1] >= A[m_j], and

            = 0 
                     if A[i 1] < A[m_j].

This means that the values of our C function decrease at a fixed, linear rate independent of i until we reach a strictly smaller number than A[m_j], at which point the sum contribution from m_j has decreased to zero.

That suggests that we should keep a (weakly) monotonic stack as we scan through A, storing the (weak) minima we've seen from left to right together with the fixed linear rate at which its contribution to our D(m_j, i) function sum is decreasing. At A[i], we perform the linear decrease to our running sum, pop all elements from the top of the stack that are strictly greater than A[i] (also removing their contribution to the running sum's decrease), add A[i] * running sum to our total, and push A[i] on top of the stack.

Since each element of A is added and removed from the stack at most once, this runs in O(n) time.


The contribution of A[i] to the total sum from subarrays whose minimum is located strictly after i is computed completely analogously going right-to-left, except we keep a strictly monotonic stack, and don't count the contribution of A[i] from subarrays which have A[i] as a minimum; these have already been counted.


Here's the full code in Java, (including a basic Pair class, for brevity/compatibility; only element access is required)

public static int sub_sums(int[] A, int n) {
    /* Given an array of integers A, compute the
    sum over all subarrays B of min(B)*sum(B)
    by using nearest smaller values

    Run-time: O(n), and O(n) extra space
    */
    int MOD = 7   1000_000_000;

    var previous_smaller_or_equal = new int[n];
    var next_smaller = new int[n];

    for (int i = 0; i < n; i  ) {
        previous_smaller_or_equal[i] = i;
        int j = i-1;
        while (j >= 0 && A[j] > A[i]){
            j = previous_smaller_or_equal[j];
        }
        previous_smaller_or_equal[i] = j;
    }

    for (int i = n-1; i >= 0; i--) {
        next_smaller[i] = i;
        int j = i 1;
        while (j < n && A[j] >= A[i]){
            j = next_smaller[j];
        }
        next_smaller[i] = j;
    }

    long total = 0;
    long running_sum = 0;
    long running_sum_decrease = 0;

    var s = new Stack<Pair<Integer, Integer>>();

    for (int i = 0; i < n; i  ) {
        // Contributions of this element as minimum
        running_sum  = (A[i] 
                     * (next_smaller[i] - i) 
                     * (i - previous_smaller_or_equal[i]));

        running_sum -= running_sum_decrease;

        while (!s.empty() && s.peek().first > A[i]){
            var last = s.pop();
            running_sum_decrease -= last.second;
        }
        total  = A[i] * running_sum;

        running_sum_decrease  = (A[i] 
                              * (i - previous_smaller_or_equal[i]));

        s.push(new Pair<Integer, Integer>(A[i],
                        A[i] * (i - previous_smaller_or_equal[i])));
    }

    s.clear();
    running_sum = 0;
    running_sum_decrease = 0;

    for (int i = n-1; i >= 0; i--) {

        running_sum -= running_sum_decrease;
        total  = A[i] * running_sum;

        // Contributions of this element as minimum
        // Added afterwards now, since forward-pass already counted once
        running_sum  = ((A[i] 
                      * (next_smaller[i] - i)
                      * (i - previous_smaller_or_equal[i])));

        while (!s.empty() && s.peek().first >= A[i]){
            var last = s.pop();
            running_sum_decrease -= last.second;
        }

        running_sum_decrease  = A[i] * (next_smaller[i] - i);

        s.push(new Pair<Integer, Integer>(A[i],
                                          A[i] * (next_smaller[i] - i)));
    }

    return (int) (total % MOD);
}

class Pair<F, S> {
    public final F first;
    public final S second;

    public Pair(F first, S second) {
        this.first = first;
        this.second = second;
}}

CodePudding user response:

This problem can be solved in O(n log n).

Basic idea

In this explanation, a lot of off-by-one errors probably sneaked in, but it does not matter for the idea. Let m be the minimum of the array. Then any subarray that contains m would have m as a minimum, and every other sub-array would be either entirely before or entirely after m. Let n be the number of elements in the array, and let us treat m as an index. Then for every i < m let's count the subarrays that contain i'th element and have m'th element as a minimum. It is exactly i * (n - m), as there are i possible left edges before i and (n - m) possible edges after m. Each of such arrays would have A[m] as a minimum, and each element with i < m would contribute to the resulting sum with a factor of i * A[m] * (n - m). For i > m with similar reasoning we would get (n - i) * A[m] * m. Then we would update the array before m and after m recursively and then just loop over to get the result.

The problem is in the update. To add these i * A[m] * (n - m) to every possible i would result in a quadratic solution. But as one can notice, both sides are linear functions with respect to i, and this is an important property. Let's rewrite the function for the right side as i * -A[m] * m n * A[m] * m. So we would use a * i b expression for every index. If we have two linear functions a * i b and c * i d, their sum would, obviously, be (a c) * i (b d) - also a linear function. To conclude this optimization we would use a recursive function with two parameters: subarray currently processed and linear function to apply to each element.

The last problem to solve is finding the minimum in the subarray, but that is known as RMQ problem and can be solved in O(n log n).

The function

To speed things up we would not create subarrays, of course, but pass indices. The pseudocode for what we would like to achieve:

//This uses inclusive l and exclusive r
public int program(int[] A, int l, int r, int a, int b) {
  if (r - l == 0) {
    //array of size 0 can appear in recursive calls
    //if smallest element is first or last in its subarray
    return 0;
  }
  if (r - l == 1) {
    //for size one array there exists only one possible subarray
    //we also need to add the linear function
    return A[l] * A[l]   A[l] * (a * l   b);
  }
  //This is the call to your RMQ data structure
  int m = findMinIdxInSubArray(A, l, r);
  int result = 0;

  //to the left function would be (i   1 - l) * A[m] * (r - m) which expands to
  //i * A[m] * (r - m)   (1 - l) * A[m] * (r - m)
  result  = program(A, l, i, a   A[m] * (r - m), b   (1 - l) * A[m] * (r - m));
  
  //to the right function would be (r - i) * A[m] * (m - l   1) wich expands to
  //i * -A[m] * (m - l   1)   r * A[m] * (m - l   1)
  result  = program(A, m   1, r, a - A[m] * (m - l   1), b   r * A[m] * (m - l   1));

  //This is the case for the element m itself,
  //as it is not accounted in the previous calculations
  //the linear function also needs to be accounted here
  result  = A[m] * (m - l   1) * (r - m)   A[m] * (a * m   b);
  return result;
}

Now your old function would be written like:

public int program(int A[], int n) {
  //initialize your RMQ data structure here
  return program(A, 0, n, 0, 0);
}

Note that this implementation lacks the findMinIdxInSubArray function, as i mentioned, any RMQ data structure can be used here. Feel free to use anything from Fenwick tree to sparse table.

Example To explain this formulas a little bit better i would trace the execution of this program for an array A={3,2,1,4}

When the function is first called it has subarray from 0 inclusive to 4 not inclusive and no elements are accounted for now. The minimum of this subarray is element 1 at index 2, and m=2 therefore.

1 is minimum for all subarrays [i,j] such that i <= 2 and 2 <= j.

For element 4 to the right of 1 it appears in 3 arrays ([0,3], [1,3], [2,3]), and recursive call with function -3 * i 12 would be called. In this function call it is the array of size one, so we would account for subarray {4} with 16 and as it is index 3, it's linear function value is -3 * 3 12 = 3, so we would also add 4 * 3 = 12 to account for the arrays from the first function call.

For element 1 itself, it appears in 6 arrays ([0,2],[0,3],[1,2],[1,3],[2,2] and [2,3]) and in the last formula would be added as 1 * 3 * 2 as expected, contributing 6 to the total sum.

Now for the harder array to the left. Element 3 appears in exactly 2 such arrays ([0,2] and [0,3]) and would contribute with a factor of 2 (2 arrays multiplied by minimum value 1 each) and element 2 appears in exactly 4 such arrays([0,2], [0,3], [1,2] and [1,3]). For these two elements, we would call a program with a linear function of 2*i 2, which, as one can see, matches their contribution.

In this recursive call, we would find that 2 at index 1 is the smallest element. To the right of it, there are no elements, so the recursive call would return 0. 2 itself appears in two subarrays ([0,1] and [1,1]) and from them, it would contribute 8 (2 arrays value 2 in sum and 2 is minimum) to the total sum. But linear function mentions that we should also add 21 1=4 arrayminimums from the previous levels, which matches our computations previously with the contribution of 8. So for 2, we would add a total of 16 in this iteration.

To the left of the two, there is only one element, it is three, it appears in only one array, so its contribution factor should increase by 2 (one array, but minimum is two). The function passed recursively would be (2*i 2) (2*i 2) = (4*i 4). When we would process the subarray of one element 3 we would add contribution of 9 from the subarray {3} and another 3*(4*0 4) to account for arrays on previous levels, and total contribution would be another 21.

Therefore, recursive function would yield a sum of 21 16 6 28=71. Now lets validate this result by hand:

{3} - min=3, sum=3, min*sum=9
{3,2} - min=2, sum=5, min*sum=10
{3,2,1} - min=1, sum=6, min*sum=6
{3,2,1,4} - min=1, sum=10, min*sum=10
{2} - min=2, sum=2, min*sum=4
{2,1} - min=1, sum=3, min*sum=3
{2,1,4} - min=1, sum=7, min*sum=7
{1} - min=1, sum=1, min*sum=1
{1,4} - min=1, sum=5, min*sum=5
{4} - min=4, sum=4, min*sum=16

Total: 9 10 6 10 4 3 7 1 5 16=71

  • Related