Home > database >  How can I update a range within an array with a sequence
How can I update a range within an array with a sequence

Time:08-14

Given an array of values, how can I update a range with a sequence within that array, efficiently?
Updates are performed multiple times. After all updates are performed, we can query any index of the array for its final value.

If we update a value of v at index i, every element at index j is increased with a value of
max { v - | i - j | , 0 }

For example.

array = {1,1,1,1,1,1}  

Now I do an update at index 4 with a value of 3 the resulting array will look like this:

array = {1,1,2,3,4,3}  


I want to perform both operations efficiently.

CodePudding user response:

You can't update a range of elements "efficiently". Questions like these are always about figuring out how to avoid updating a range of elements altogether.

To figure out this one, consider two operations:

  • INTEGRATE(A) takes an array and replaces every element A[i] with sum(A[0]...A[i]).
  • DIFF(A) takes an array and replaces every element with its difference from the previous element (the first element is left unaltered).

These operations have some important properties:

  • They are inverses: INTEGRATE(DIFF(A)) = DIFF(INTEGRATE(A)) = A for all arrays A; and
  • They are linear: If A = B C, then INTEGATE(A) = INTEGRATE(B) INTEGRATE(C), and similarly for DIFF.

Your final array is the sum of the original array, plus a whole bunch of those "triangle" arrays. Let's say it's A T1 T2 T3... etc.

Each one of those triangles has a whole bunch of non-zero elements, but watch what happens when you apply DIFF twice:

[0,0,1,2,3,2,1,0,0] -> [0,0,1,1,1,-1,-1,-1,0] -> [0,0,1,0,0,-2,0,0,1]

The result has only 3 non-zero elements. That gives us a way to calculate your final array quickly.

Let D(X) = DIFF(DIFF(X)) and let I(X) = INTEGRATE(INTEGRATE(X)). Then instead of calculating A T1 T2 T3..., you calculate I( D(A) D(T1) D(T2) D(T3)... )

Since all those D(Tx) have at most 3 non-zero elements, it's quick and easy to add them into the result.

CodePudding user response:

I'm deliberately explaining how to solve it, without giving you full code. This also handles the complex case of interleaved updates and lookups, but therefore is more complex than what Matter Timmermans came up with.

You obviously can't use an array as your representation. It makes lookups fast, but an update with value k will be an O(k) operation.

Our second try, is to just have a list of the updates. Now updates are O(1), but after m updates a lookup is O(m).

What we need is to have a way to store updates such that both adding an update and doing a lookup are fast.

The first step is to change an update from "update at a value" to "update a range by a linear rule". That is currently you say:

update at 4 by 3

Instead we'd say:

from 2 to 3:
    update by x - 2
from 4 to 5:
    update by 7 - x

This isn't yet a win. But it becomes one when you rewrite the ranges in terms of a standard set of intervals. First the original array

from 0 to 5 1   0x

Now the array after update:

from 0 to 5, 1   0x  
    from 2 to 3, -1   x
    from 4 to 5, 7 - x

This can be represented compactly in 2 arrays:

m = [0, 0,  1, 0, -1, 0]
b = [1, 0, -1, 0,  7, 0]

And as complicated as it feels, now both updates and lookups wind up with O(log(n)) work.

For example for a lookup:

def rising_binary (n):
    power = 1
    m = 0
    while m < n:
        if n & power:
            m  = power
            yield m
        power *= 2

...

answer = 0
for bin in rising_binary(k):
    answer  = m[bin] * k   b[bin]
  • Related