Home > OS >  Efficient approach to get all elements within a range in a 2d array?
Efficient approach to get all elements within a range in a 2d array?

Time:06-13

Consider the following 2 dimensional jagged array

[0,0] [0,1] [0,2]

[1,0] [1,1]

[2,0] [2,1] [2,2]

Lets say I want to know all elements that fall within a certain range for example [0,0]-[2,0], I would like a list of all of those elements, what would be the best approach for achieving this and are there any pre-existing algorithms that achieve this?

I have attempted to implement this in C# but didn't get much further than some for loops.

The example below better provides further detail upon what I would like to achieve.

Using the array defined above, lets say the start index is [0,0] and the end index of the range is [2,1] I would like to create a method that returns the values of all the indexes that fall within this range.

Expected results would be for the method to return the stored values for the following index. [0,0] [0,1] [0,2] [1,0] [1,1] [2,0] [2,1]

CodePudding user response:

If the 2d array is "sorted", meaning that as you go from left to right in each 1d array the y increases and as you go from up to down the x increases, you can find the first point and the last point that you need to report using binary searches in total time of O(logn), and after that report every point between those 2 points in O(k) where k is the number of points that you need to report (notice that you the tome complexity will be Omega(k) in every algorithm).

If the 2D array is not sorted and you just want to output all the pairs between pair A and pair B:

should_print = False
should_stop = False
for i in range(len(2dArray)):
    for j in range(len(2dArray[i]))
        should_print = (should_print or (2dArray[i][j] == A))
        if should_print:
            print(2dArray[i][j])
        should_stop = (2dArray[i][j] == B)
        if should_stop:
            break
    if should_stop:
        break

If you just have n general 2d points and you wish to answer the query "find me all the points in a given rectangle", there are 2 data structures that can help you - kd trees and range trees. These 2 data structures provide you a good query time but they are a bit complicated. I am not sure what is your current level but if you are just starting to get into DS and algorithms these data structures are probably an overkill.

Edit (a bit about range trees and kd trees):

First I'll explain the basic concept behind range trees. Lets start by trying to answer the range query for 1D points. This is easy - just build a bst (balanced search tree) and with it you can answer queries in O(logn k) where k is the number of points being reported. It takes O(nlogn) time to build the bst and it take O(n) space.

Now, let us try to take this solution and make it work for 2D. We will build a bst for the x coordinates of the points. For each node t in the bst denote all the points in the subtree of t by sub(t). Now, for every node t we will build a bst for the y coordinates of the points sub(t).

Now, given a range, we will for find all the subtrees contained in the x range using the first bst, and for each subtree we will find all the points contained in the y range (note that the bst corresponding the the subtree of sub(t) is saved at the node t).

A query takes O(log^2n) time. Building the DS takes O(nlog^2n) time and finally, it takes O(nlogn) space. I'll let you prove these statements. With more work the query time can be reduced to O(logn) and the building time can be reduced to O(nlogn). You can read about it here: http://www.cs.uu.nl/docs/vakken/ga/2021/slides/slides5b.pdf.

Now a word about KD trees. The idea is to split the 2D sapce in the middle with a vertical line, after that split each side in the middle with a horizontal line and so on. The query time in this DS will take O(sqrt(n)), the building time is O(nlogn) and the space it takes is O(n). You can read more about this DS here: http://www.cs.uu.nl/docs/vakken/ga/2021/slides/slides5a.pdf

  • Related