Home > Enterprise >  Is there a vectorized way to find maxes within labeled areas in NumPy?
Is there a vectorized way to find maxes within labeled areas in NumPy?

Time:08-20

I have a 2D array representing tree heights, where 0 is the ground. I have another array that's always the same size showing segmented and labeled trees, where a 0 label means ground, and a positive integer value represents a unique tree. Here are some slices of the data:

heights = array([[37.5 , 41.82, 42.18, 42.18, 42.18, 39.23, 40.68, 40.71, 40.71,
        40.19, 35.03, 41.41, 41.41, 41.41, 40.77, 32.23, 32.23, 32.23,
        31.45, 25.6 , 25.63, 30.12, 30.78, 30.78, 30.92],
       [37.5 , 37.5 , 41.82, 42.18, 41.78, 41.78, 40.68, 40.68, 40.68,
        40.19, 41.04, 41.41, 41.41, 41.41, 41.03, 32.23, 32.23, 32.23,
        31.25, 25.6 , 25.6 , 30.12, 30.12, 21.08, 30.88],
       [37.5 , 37.5 , 34.61, 41.78, 41.78, 25.6 , 39.14, 40.68, 38.79,
        38.79, 41.04, 41.04, 41.8 , 41.8 , 41.8 , 24.66, 24.66, 31.25,
        25.63, 26.24, 26.2 , 25.2 , 24.93, 21.03, 21.03],
       [34.53, 34.61, 34.61, 35.23, 35.23, 25.32, 25.32, 33.17, 33.17,
        38.86, 39.4 , 40.31, 41.8 , 41.8 , 41.8 , 41.17, 25.37, 26.77,
        27.32, 27.39, 27.39, 26.96, 25.2 , 28.68, 28.68],
       [34.53, 34.52, 36.5 , 36.58, 36.67, 36.67, 25.15, 33.17, 38.65,
        38.86, 39.4 , 39.53, 40.78, 41.17, 41.17,  0.  , 26.77, 27.09,
        27.39, 27.6 , 27.6 , 28.  , 28.16, 28.68, 28.68],
       [32.22, 36.45, 37.1 , 37.28, 37.28, 38.07, 30.98, 31.12, 38.65,
        38.65, 39.12, 39.4 , 40.78, 40.78,  0.  ,  0.  , 27.41, 27.72,
        27.72, 28.49, 28.49, 28.16, 28.34, 28.87, 28.68],
       [36.45, 37.1 , 37.1 , 37.28, 38.23, 38.23, 38.23, 33.61, 32.31,
        38.65, 38.65, 38.62, 39.01, 33.75, 34.65, 34.65, 27.41, 27.72,
        27.72, 28.49, 28.49, 28.49, 28.87, 30.31, 30.31],
       [35.71, 36.45, 37.1 , 30.96, 38.23, 38.23, 38.23, 33.61, 33.28,
        33.42, 33.5 , 33.5 , 33.51, 34.07, 34.65, 34.65, 27.36, 27.83,
        27.83, 28.49, 28.49, 28.43, 28.87, 31.82, 31.68],
       [14.44,  0.  ,  0.  ,  0.  , 21.41, 32.98, 33.61, 33.61, 34.27,
        34.8 , 34.8 , 33.5 , 33.4 , 34.07, 34.65, 34.65,  0.  , 27.83,
        27.83, 28.7 , 29.18, 29.18, 31.82, 31.82, 31.98],
       [13.46,  0.  ,  0.  , 21.41, 21.73, 31.36, 33.33, 33.33, 34.89,
        34.99, 34.99, 32.72, 33.4 , 33.8 , 33.8 ,  0.  ,  0.  ,  0.  ,
        28.7 , 28.7 , 29.64, 29.64, 31.82, 31.82, 35.82],
       [13.46,  0.  ,  0.  ,  0.  , 21.73, 31.36, 31.46, 35.81, 36.33,
        36.33, 36.33, 32.72, 33.37, 33.71, 33.71,  0.  ,  0.  ,  0.  ,
        28.7 , 29.64, 29.64, 29.77, 29.77, 29.77, 35.95],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  , 24.07, 31.57, 35.9 , 36.33,
        36.33, 36.33, 21.97, 32.72, 33.37, 33.37,  0.  ,  0.  ,  0.  ,
        28.36, 29.04, 29.64, 29.77, 29.77, 29.77, 35.95],
       [ 0.  ,  0.  ,  0.  ,  0.  , 22.09, 24.07, 23.92, 31.57, 35.9 ,
        36.33,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
        28.38, 29.53, 28.96, 28.96, 28.69, 29.19, 35.49],
       [ 0.  ,  0.  ,  0.  ,  0.  , 22.09, 22.09, 22.09,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
        29.53, 29.53, 29.82, 28.96, 28.73, 29.19, 29.19],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
        29.53, 30.12, 30.12, 29.82, 28.73,  0.  , 28.89],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  , 30.12, 30.12, 30.12, 28.94,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  , 30.12, 30.12, 29.82,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  , 28.65, 28.65,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ],
       [ 0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,
         0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ,  0.  ]], dtype=float32)
labeled_trees = array([[33, 33, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37,
        37, 37, 37, 37, 39, 39, 39, 39, 39],
       [33, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37,
        37, 37, 37, 37, 39, 39, 39, 39, 39],
       [33, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37,
        37, 37, 37, 39, 39, 39, 39, 39, 39],
       [33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37, 37,
        37, 37, 39, 39, 39, 39, 39, 39, 39],
       [33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37, 37,  0,
        39, 39, 39, 39, 39, 39, 39, 39, 39],
       [33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37,  0,  0,
        39, 39, 39, 39, 39, 39, 39, 39, 39],
       [33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37, 37,
        37, 39, 39, 39, 39, 39, 39, 39, 39],
       [33, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37, 37, 37, 37,
        37, 39, 39, 39, 39, 39, 39, 39, 39],
       [33,  0,  0,  0, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37, 37,
         0, 39, 39, 39, 39, 39, 39, 39, 39],
       [33,  0,  0, 33, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37,  0,
         0,  0, 39, 39, 39, 39, 39, 39, 39],
       [33,  0,  0,  0, 33, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37,  0,
         0,  0, 39, 39, 39, 39, 39, 39, 39],
       [ 0,  0,  0,  0,  0, 33, 33, 33, 33, 33, 33, 33, 37, 37, 37,  0,
         0,  0, 39, 39, 39, 39, 39, 39, 39],
       [ 0,  0,  0,  0, 33, 33, 33, 33, 33, 33,  0,  0,  0,  0,  0,  0,
         0,  0, 39, 39, 39, 39, 39, 39, 39],
       [ 0,  0,  0,  0, 33, 33, 33,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0, 39, 39, 39, 39, 39, 39, 39],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0, 39, 39, 39, 39, 39,  0, 39],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0, 39, 39, 39, 39,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0, 39, 39, 39,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0, 39, 39,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0],
       [ 0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,  0,
         0,  0,  0,  0,  0,  0,  0,  0,  0]], dtype=int32)

I'd like to find the max height within each labeled region. I have done this successfully with a for loop, but it's slow.

max_heights = {}
for label in list(np.unique(labeled_trees))[1:]:
    tree_height = np.amax(heights[labeled_trees == label])
    max_heights[str(label)] = tree_height

# max_heights = {'33': 42.18, '37': 41.8, '39': 35.95}

Is there a faster/vectorized/more efficient way of finding the max values within labeled regions of a numpy array? The ideal output would be a boolean array where the location of each max is True.

CodePudding user response:

This returns the ideal output you need, but it is not fast enough. On my machine, it needs about 60 µs:

def max_mask(labeled_trees, heights):
    cmp = labeled_trees.ravel()[None] != np.unique(labeled_trees)[1:, None]
    indices = np.ma.masked_array(np.broadcast_to(heights.ravel(), cmp.shape), cmp).argmax(-1)
    ret = np.zeros(heights.size, bool)
    ret[indices] = True
    return ret.reshape(heights.shape)

Some explanations:

  1. The first step is to use broadcast to return the comparison result of each value of np.unique(heights)[1:] with labeled_trees.ravel(), which will be a 3d array with the shape of (np.unique(heights)[1:].size, labeled_trees.size). The equivalent code is given below:
cmp = np.array([labeled_tree.ravel() != elem for elem in np.unique(heights)[1:]])
  1. The second step is to flatten the heights and broadcast it as the shape of cmp as the value of np.ma.masked_array, cmp as the mask, and then find argmax for the mask array, which will find out the position of the maximum value of the valid part for each sub array. The equivalent code is given below:
indices = np.array([np.ma.masked_array(heights, mask).argmax() for mask in cmp])
  1. The remaining steps are very simple. We have already got the position of the maximum value of each unique value range of heights. Just create a bool array of the same size and set the corresponding position to True, finally reshape and return.

Test:

>>> max_mask(labeled_trees, heights)
array([[...]])    # A bool array with shape (25, 25)
>>> heights[max_mask(labeled_trees, heights)]
array([42.18, 41.8 , 35.95])

CodePudding user response:

Check Below pure numpy implementation using enter image description here

  • Related