Finding spots in a numpy matrix

I have the following matrix of 0s and 1s, in which I want to identify the spots (groups of elements with the value 1 that are connected to each other).

M = np.array([[1,1,1,0,0,0,0,0,0,0,0],
              [1,1,1,0,0,0,0,0,0,1,1],
              [1,1,1,0,0,0,0,0,0,1,1],
              [1,1,1,0,0,1,1,1,0,0,0],
              [0,0,0,0,0,1,1,1,0,0,0],
              [1,1,1,0,1,1,1,1,0,0,0],
              [1,1,1,0,0,1,1,1,0,0,0],
              [1,1,1,0,0,1,1,1,0,0,0]])

In the matrix there are four spots.

An example of my desired output looks like this:

spot_0 = array[(0,0),(0,1), (0,2), (1,0),(1,1), (1,2), (2,0),(2,1), (2,2), (3,0),(3,1), (3,2)]
Nbr_0 = 12
Top_Left = (0, 0)
and the same process applies to the other 3 spots.

Does anyone know how I can identify each spot, along with its number of elements and its top-left element, using numpy functions? Thanks

CodePudding user response：

You can use connected-component labeling to find the spots. Then you can use np.max to find the number of components and np.argwhere to find the locations of each component. Here is an example:

import numpy as np
# OpenCV provides a similar function
from skimage.measure import label

components = label(M)
# array([[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
#        [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
#        [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
#        [1, 1, 1, 0, 0, 3, 3, 3, 0, 0, 0],
#        [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0],
#        [4, 4, 4, 0, 3, 3, 3, 3, 0, 0, 0],
#        [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
#        [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0]])

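for i in range(1, np.max(components) + 1):
    spot_i = np.argwhere(components == i)  # (row, col) pairs of component i
    Nbr_i = len(spot_i)                    # number of elements in the component
    Top_Left_i = spot_i[0]                 # first element in row-major order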
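For intuition about what a labeling step computes, here is a minimal flood-fill sketch that uses only NumPy and the standard library. The function name label_flood_fill is hypothetical, and this is not how skimage implements label; it only assumes 8-connectivity (diagonal neighbours count as connected) and numbers the spots in row-major scan order.

```python
from collections import deque

import numpy as np


def label_flood_fill(mask):
    """Label 8-connected groups of nonzero cells, numbered in scan order.

    Hypothetical helper for illustration only.
    """
    rows, cols = mask.shape
    labels = np.zeros(mask.shape, dtype=int)
    current = 0
    for r in range(rows):
        for c in range(cols):
            # Skip background cells and cells that already have a label.
            if mask[r, c] == 0 or labels[r, c] != 0:
                continue
            current += 1
            labels[r, c] = current
            queue = deque([(r, c)])
            # Breadth-first search over the 8 neighbours of each cell.
            while queue:
                y, x = queue.popleft()
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny, nx] != 0
                                and labels[ny, nx] == 0):
                            labels[ny, nx] = current
                            queue.append((ny, nx))
    return labels


M = np.array([[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
              [1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1],
              [1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1],
              [1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0],
              [0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0],
              [1, 1, 1, 0, 1, 1, 1, 1, 0, 0, 0],
              [1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0],
              [1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0]])

components = label_flood_fill(M)
print(components.max())  # 4
```

A pure-Python double loop like this is slow on large arrays, so a library routine is the better choice in practice.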

Note that Top_Left only makes sense for a rectangular area. If the spots are not rectangular, this point needs to be defined carefully.
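To illustrate the ambiguity, the sketch below compares two possible definitions for the non-rectangular spot labeled 3: the first element in row-major order, and the top-left corner of the bounding box. The labeled array is hard-coded from the output shown earlier so that the snippet runs without skimage; the variable names are illustrative.

```python
import numpy as np

# Labeled array copied from the output of label(M).
components = np.array([
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
    [1, 1, 1, 0, 0, 3, 3, 3, 0, 0, 0],
    [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 3, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
])

coords = np.argwhere(components == 3)

# Definition A: first element met when scanning row by row.
first_in_scan = tuple(int(v) for v in coords[0])

# Definition B: corner of the bounding box (smallest row, smallest column).
bbox_corner = tuple(int(v) for v in coords.min(axis=0))

print(first_in_scan)  # (3, 5)
print(bbox_corner)    # (3, 4)
```

Here the bounding-box corner (3, 4) is not even part of the spot, so which definition is appropriate depends on what the point is used for.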

Note also that this method is only efficient with few components, since each iteration scans the whole array. If there are many components, it is better to replace the loop with a single pass over the components array: the output is stored in a list l, and l[components[i,j]] is updated with the information found at each location (i,j) of components. This last algorithm will be slow unless Numba or Cython is used to speed it up.
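A minimal pure-Python sketch of that single-pass idea follows. The list is called spots here rather than l, the labeled array is hard-coded from the output above, and nothing is compiled with Numba or Cython, so it only shows the structure of the loop, not its performance.

```python
import numpy as np

# Labeled array copied from the output of label(M).
components = np.array([
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
    [1, 1, 1, 0, 0, 3, 3, 3, 0, 0, 0],
    [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 3, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
])

# One bucket per label; index 0 (background) stays empty.
spots = [[] for _ in range(int(components.max()) + 1)]

# Visit every cell once, in row-major order, and file it under its label.
for (i, j), lab in np.ndenumerate(components):
    if lab != 0:
        spots[lab].append((i, j))

sizes = [len(s) for s in spots[1:]]
first_cells = [s[0] for s in spots[1:]]
print(sizes)        # [12, 4, 16, 9]
print(first_cells)  # [(0, 0), (1, 9), (3, 5), (5, 0)]
```

Because the array is visited exactly once, the cost no longer grows with the number of components.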

CodePudding user response：

You could use skimage.measure.label or other tools (for instance, OpenCV or igraph) to create labels for connected components:

#from @Jérôme's answer
from skimage.measure import label
components = label(M)

# array([[1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
#        [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
#        [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
#        [1, 1, 1, 0, 0, 3, 3, 3, 0, 0, 0],
#        [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0],
#        [4, 4, 4, 0, 3, 3, 3, 3, 0, 0, 0],
#        [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
#        [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0]])

Next, you can create a one-dimensional view of the image, sort the pixel values, and find the dividing points of the sorted label values:

components_ravel = components.ravel()
c = np.arange(1, np.max(components_ravel) + 1)
argidx = np.argsort(components_ravel)
div_points = np.searchsorted(components_ravel, c, sorter=argidx)

# Sorted label values are:
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 
# 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1
# 2, 2, 2, 2
# 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3, 3
# 4, 4, 4, 4, 4, 4, 4, 4, 4
# So you find the indices that divide these groups:
# [47, 59, 63, 79]
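As a cross-check, the same dividing points can be obtained by counting how many cells carry each label and accumulating those counts. This is a sketch with the labeled array hard-coded from the output above; as a side effect, the counts also give the size of each spot.

```python
import numpy as np

# Labeled array copied from the output of label(M).
components = np.array([
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
    [1, 1, 1, 0, 0, 3, 3, 3, 0, 0, 0],
    [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 3, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
])

# counts[k] is the number of cells labeled k (k = 0 is the background).
counts = np.bincount(components.ravel())

# Each group starts where the previous groups end in the sorted array.
div_points = np.cumsum(counts)[:-1]

print(counts.tolist())      # [47, 12, 4, 16, 9]
print(div_points.tolist())  # [47, 59, 63, 79]
```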

After that, you can split the array of indices that sorts your one-dimensional view at these points, and convert the pieces into two-dimensional coordinates:

spots = []
for n in np.split(argidx, div_points)[1:]:  # the first piece holds the background zeros (empty if there are none)
    x, y = np.unravel_index(n, components.shape)
    spots.append(np.transpose([x, y]))

This creates a list with the coordinates of each group:

[array([[1, 0], [1, 2], [0, 2], [0, 1], [1, 1], [0, 0], [2, 2], [2, 1], [2, 0], [3, 2], [3, 1], [3, 0]]),
 array([[2, 10], [1, 9], [2, 9], [1, 10]]),
 array([[6, 5], [7, 5], [7, 6], [7, 7], [6, 7], [6, 6], [3, 5], [4, 6], [3, 6], [4, 5], [3, 7], [5, 7], [5, 6], [4, 7], [5, 5], [5, 4]]),
 array([[5, 0], [5, 1], [5, 2], [6, 2], [7, 0], [6, 0], [6, 1], [7, 1], [7, 2]])]

Note that the order of pixels within each group is mixed. This is because np.argsort defaults to a sort that is not stable. You can fix it like so:

argidx = np.argsort(components_ravel, kind='stable')

In this case you'll get:

[array([[0, 0], [0, 1], [0, 2], [1, 0], [1, 1], [1, 2], [2, 0], [2, 1], [2, 2], [3, 0], [3, 1], [3, 2]]),
 array([[1, 9], [1, 10], [2, 9], [2, 10]]),
 array([[3, 5], [3, 6], [3, 7], [4, 5], [4, 6], [4, 7], [5, 4], [5, 5], [5, 6], [5, 7], [6, 5], [6, 6], [6, 7], [7, 5], [7, 6], [7, 7]]),
 array([[5, 0], [5, 1], [5, 2], [6, 0], [6, 1], [6, 2], [7, 0], [7, 1], [7, 2]])]
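To get the two values asked for in the question, the stable-sorted groups can be summarized as in the sketch below. It hard-codes the labeled array from the output above so that it runs without skimage, and it takes "top left" to mean the first element in row-major order, which is an assumption that only matches the intuitive corner for rectangular spots. The summary list and its keys are illustrative names.

```python
import numpy as np

# Labeled array copied from the output of label(M).
components = np.array([
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
    [1, 1, 1, 0, 0, 0, 0, 0, 0, 2, 2],
    [1, 1, 1, 0, 0, 3, 3, 3, 0, 0, 0],
    [0, 0, 0, 0, 0, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 3, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
    [4, 4, 4, 0, 0, 3, 3, 3, 0, 0, 0],
])

flat = components.ravel()
order = np.argsort(flat, kind='stable')  # stable: keeps row-major order per label
labels = np.arange(1, flat.max() + 1)
div_points = np.searchsorted(flat, labels, sorter=order)

summary = []
# The first piece of the split is the background, so it is skipped.
for idx in np.split(order, div_points)[1:]:
    rows, cols = np.unravel_index(idx, components.shape)
    summary.append({
        "size": int(idx.size),
        "top_left": (int(rows[0]), int(cols[0])),
    })

for k, info in enumerate(summary):
    print(k, info["size"], info["top_left"])
# 0 12 (0, 0)
# 1 4 (1, 9)
# 2 16 (3, 5)
# 3 9 (5, 0)
```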