How to generate arbitrary high dimensional connectivity structures for scipy.ndimage.label

I have some high dimensional boolean data, in this example an array with 4 dimensions, but this is arbitrary:

X.shape
 (3, 2, 66, 241)

I want to group the dataset into connected regions of True values, which can be done with scipy.ndimage.label, with the aid of a connectivity structure which says which points in the array should be considered to touch. The default 2-D structure is a cross:

[[0,1,0],
 [1,1,1],
 [0,1,0]]

This can easily be extended to higher dimensions if all of those dimensions are connected. However, I want to programmatically generate such a structure from a list of which dims are connected to which:

#We want to find connections across dims 2 and 3 across each slice of dims 0 and 1:
dim_connections=[[0],[1],[2,3]]

#Now we want two separate connected subspaces in our data:
dim_connections=[[0,1],[2,3]]

For individual cases I can work out the correct structuring element with some hard thinking, but I am struggling to work out the general rule! To be clear, I want something like:

my_structure=construct_arbitrary_structure(ndim, dim_connections)
the_correct_result=scipy.ndimage.label(X,structure=my_structure)

CodePudding user response：

The key to constructing an arbitrary structure for scipy.ndimage.label is to understand the concept of a neighborhood. A neighborhood is the set of points that are considered to be connected to a given point. For example, in a 2D array with diagonal connections allowed, the neighborhood of a point (x,y) is the set of points {(x-1,y-1), (x-1,y), (x-1,y+1), (x,y-1), (x,y), (x,y+1), (x+1,y-1), (x+1,y), (x+1,y+1)}. With the default cross, only the five points (x,y), (x-1,y), (x+1,y), (x,y-1) and (x,y+1) belong to it.

In order to construct an arbitrary structure for scipy.ndimage.label, we need to define this neighborhood once, as a pattern of offsets that is then applied at every point in the data. To do this, we first need the set of connections between the dimensions of the data. For example, if we have a 4D array, and we want to connect dimensions 0 and 1, and dimensions 2 and 3, then our set of connections would be [[0,1],[2,3]].

Once we have defined our set of connections, we can construct the structure array. scipy.ndimage.label requires the structure to have the same number of dimensions as the data and exactly three elements along every axis, one for each of the offsets -1, 0 and +1. For a 4D array the structure therefore has shape (3,3,3,3), whatever the connections are, and the center element [1,1,1,1] stands for the point itself.

An element of the structure is set to 1 if a step by the corresponding offsets leads to a neighbor, and 0 otherwise. A group of connected dimensions contributes the elements whose offsets are non-zero only along the dimensions of that group, with every other dimension held at the center index 1. For example, if we have a 4D array, and we want to connect dimensions 0 and 1, and dimensions 2 and 3, then the structure is zero everywhere except on two planes through the center:

structure[:, :, 1, 1]    (steps within dimensions 0 and 1)
structure[1, 1, :, :]    (steps within dimensions 2 and 3)

Each plane holds a 3x3 block of ones if diagonal steps are allowed within a group, or the 2D cross from the question if they are not. The structure must also be symmetric about its center, which this construction guarantees.

Once we have constructed the structure, we can pass it to scipy.ndimage.label to generate the connected regions of our data.
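
As a minimal sketch of the [[0,1],[2,3]] example, the structure can be written out by hand with slicing. This assumes that diagonal steps are allowed inside each group, and the random array only stands in for the asker's X:

```python
import numpy as np
from scipy import ndimage

# Structure for 4-D data with connected groups [[0, 1], [2, 3]],
# assuming diagonal steps are allowed inside each group.
structure = np.zeros((3, 3, 3, 3), dtype=bool)
structure[:, :, 1, 1] = True   # offsets within dims 0 and 1
structure[1, 1, :, :] = True   # offsets within dims 2 and 3

# Illustrative random boolean data (not the asker's dataset)
X = np.random.default_rng(0).random((3, 2, 6, 8)) > 0.7

# label returns the labelled array and the number of regions found
labels, n_regions = ndimage.label(X, structure=structure)
print(structure.shape, int(structure.sum()))  # (3, 3, 3, 3) 17
print(labels.shape, n_regions)
```

The two planes share the center element, which is why the structure holds 9 + 9 - 1 = 17 ones.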

CodePudding user response：

This should work for you


import numpy as np

def construct_arbitrary_structure(ndim, dim_connections):
    # The structure must have the same number of dimensions as the data,
    # with exactly 3 elements along every axis (offsets -1, 0, +1).
    structure = np.zeros([3] * ndim, dtype=int)

    # A point always belongs to its own neighborhood
    structure[(1,) * ndim] = 1

    # Fill structure array
    for d in dim_connections:
        # Take all three offsets along the dims in this group and only the
        # center (index 1) along every other dim. Diagonal steps are therefore
        # allowed inside a group but never across groups.
        index = tuple(slice(None) if i in d else 1 for i in range(ndim))
        structure[index] = 1

    # Assumption: a group with a single dim connects neighbors along that
    # axis only. A dim that appears in no group gets no neighbors at all, so
    # slices along it are labelled independently.
    # The result is symmetric about its center by construction, which
    # scipy.ndimage.label requires.
    return structure
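
If diagonal steps inside a group are not always wanted, one possible variant is sketched below. It borrows scipy.ndimage.generate_binary_structure to build the pattern for each group and embeds it in the full array. The function name construct_structure_with_connectivity and its connectivity parameter are made up for this illustration, and the random data only stands in for X.

```python
import numpy as np
from scipy import ndimage

def construct_structure_with_connectivity(ndim, dim_connections, connectivity=1):
    """Hypothetical variant: connectivity=1 gives a cross inside each group,
    larger values progressively allow diagonal steps inside the group."""
    structure = np.zeros((3,) * ndim, dtype=bool)
    structure[(1,) * ndim] = True  # a point always belongs to its own neighborhood
    for group in dim_connections:
        # Neighborhood pattern in the subspace spanned by this group
        sub = ndimage.generate_binary_structure(len(group), connectivity)
        # All three offsets along the group's dims, center index elsewhere
        index = tuple(slice(None) if axis in group else 1 for axis in range(ndim))
        structure[index] |= sub
    return structure

# Dims 0 and 1 are left out, so every (dim 0, dim 1) slice is labelled on its own
structure = construct_structure_with_connectivity(4, [[2, 3]], connectivity=2)
X = np.random.default_rng(0).random((3, 2, 6, 8)) > 0.7
labels, n_regions = ndimage.label(X, structure=structure)
print(labels.shape, n_regions)
```

With connectivity=1 and every dim listed in some group, the grouping makes no difference: the result is the plain N-dimensional cross. The grouping only changes the outcome once diagonal steps are allowed or some dims are left out.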