Crop a box around n percentile of maximum values

Given a binary image, how do I draw a box around the majority of the white pixels? For example, consider the following image:

As Canny segmentation results in a binary image, I thought I could use np.nonzero to identify the location of the points and then draw a box around them. I have the following function to identify the location of the bounding box, but it's not working as intended (as you can see by the box in the image above):

def get_bounding_box(image, thresh=0.95):
    nonzero_indices = np.nonzero(image)
    min_row, max_row = np.min(nonzero_indices[0]), np.max(nonzero_indices[0])
    min_col, max_col = np.min(nonzero_indices[1]), np.max(nonzero_indices[1])
    box_size = max_row - min_row + 1, max_col - min_col + 1
    print(box_size)
    #box_size_thresh = (int(box_size[0] * thresh), int(box_size[1] * thresh))
    box_size_thresh = (int(box_size[0]), int(box_size[1]))
    #coordinates of the box that contains 95% of the highest pixel values
    top_left = (min_row + int((box_size[0] - box_size_thresh[0]) / 2), min_col + int((box_size[1] - box_size_thresh[1]) / 2))
    bottom_right = (top_left[0] + box_size_thresh[0], top_left[1] + box_size_thresh[1])
    print((top_left[0], top_left[1]), (bottom_right[0], bottom_right[1]))
    return (top_left[0], top_left[1]), (bottom_right[0], bottom_right[1])

I then use the following code to get the coordinates and crop the image to the box:

seg = canny_segmentation(gray)
bb_thresh = get_bounding_box(seg, 0.95)
im_crop = gray[bb_thresh[0][1]:bb_thresh[1][1], bb_thresh[0][0]:bb_thresh[1][0]]

Why is this code not giving me the right top-left / bottom-right coordinates?

I have an example Colab notebook here: https://colab.research.google.com/drive/15TNVPsYeZOCiOB51I-geVXgGFyIp5PjU?usp=sharing

CodePudding user response：

I think the top-left and bottom-right coordinates of the bounding box are not calculated correctly in the get_bounding_box function. The problem likely lies in how top_left and bottom_right are computed: they should be derived directly from the min_row, max_row, min_col, and max_col values, not from box_size_thresh.

Here's a corrected version of the code:

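import numpy as np

def get_bounding_box(image, thresh=0.95):
    # thresh is kept so existing calls still work, but it is not used here
    # np.nonzero returns (row_indices, col_indices) for a 2-D array
    nonzero_indices = np.nonzero(image)
    min_row, max_row = np.min(nonzero_indices[0]), np.max(nonzero_indices[0])
    min_col, max_col = np.min(nonzero_indices[1]), np.max(nonzero_indices[1])

    # Corners are (row, col) pairs; max_row and max_col are inclusive
    top_left = (min_row, min_col)
    bottom_right = (max_row, max_col)
    return top_left, bottom_right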

Hope this helped!
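If the goal is a box around most of the white pixels rather than all of them (as the title and the thresh argument suggest), one possible approach is to trim the extremes of the row and column coordinates with np.percentile. This is only a sketch under assumptions: the function name get_percentile_bounding_box is made up for illustration, and treating "n percentile" as trimming each axis independently is my interpretation, so the box is not guaranteed to contain exactly thresh of the pixels.

```python
import numpy as np

def get_percentile_bounding_box(image, thresh=0.95):
    """Hypothetical helper: box covering roughly the central `thresh`
    fraction of nonzero pixel coordinates along each axis.

    Returns ((min_row, min_col), (max_row, max_col)), both inclusive.
    """
    rows, cols = np.nonzero(image)  # row indices first, column indices second
    if rows.size == 0:
        raise ValueError("image has no nonzero pixels")

    # Split the discarded fraction evenly between the two tails of each axis.
    tail = (1.0 - thresh) / 2.0 * 100.0
    min_row, max_row = np.percentile(rows, [tail, 100.0 - tail])
    min_col, max_col = np.percentile(cols, [tail, 100.0 - tail])

    # Round outward so the box never shrinks below the percentile bounds.
    top_left = (int(np.floor(min_row)), int(np.floor(min_col)))
    bottom_right = (int(np.ceil(max_row)), int(np.ceil(max_col)))
    return top_left, bottom_right
```

Since both corners are inclusive (row, col) pairs, the crop would be gray[tl[0]:br[0] + 1, tl[1]:br[1] + 1], where tl and br are the two returned tuples.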

CodePudding user response：

It turns out I needed to transpose the image before getting the coordinates; a simple .T did the trick:

nonzero_indices = np.nonzero(image.T)
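A likely reason the transpose helps: np.nonzero returns row indices first and column indices second, while the cropping line in the question reads the first element of each corner as a column. The small sketch below uses a synthetic array as a stand-in for the real Canny output (not the data from the notebook) to show the index order, and an equivalent crop that keeps rows first and skips the transpose.

```python
import numpy as np

# Synthetic stand-in for the Canny output: white pixels in rows 2-3, columns 5-8.
seg = np.zeros((6, 10), dtype=np.uint8)
seg[2:4, 5:9] = 255

# For a 2-D array, the first array holds row (y) indices, the second column (x) indices.
rows, cols = np.nonzero(seg)

# Transposing swaps the roles: the first array now holds the original column indices.
xs, ys = np.nonzero(seg.T)

# Crop with rows first, columns second; +1 because slice ends are exclusive.
crop = seg[rows.min():rows.max() + 1, cols.min():cols.max() + 1]
print(crop.shape)
```

Either fix works: transpose before calling np.nonzero and keep the (x, y) style crop, or leave the image alone and index rows before columns when cropping.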