Given a binary image, how do I draw a box around the majority of the white pixels? For example, consider the following image:
Since Canny segmentation results in a binary image, I thought I could use np.nonzero to find the locations of the white points and then draw a box around them. I have the following function to compute the bounding box, but it's not working as intended (as you can see from the box in the image above):
def get_bounding_box(image, thresh=0.95):
    nonzero_indices = np.nonzero(image)
    min_row, max_row = np.min(nonzero_indices[0]), np.max(nonzero_indices[0])
    min_col, max_col = np.min(nonzero_indices[1]), np.max(nonzero_indices[1])
    box_size = max_row - min_row + 1, max_col - min_col + 1
    print(box_size)
    #box_size_thresh = (int(box_size[0] * thresh), int(box_size[1] * thresh))
    box_size_thresh = (int(box_size[0]), int(box_size[1]))
    #coordinates of the box that contains 95% of the highest pixel values
    top_left = (min_row + int((box_size[0] - box_size_thresh[0]) / 2), min_col + int((box_size[1] - box_size_thresh[1]) / 2))
    bottom_right = (top_left[0] + box_size_thresh[0], top_left[1] + box_size_thresh[1])
    print((top_left[0], top_left[1]), (bottom_right[0], bottom_right[1]))
    return (top_left[0], top_left[1]), (bottom_right[0], bottom_right[1])
I am using the following code to get the coordinates and draw the box:
seg = canny_segmentation(gray)
bb_thresh = get_bounding_box(seg, 0.95)
im_crop = gray[bb_thresh[0][1]:bb_thresh[1][1], bb_thresh[0][0]:bb_thresh[1][0]]
Why is this code not giving me the right top-left / bottom-right coordinates?
I have an example Colab notebook here: https://colab.research.google.com/drive/15TNVPsYeZOCiOB51I-geVXgGFyIp5PjU?usp=sharing
CodePudding user response:
I think the top-left and bottom-right coordinates of the bounding box are not calculated correctly in the get_bounding_box function. The problem likely lies in how top_left and bottom_right are computed: they should be derived directly from the min_row, max_row, min_col, and max_col values, not from box_size_thresh.
Here's a corrected version of the code:
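import numpy as np

def get_bounding_box(image, thresh=0.95):
    # thresh is unused here; the box spans all nonzero pixels
    nonzero_indices = np.nonzero(image)  # (row indices, column indices)
    min_row, max_row = np.min(nonzero_indices[0]), np.max(nonzero_indices[0])
    min_col, max_col = np.min(nonzero_indices[1]), np.max(nonzero_indices[1])
    # corners are in (row, col) order; max_row and max_col are inclusive
    top_left = (min_row, min_col)
    bottom_right = (max_row, max_col)
    return top_left, bottom_right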
Hope this helped!
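If the goal is a box around most of the white pixels rather than all of them (which seems to be what the thresh parameter was meant for), one possible approach is to trim outliers with percentiles. This is only a sketch under that assumption: the name get_percentile_box is made up for illustration, and it trims each axis independently, so the box holds roughly, not exactly, the thresh fraction of pixels.

```python
import numpy as np

def get_percentile_box(image, thresh=0.95):
    """Return ((min_row, min_col), (max_row, max_col)) covering roughly
    the central `thresh` fraction of nonzero pixels along each axis."""
    rows, cols = np.nonzero(image)
    if rows.size == 0:
        raise ValueError("image has no nonzero pixels")
    # Trim an equal share of outliers from both ends of each axis.
    lo = (1.0 - thresh) / 2.0 * 100.0
    hi = 100.0 - lo
    min_row, max_row = np.percentile(rows, [lo, hi]).astype(int)
    min_col, max_col = np.percentile(cols, [lo, hi]).astype(int)
    # Corners are in (row, col) order, with inclusive maxima.
    return (int(min_row), int(min_col)), (int(max_row), int(max_col))
```

With thresh=1.0 this reduces to the plain min/max box, so a stray edge pixel far from the main shape only affects the result when no trimming is requested.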
CodePudding user response:
It turns out I needed to transpose the image before getting the coordinates; a simple .T did the trick:
nonzero_indices = np.nonzero(image.T)
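The transpose helps because np.nonzero returns indices in (row, column) order, while the cropping code in the question reads the tuples as (x, y). Here is a minimal sketch of an alternative that skips the transpose and returns (x, y) corners explicitly. The function name get_bounding_box_xy and the small synthetic array are assumptions made up for illustration, not part of the original notebook.

```python
import numpy as np

def get_bounding_box_xy(image):
    """Return ((x_min, y_min), (x_max, y_max)) of the nonzero pixels."""
    rows, cols = np.nonzero(image)  # axis 0 = rows (y), axis 1 = columns (x)
    top_left = (int(cols.min()), int(rows.min()))
    bottom_right = (int(cols.max()), int(rows.max()))
    return top_left, bottom_right

# Synthetic stand-in for the segmentation output.
seg = np.zeros((6, 8), dtype=np.uint8)
seg[1:4, 2:7] = 1

(x0, y0), (x1, y1) = get_bounding_box_xy(seg)
# NumPy slices rows first, and the end index is exclusive, hence the +1.
crop = seg[y0:y1 + 1, x0:x1 + 1]
```

Returning (x, y) also matches the point order that drawing functions such as OpenCV's cv2.rectangle expect, so the same tuples can be used for both cropping and drawing.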