I want to detect the circle and the five squares in this image:
This is the relevant part of the code I currently use:
# detect shapes in black-white RGB formatted cv2 image
def detect_shapes(img, approx_poly_accuracy=APPROX_POLY_ACCURACY):
    res_dict = {
        "rectangles": [],
        "squares": []
    }
    vis = img.copy()
    shape = img.shape
    height, width = shape[0], shape[1]
    total_area = height * width
    # Morphological closing: get rid of holes
    # img = cv2.morphologyEx(img, cv2.MORPH_CLOSE, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (5, 5)))
    # Morphological opening: get rid of extensions at the border of the objects
    # img = cv2.morphologyEx(img, cv2.MORPH_OPEN, cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (121, 121)))
    img = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    # cv2.imshow('intermediate', img)
    # cv2.waitKey(0)
    contours, hierarchy = cv2.findContours(img, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
    logging.info("Number of found contours for shape detection: {0}".format(len(contours)))
    # vis = img.copy()
    # cv2.drawContours(vis, contours, -1, (0, 255, 0), 2)
    cv2.imshow('vis', vis)
    cv2.waitKey(0)
    for contour in contours:
        area = cv2.contourArea(contour)
        if area < MIN_SHAPE_AREA:
            logging.warning("Area too small: {0}. Skipping.".format(area))
            continue
        if area > MAX_SHAPE_AREA_RATIO * total_area:
            logging.warning("Area ratio too big: {0}. Skipping.".format(area / total_area))
            continue
        approx = cv2.approxPolyDP(contour, approx_poly_accuracy * cv2.arcLength(contour, True), True)
        cv2.drawContours(vis, [approx], -1, (0, 0, 255), 2)
        la = len(approx)
        # find the center of the shape
        M = cv2.moments(contour)
        if M['m00'] == 0.0:
            logging.warning("Unable to compute shape center! Skipping.")
            continue
        x = int(M['m10'] / M['m00'])
        y = int(M['m01'] / M['m00'])
        if la < 3:
            logging.warning("Invalid shape detected! Skipping.")
            continue
        if la == 3:
            logging.info("Triangle detected at position {0}".format((x, y)))
        elif la == 4:
            logging.info("Quadrilateral detected at position {0}".format((x, y)))
            if approx.shape != (4, 1, 2):
                raise ValueError("Invalid shape before reshape to (4, 2): {0}".format(approx.shape))
            approx = approx.reshape(4, 2)
            r_check, data = check_rect_or_square(approx)
            blob_data = {"position": (x, y), "approx": approx}
            blob_data.update(data)
            if r_check == 2:
                res_dict["squares"].append(blob_data)
            elif r_check == 1:
                res_dict["rectangles"].append(blob_data)
        elif la == 5:
            logging.info("Pentagon detected at position {0}".format((x, y)))
        elif la == 6:
            logging.info("Hexagon detected at position {0}".format((x, y)))
        else:
            logging.info("Circle, ellipse or arbitrary shape detected at position {0}".format((x, y)))
        cv2.drawContours(vis, [contour], -1, (0, 255, 0), 2)
    cv2.imshow('vis', vis)
    cv2.waitKey(0)
    logging.info("res_dict: {0}".format(res_dict))
    return res_dict
The problem is: if I set the approx_poly_accuracy parameter too high, the circle is detected as a polygon (hexagon or octagon, for example). If I set it too low, the squares are not detected as squares but as pentagons, for example:
The red lines are the approximated contours, the green lines are the original contours. The text is detected as a completely wrong contour; it should never be approximated to this level (I don't care about the text so much, but if it is detected as a polygon with fewer than 5 vertices, it will be a false positive).
For a human, it is obvious that the left object is a circle and that the five objects on the right are squares, so there should be a way to make the computer realize that with high accuracy too. How do I modify this code to properly detect all objects?
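A rough way to see why a large approx_poly_accuracy collapses the circle: for an ideal circle of radius r, epsilon is accuracy * 2 * pi * r, and a chord spanning an arc of angle theta deviates from the circle by r * (1 - cos(theta / 2)). Every edge of the approximation has to stay within epsilon, which gives a lower bound on the vertex count that does not depend on r. The sketch below is only an estimate under that ideal-circle assumption (a pixelated contour will differ somewhat), and min_vertices_for_circle is a hypothetical helper name, not an OpenCV function.

```python
import math


def min_vertices_for_circle(accuracy):
    """Lower bound on the vertex count of an ideal circle approximated with
    epsilon = accuracy * perimeter (the convention used in detect_shapes)."""
    # epsilon / r, because the perimeter of the circle is 2 * pi * r
    eps_over_r = 2 * math.pi * accuracy
    if not 0 < eps_over_r < 1:
        raise ValueError("accuracy must be in (0, 1 / (2 * pi)) for this estimate")
    # Widest arc whose chord still stays within epsilon of the circle:
    # r * (1 - cos(theta / 2)) <= epsilon  =>  theta <= 2 * acos(1 - epsilon / r)
    max_angle = 2 * math.acos(1 - eps_over_r)
    # At least this many chords are needed to go once around the circle
    return math.ceil(2 * math.pi / max_angle)


if __name__ == "__main__":
    for accuracy in (0.005, 0.01, 0.02, 0.04):
        print(accuracy, min_vertices_for_circle(accuracy))
```

Under these assumptions the bound is 9 vertices at an accuracy of 0.01 but only 5 at 0.04, so the circle can drop into the pentagon/hexagon range as the parameter grows. The actual approximation has at least this many vertices, often more.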
What I already tried:
- Applying filters like MedianFilter. It made things worse, because the rounded edges of the squares made them more likely to be detected as polygons with more than four vertices.
- Varying the approx_poly_accuracy parameter. No value fits my purposes, considering that some other images might have even more noise.
- Finding an implementation of the RDP algorithm that allows me to specify an EXACT number of output points. This would allow me to compute the suggested polygons for a certain number of points (for example in the range 3..10) and then calculate (A_1 A_2) / A_common - 1 to use the area instead of the arc length as an accuracy measure, which would probably lead to a better result. I have not yet found a good implementation for that.
I will now try to use a numerical solver method to dynamically figure out the correct epsilon parameter for RDP. The approach is not really clean or efficient, though. I will post the results here as soon as they are available. If someone has a better approach, please let me know.
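To make the solver idea concrete, here is a minimal pure-Python sketch that bisects epsilon until a simple RDP implementation returns at most a given number of vertices. It is not cv2.approxPolyDP: all function names are hypothetical, the closed contour is handled by splitting it at the point farthest from the first point, and the bisection assumes that the vertex count does not increase as epsilon grows, which RDP does not strictly guarantee. The relative area difference at the end is only a stand-in for an area-based accuracy measure.

```python
import math


def _point_line_distance(p, a, b):
    """Perpendicular distance from point p to the line through a and b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    norm = math.hypot(dx, dy)
    if norm == 0.0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    return abs(dy * (p[0] - a[0]) - dx * (p[1] - a[1])) / norm


def rdp(points, epsilon):
    """Ramer-Douglas-Peucker simplification of an open polyline."""
    if len(points) < 3:
        return list(points)
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = _point_line_distance(points[i], points[0], points[-1])
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        return [points[0], points[-1]]
    left = rdp(points[:index + 1], epsilon)
    right = rdp(points[index:], epsilon)
    return left[:-1] + right  # drop the duplicated split point


def rdp_closed(points, epsilon):
    """Simplify a closed contour by splitting it at the point farthest from points[0]."""
    points = list(points)
    far = max(range(len(points)),
              key=lambda i: math.hypot(points[i][0] - points[0][0],
                                       points[i][1] - points[0][1]))
    first = rdp(points[:far + 1], epsilon)
    second = rdp(points[far:] + [points[0]], epsilon)
    return first[:-1] + second[:-1]


def approx_with_max_vertices(points, max_vertices, iterations=40):
    """Bisect epsilon for the tightest approximation with at most max_vertices."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    # The bounding-box diagonal exceeds any point-to-chord distance
    lo, hi = 0.0, math.hypot(max(xs) - min(xs), max(ys) - min(ys))
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if len(rdp_closed(points, mid)) > max_vertices:
            lo = mid
        else:
            hi = mid
    return rdp_closed(points, hi), hi


def polygon_area(points):
    """Shoelace formula for a closed polygon given as a list of (x, y)."""
    pairs = zip(points, points[1:] + points[:1])
    return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)) / 2


# Synthetic contour: a 10 x 10 square with one small bump on the bottom edge
square = ([(i, 0) for i in range(10)] + [(10, i) for i in range(10)]
          + [(10 - i, 10) for i in range(10)] + [(0, 10 - i) for i in range(10)])
square[5] = (5, 0.3)

approx, eps = approx_with_max_vertices(square, 4)
area_error = abs(polygon_area(approx) - polygon_area(square)) / polygon_area(square)
print(len(approx), eps, area_error)
```

Running this for several vertex counts (for example 3..10) and comparing the area error of each candidate would be one way to pick the polygon that fits best.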
CodePudding user response:
A possible approach is to compute some blob descriptors and filter the blobs according to those properties. For example, you can compute each blob's aspect ratio, (approximate) number of vertices and area. The steps are straightforward:
- Load the image and convert it to grayscale.
- (Invert) Threshold the image. Let’s make sure the blobs are colored in white.
- Get the binary image’s contours.
- Compute two features: aspect ratio and number of vertices
- Filter the blobs based on those features
Let’s see the code:
# Imports:
import cv2
import numpy as np
# Load the image:
fileName = "yh6Uz.png"
path = "D://opencvImages//"
# Reading an image in default mode:
inputImage = cv2.imread(path + fileName)
# Prepare a deep copy of the input for results:
inputImageCopy = inputImage.copy()
# Grayscale conversion:
grayscaleImage = cv2.cvtColor(inputImage, cv2.COLOR_BGR2GRAY)
# Threshold via Otsu:
_, binaryImage = cv2.threshold(grayscaleImage, 0, 255, cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)
# Find the blobs on the binary image:
contours, hierarchy = cv2.findContours(binaryImage, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
# Store the bounding rectangles here:
circleData = []
squaresData = []
Alright. So far, I’ve loaded, thresholded and computed contours on the input image. Additionally, I’ve prepared two lists to store the bounding box of the squares and the circle. Let’s create the feature filter:
for i, c in enumerate(contours):
    # Get blob perimeter:
    currentPerimeter = cv2.arcLength(c, True)
    # Approximate the contour to a polygon:
    approx = cv2.approxPolyDP(c, 0.04 * currentPerimeter, True)
    # Get polygon's number of vertices:
    vertices = len(approx)
    # Get the polygon's bounding rectangle:
    (x, y, w, h) = cv2.boundingRect(approx)
    # Compute bounding box area:
    rectArea = w * h
    # Compute blob aspect ratio:
    aspectRatio = w / h
    # Set default color for bounding box:
    color = (0, 0, 255)
I loop through each contour and calculate the current blob’s perimeter and polygon approximation. This info is used to approximately compute the blob’s vertices. The aspect ratio calculation is very easy. I first get the blob’s bounding box and its dimensions: top left corner (x, y), width and height. The aspect ratio is just the width divided by the height.
The squares and the circle are very compact. This means that their aspect ratio should be close to 1.0. However, the squares have exactly 4 vertices, while the (approximated) circle has more. I use this info to build a very basic feature filter. It first checks the aspect ratio and area, and then the number of vertices. I use the difference between the ideal feature and the real feature. The parameter delta adjusts the filter tolerance. Be sure to also filter out tiny blobs; use the area for this:
    # Set minimum tolerable difference between ideal
    # feature and actual feature:
    delta = 0.15
    # Set the minimum area:
    minArea = 400
    # Check features, get blobs with aspect ratio
    # close to 1.0 and area > min area:
    if (abs(1.0 - aspectRatio) < delta) and (rectArea > minArea):
        print("Got target blob.")
        # If the blob has 4 vertices, it is a square:
        if vertices == 4:
            print("Target is square")
            # Save bounding box info:
            tempTuple = (x, y, w, h)
            squaresData.append(tempTuple)
            # Set green color:
            color = (0, 255, 0)
        # If the blob has more than 6 vertices, it is a circle:
        elif vertices > 6:
            print("Target is circle")
            # Save bounding box info:
            tempTuple = (x, y, w, h)
            circleData.append(tempTuple)
            # Set blue color:
            color = (255, 0, 0)
        # Draw bounding rect:
        cv2.rectangle(inputImageCopy, (int(x), int(y)), (int(x + w), int(y + h)), color, 2)
cv2.imshow("Rectangles", inputImageCopy)
cv2.waitKey(0)
This is the result. The squares are identified with a green rectangle and the circle with a blue one. Additionally, the bounding boxes are stored in squaresData and circleData, respectively:
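If you want to tune delta and the minimum area without loading an image, the same decision logic can be pulled out into a small pure function. This is only a sketch that mirrors the filter above; classify_blob is a hypothetical helper name, and its inputs are assumed to be the bounding box size from cv2.boundingRect and the vertex count from cv2.approxPolyDP.

```python
def classify_blob(w, h, vertices, delta=0.15, min_area=400):
    """Return "square", "circle" or None for one blob.

    w, h: bounding box width and height; vertices: number of polygon vertices.
    """
    aspect_ratio = w / h
    rect_area = w * h
    # Reject blobs that are not compact or that are too small
    if abs(1.0 - aspect_ratio) >= delta or rect_area <= min_area:
        return None
    # Exactly 4 vertices: square
    if vertices == 4:
        return "square"
    # More than 6 vertices: (approximated) circle
    if vertices > 6:
        return "circle"
    return None


if __name__ == "__main__":
    print(classify_blob(50, 50, 4))
    print(classify_blob(60, 58, 8))
    print(classify_blob(100, 20, 4))
```

Blobs with 5 or 6 vertices fall through both branches and are ignored, exactly as in the loop above, so noisy squares that pick up an extra vertex will still need a smaller delta or a different accuracy factor.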