Get the location of all contours present in an image using OpenCV, but skipping text

I want to retrieve all contours of the image below, but ignore text.

Image:

When I try to find the contours of the current image I get the following:

I have no idea how to go about this, as I am new to OpenCV and image processing. I want to ignore the text; how can I achieve this? If ignoring the text is not possible, a single bounding box surrounding the text would be fine too.

CodePudding user response：

Here is one way to do that in Python/OpenCV.

Read the input
Convert to grayscale
Get Canny edges
Apply morphology close to ensure they are closed
Get all contours
Filter contours to keep only those whose perimeter exceeds a threshold
Draw contours on input
Draw each contour on a black background
Save results

Input:

import numpy as np
import cv2

# read input
img = cv2.imread('short_title.png')

# convert to gray
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# get canny edges
edges = cv2.Canny(gray, 1, 50)

# apply morphology close to ensure they are closed
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3,3))
edges = cv2.morphologyEx(edges, cv2.MORPH_CLOSE, kernel)

# get contours
contours = cv2.findContours(edges, cv2.RETR_TREE, cv2.CHAIN_APPROX_NONE)
contours = contours[0] if len(contours) == 2 else contours[1]

# filter contours to keep only large ones
result = img.copy()
i = 1
for c in contours:
    perimeter = cv2.arcLength(c, True)
    if perimeter > 500: 
        # wrap the contour in a list so it is drawn as one connected contour
        cv2.drawContours(result, [c], -1, (0,0,255), 1)
        contour_img = np.zeros_like(img, dtype=np.uint8)
        cv2.drawContours(contour_img, [c], -1, (0,0,255), 1)
        cv2.imwrite("short_title_contour_{0}.jpg".format(i),contour_img)
        i = i + 1

# save results
cv2.imwrite("short_title_gray.jpg", gray)
cv2.imwrite("short_title_edges.jpg", edges)
cv2.imwrite("short_title_contours.jpg", result)

# show images
cv2.imshow("gray", gray)
cv2.imshow("edges", edges)
cv2.imshow("result", result)
cv2.waitKey(0)

Grayscale:

Edges:

All contours on input:

Contour 1:

Contour 2:

Contour 3:

Contour 4:

CodePudding user response：

I would recommend using flood fill: find a seed point for each color region, then flood fill from it so that the text pixels inside the region are ignored. Hope that helps!

Refer to example of using floodfill here:

Finding white (and small) connected components:

Use mask = cv2.inRange(img, (230, 230, 230), (255, 255, 255)) to find the text (assuming the text is white).
Find the connected components in the mask using cv2.connectedComponentsWithStats(mask, 4).
Remove large components from the mask by filling every component with a large area with zeros.

Code sample:

import cv2
import numpy as np

img = cv2.imread('ShortAndInteresting.png')

mask = cv2.inRange(img, (230, 230, 230), (255, 255, 255))

nlabel, labels, stats, centroids = cv2.connectedComponentsWithStats(mask, 4)  # Finding connected components with statistics

# Remove large components from the mask (fill components with large area with zeros).
for i in range(1, nlabel):
    area = stats[i, cv2.CC_STAT_AREA]  # Get area
    if area > 1000:
        mask[labels == i] = 0  # Remove large connected components from the mask (fill with zero)

mask = cv2.dilate(mask, np.ones((5, 5), np.uint8))  # Dilate the text in the mask

cv2.imwrite('mask2.png', mask)

clean_img = cv2.inpaint(img, mask, 2, cv2.INPAINT_NS)  # Remove the text using inpaint (replace the masked pixels with the neighbor pixels).

# Show mask and clean_img for testing
cv2.imshow('mask', mask)
cv2.imshow('clean_img', clean_img)
cv2.waitKey()
cv2.destroyAllWindows()

Mask:

Clean image:

Note:

My assumption is that you know how to split the image into contours, and that the only issue is the presence of the text.