How to detect text box and its coordinates in Python?

Time:10-17

I have the following image:

initial image

And essentially, I would like to get the following result while also detecting the text:

desired result

My current approach

I am using enter image description here

The text is Bodego IV (correctly identified by your code).


Update:

Masking the entire dark background:

For masking the entire background, we may use the following stages (after finding top_left and bottom_right):

  • Crop an ROI with 10% margins around the bounding rectangle we found earlier.
  • Find the median of the Blue, Green and Red color channels.
    Assume the dark background is almost a solid color, so the median is close to the background color.
  • Build a mask of the pixels whose values are close to the median color.
  • Find contours in the mask.
  • Find the contour with the maximum area (filtering out noise).
  • Find the bounding rectangle of the largest contour.
  • Fill the bounding rectangle with black.
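
The median assumption in the second stage can be checked on synthetic data. The following is a minimal NumPy-only sketch, not part of the original answer: the background color, the noise level, and the size of the bright "text" patch are all made-up values, and the mask is built with plain comparisons instead of cv2.inRange.

```python
import numpy as np

# Synthetic 40x120 BGR ROI with a nearly solid dark background (hypothetical values).
rng = np.random.default_rng(0)
background = np.array([30, 20, 10], np.int32)  # assumed B, G, R of the background
noise = rng.integers(-2, 3, size=(40, 120, 3))  # mild noise in [-2, 2], standing in for JPEG artifacts
roi = (background + noise).astype(np.uint8)

# Bright "text" pixels covering well under half of the ROI.
roi[15:25, 20:60] = 255

# Per-channel median over all rows and columns of the ROI.
med = np.round(np.median(roi, axis=(0, 1))).astype(np.int32)

# A pixel counts as background when every channel is within +/-5 of the median.
lower = np.maximum(med - 5, 0)
upper = np.minimum(med + 5, 255)
mask = np.all((roi >= lower) & (roi <= upper), axis=2)
```

As long as the background covers more than half of the ROI, the bright pixels cannot pull the median away from the background color, which is why the median is used here rather than the mean.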

Code sample:

import numpy as np
import cv2

img = cv2.imread('00025.jpg')

top_left = np.array([100, 0], np.int32)  # top_left we found earlier
bottom_right = np.array([562, 114], np.int32)  # bottom_right we found earlier

cols = bottom_right[0] - top_left[0]  # ROI width
rows = bottom_right[1] - top_left[1]  # ROI height

prct10 = np.array([cols//10, rows//10], np.int32)  # About 10% of width and 10% of height.

top_left = np.maximum(top_left - prct10, 0)  # Subtract 10% from top left coordinate, and clip to [0, 0]
bottom_right = np.minimum(bottom_right + prct10, np.array(img.shape)[1::-1]-1)  # Add 10% to bottom right coordinate and clip to [img.shape[1]-1, img.shape[0]-1]

roi = img[top_left[1]:bottom_right[1], top_left[0]:bottom_right[0], :]  # Crop the relevant ROI (with 10% margins from each side).

# Compute the median of B,G,R of ROI - supposed to be the BGR color of the solid background.
# Note: Due to JPEG compression, the background is not completely solid.
med = np.round(np.median(roi, axis=(0,1))).astype(np.int32)

mask = cv2.inRange(roi, np.maximum(med-5, 0), np.minimum(med+5, 255))  # Build a mask of pixels whose values are close to the median color.

# Find contours in the mask
cnts = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_NONE)[-2]  # [-2] selects the contour list in both OpenCV 3.x and 4.x

# Find contour with the maximum area (filtering noise).
c = max(cnts, key=cv2.contourArea)

rect = cv2.boundingRect(c)  # Find bounding rectangle.

cv2.rectangle(roi, rect, (0, 0, 0), -1)

#img[top_left[1]:bottom_right[1], top_left[0]:bottom_right[0], :] = 0  # Fill the area with zeros.
cv2.imshow('mask', mask)  # Show mask (for testing).
cv2.imshow('img', img)  # Show image (for testing).
cv2.waitKey()
cv2.destroyAllWindows()

Mask:
enter image description here

Output:
enter image description here
