I have managed to crop a bounding box with text, e.g. given this image:
I'm able to extract the following box:
with this code:
import re
import shutil
from IPython.display import Image
import requests
import pytesseract, cv2
"""https://www.geeksforgeeks.org/text-detection-and-extraction-using-opencv-and-ocr/"""
# Preprocessing the image starts
# Convert the image to gray scale
img = cv2.imread('img.png')
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
# Performing OTSU threshold
ret, thresh1 = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)
# Specify structure shape and kernel size.
# Kernel size increases or decreases the area
# of the rectangle to be detected.
# A smaller value like (10, 10) will detect
# each word instead of a sentence.
rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (18, 18))
# Applying dilation on the threshold image
dilation = cv2.dilate(thresh1, rect_kernel, iterations = 1)
# Finding contours
contours, hierarchy = cv2.findContours(dilation, cv2.RETR_EXTERNAL,
                                       cv2.CHAIN_APPROX_NONE)
# Creating a copy of image
im2 = img.copy()
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    # Drawing a rectangle on copied image
    rect = cv2.rectangle(im2, (x, y), (x + w, y + h), (0, 255, 0), 2)
    # Cropping the text block for giving input to OCR
    cropped = im2[y:y + h, x:x + w]
    cv2.imwrite('image-notxt.png', cropped)
Image(filename='image-notxt.png', width=200)
Part 1: How do I replace the cropped box and put back a clear text box? e.g. to get something like:
I've tried:
for cnt in contours:
    x, y, w, h = cv2.boundingRect(cnt)
    # Drawing a rectangle on copied image
    rect = cv2.rectangle(im2, (x, y), (x + w, y + h), (0, 255, 0), 2)
    # Cropping the text block for giving input to OCR
    cropped = im2[y:y + h, x:x + w]
    text = pytesseract.image_to_string(cropped).strip('\x0c').strip()
    # Collapse newlines and repeated spaces into single spaces
    text = re.sub(' +', ' ', text.replace('\n', ' ')).strip()
    if text:
        # White out the cropped box.
        cropped.fill(255)
        # Create the image with the translation.
        cv2.putText(img=cropped, text="foobar", org=(12, 15),
                    fontFace=cv2.FONT_HERSHEY_TRIPLEX, fontScale=0.3,
                    color=(0, 0, 0), thickness=1)
    cv2.imwrite('image-notxt.png', cropped)
Image(filename='image-notxt.png', width=200)
That managed to white out the cropped box and insert the text like this:
Part 2: How to create an opencv textbox rectangle with the same size as the cropped box? e.g. given a string foobar, how to get the final image like this:
CodePudding user response:
In Python/OpenCV/Numpy, use Numpy to write a color to the area in the format:
img[y:y+h, x:x+w] = color tuple
For example:
img[40:40+45, 40:40+150] = (255,255,255)
where x,y,w,h = 40,40,150,45
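As a minimal sketch of that slice assignment, the snippet below uses a synthetic gray array as a hypothetical stand-in for an image loaded with cv2.imread(); the array size is an assumption chosen only for illustration.

```python
import numpy as np

# Hypothetical stand-in for a BGR image from cv2.imread():
# 120 rows x 240 columns, 3 channels, filled with mid-gray.
img = np.full((120, 240, 3), 128, dtype=np.uint8)

# Bounding box in cv2.boundingRect() order: top-left x, y, then width, height.
x, y, w, h = 40, 40, 150, 45

# NumPy indexes rows (y) first, then columns (x).
# The 3-tuple is broadcast to every pixel inside the slice.
img[y:y + h, x:x + w] = (255, 255, 255)
```

Only the pixels inside the box change; everything outside it keeps its original value.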
To add text, see cv2.putText().
Part 2: How to create an opencv textbox rectangle with the same size as the cropped box?
Putting in the text is a little more nuanced. The steps are:
- Create the image with the text in it using cv2.putText()
- But there are multiple things to consider:
  - the length and font of the text that you want to put in, and whether it fits in the box
  - the location/position of the text in the box
TL;DR
import textwrap
for i, chunk in enumerate(textwrap.wrap(translation, width=20)):
    cv2.putText(img=cropped, text=chunk, org=(12, 15 + i*10),
                fontFace=cv2.FONT_HERSHEY_TRIPLEX, fontScale=0.3,
                color=(0, 0, 0), thickness=1)
im2[y:y + h, x:x + w] = cropped
To handle the length of the text, I use Python's textwrap library to break the string into multiple substrings.
Then, iterating through the substrings, I putText each substring into the cropped image, moving the y coordinate down by 10 pixels per line.
Finally, replace the portion of the original image with the edited cropped image that now has the text in it, like im2[y:y + h, x:x + w] = cropped
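The wrapping and line placement can be checked without drawing anything. The sketch below uses only the standard library; layout_lines and fits_vertically are hypothetical helper names, and the defaults simply mirror the values used above (width 20, origin (12, 15), 10 pixels per line).

```python
import textwrap


def layout_lines(text, max_chars=20, x0=12, y0=15, line_height=10):
    """Wrap `text` and pair each line with the org to pass to cv2.putText()."""
    chunks = textwrap.wrap(text, width=max_chars)
    return [(chunk, (x0, y0 + i * line_height)) for i, chunk in enumerate(chunks)]


def fits_vertically(lines, box_h):
    """Return True if every line's baseline lies within the box height."""
    return all(org[1] <= box_h for _, org in lines)


lines = layout_lines("the quick brown fox jumps over the lazy dog")
for chunk, org in lines:
    print(org, chunk)
```

Note that width=20 counts characters, not pixels, so whether a wrapped line fits horizontally still depends on the font and fontScale.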
A working example can be found on https://www.kaggle.com/code/alvations/image-translate