I need to extract the bounding box of text and save it as sub-images of the main image. I am not getting the right code documentation for this task.
Please can anyone provide me code documentation or help links or any python modules which can help to crop text from scanned images.
Below I have attached a scanned image and expected output.
below image scanned copy need to crop text from image.
import cv2
import pytesseract
pytesseract.pytesseract.tesseract_cmd ='C:\\Program Files (x86)\\Tesseract-OCR\\tesseract'
img = cv2.imread("test.jpg")
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
ret, thresh1 = cv2.threshold(gray, 0, 255, cv2.THRESH_OTSU | cv2.THRESH_BINARY_INV)
rect_kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (18, 18))
dilation = cv2.dilate(thresh1, rect_kernel, iterations = 1)
contours, hierarchy = cv2.findContours(dilation, cv2.RETR_EXTERNAL,
cv2.CHAIN_APPROX_NONE)
im2 = img.copy()
file = open("recognized.txt", "w ")
file.write("")
file.close()
for cnt in contours:
x, y, w, h = cv2.boundingRect(cnt)
rect = cv2.rectangle(im2, (x, y), (x w, y h), (0, 255, 0), 2)
cropped = im2[y:y h, x:x w]
file = open("recognized.txt", "a")
text = pytesseract.image_to_string(cropped)
file.write(text)
file.write("\n")
crop_img = img[y:y h, x:x w] # just the region you are interested
file.close
second image expected croped image:
CodePudding user response:
In OpenCV you can use cv2.findContours
to draw the bounding boxes. See this article which explains how to do that: https://www.geeksforgeeks.org/text-detection-and-extraction-using-opencv-and-ocr/
Then after you have your bounding box locations (your region of interest where text is located, and you want to crop) you can use use slicing
to crop the image:
import cv2
img = cv2.imread("lenna.png")
crop_img = img[y:y h, x:x w] # just the region you are interested
cv2.imshow("cropped", crop_img)
cv2.waitKey(0)
If you want to extract the text directly, I think you can use tesseract ocr
a python package (How to get started: https://pypi.org/project/pytesseract/) . You can also make use of OpenCV built in OCR functions. Read more: https://nanonets.com/blog/ocr-with-tesseract/
CodePudding user response:
from PIL import image
original_image = Image.open(".nameofimage.jpg")
rotate_image = Original_image.rotate(330)
rotate_image.show()
x = 100
y = 80
h = 200
w = 200
cropped_image = rotate_image[y:y h, x:x w]
cropped_image.show()