I've been working on a Python project that recognizes codes in pictures, but I've run into trouble when the code is sideways and sits on a noisy background. Nothing I've tried has come close to working, so I'm shooting in the dark; any insight into how to solve this would be appreciated. I've tried everything from Tesseract to keras-ocr, but I only get back random numbers and letters.
import pytesseract
import PIL.Image
import cv2  # used for the masking experiments mentioned below

# psm 11 = sparse text, oem 3 = default engine
my_config = r"--psm 11 --oem 3"
text = pytesseract.image_to_string(PIL.Image.open(path_to_image2), config=my_config)
print(text)
I've also tried a lot with cv2, like creating masks, but the masks aren't dynamic enough to filter out the background.
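For example, something along these lines, where the HSV bounds are just one of the many combinations I tried (the file name is a placeholder):

import cv2
import numpy as np

image = cv2.imread('example.png')  # placeholder path
hsv = cv2.cvtColor(image, cv2.COLOR_BGR2HSV)
# A fixed range like this only isolates one specific kind of background,
# so it works on one picture and fails on the next.
lower = np.array([0, 0, 180])
upper = np.array([180, 60, 255])
mask = cv2.inRange(hsv, lower, upper)
masked = cv2.bitwise_and(image, image, mask=mask)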
Here are two examples I've been dealing with. Thank you to anyone who tries to help me out!
CodePudding user response:
I hope this helps in your case. I tried the exact same scenario with EasyOCR.
import easyocr
import cv2
import numpy as np
from scipy import ndimage

def image_sharpen(image, kernel_size=(5, 5), sigma=1.0, amount=1.0, threshold=0):
    # Unsharp masking: subtract a blurred copy to boost edges.
    blurred = cv2.GaussianBlur(image, kernel_size, sigma)
    sharpened = float(amount + 1) * image - float(amount) * blurred
    sharpened = np.maximum(sharpened, np.zeros(sharpened.shape))
    sharpened = np.minimum(sharpened, 255 * np.ones(sharpened.shape))
    sharpened = sharpened.round().astype(np.uint8)
    if threshold > 0:
        # Keep the original pixels where the local contrast is low.
        low_contrast_mask = np.absolute(image - blurred) < threshold
        np.copyto(sharpened, image, where=low_contrast_mask)
    return sharpened

def reduce_brightness(image, gamma=1.0):
    # Gamma correction via a lookup table; with this convention,
    # gamma < 1 darkens the image.
    inv_gamma = 1.0 / gamma
    table = np.array([((i / 255.0) ** inv_gamma) * 255
                      for i in np.arange(0, 256)]).astype("uint8")
    return cv2.LUT(image, table)

gamma = 0.35
rotation_angle = 155  # 25 for the second image

image = cv2.imread('pze5c.png')  # yCVet.jpg for the second image
rotate_image = ndimage.rotate(image, rotation_angle)
sharpened = image_sharpen(rotate_image)
adjusted = reduce_brightness(sharpened, gamma=gamma)
cv2.imwrite('resize.png', adjusted)
# cv2.imshow('', adjusted)
# cv2.waitKey(0)

reader = easyocr.Reader(['en'], gpu=False)
result = reader.readtext('resize.png')
for detection in result:
    print(detection)
The output I got is:
([[37, 394], [994, 394], [994, 505], [37, 505]], '9 Q 9 E - C R H L G F D ] B', 0.13722358295856807)
Instead of J it recognized ].
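Since these codes seem to contain only capital letters, digits, and dashes, you can also tell EasyOCR to consider only those characters; readtext accepts an allowlist parameter, which should rule out ] entirely (the exact character set here is just my guess from your two examples):

import string

# Assumption: only uppercase letters, digits and '-' occur in these codes.
result = reader.readtext('resize.png',
                         allowlist=string.ascii_uppercase + string.digits + '-')

readtext also accepts a numpy array directly, so the cv2.imwrite round-trip through resize.png is optional.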
For the other image I changed the rotation angle to 25. The output is:
([[78, 565], [1515, 565], [1515, 678], [78, 678]], "Z G P H ' $ B X Z B 2 - 5 M B *", 0.3908300967267578)
Instead of S it picked $. Feel free to play around with the gamma, the rotation angle, and the other parameters. For higher accuracy you could go with Google Cloud Vision, but it is not open source.
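If you do not want to hand-pick 155 or 25 per image, a common way to estimate the skew automatically is to fit a rotated bounding rectangle around the foreground pixels with cv2.minAreaRect. A minimal sketch, assuming dark characters on a lighter background; note that minAreaRect's angle convention changed around OpenCV 4.5, and this only recovers the skew up to 180 degrees, so treat it as a starting point rather than a drop-in solution:

import cv2
import numpy as np
from scipy import ndimage

def estimate_skew_angle(image):
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    # Invert-threshold so the (assumed darker) characters become white.
    thresh = cv2.threshold(gray, 0, 255,
                           cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)[1]
    # Fit a rotated rectangle around all foreground pixels.
    coords = np.column_stack(np.where(thresh > 0)).astype(np.float32)
    angle = cv2.minAreaRect(coords)[-1]
    # Normalize to the smallest rotation that makes the box upright
    # (the angle range returned here differs across OpenCV versions).
    if angle > 45:
        angle -= 90
    return angle

image = cv2.imread('pze5c.png')
# Depending on your OpenCV/SciPy versions you may need to negate the angle.
deskewed = ndimage.rotate(image, estimate_skew_angle(image))

Because the code can still come out upside down, a cheap way to disambiguate is to run readtext on both the deskewed image and its 180-degree rotation and keep whichever result has the higher confidence score.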