Pytesseract not working for low resolution images

Time:10-27

I am trying to read numbers from an image with a resolution of 20x10 pixels. I know this question might be a duplicate. I've gone through most of the related questions here on Stack Overflow, but none of the answers seem to work for me. Here is the image I am trying to read text from:
[image]

Here is my current code:

import pytesseract as pt
from PIL import Image


pt.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"

img = Image.open('foo.PNG')
text = pt.image_to_string(img)
print(text)
?

I am new to pytesseract and image processing. Any suggestion or help will be greatly appreciated.

CodePudding user response:

Actually, I have to say that tesseract is very touchy to work with. In my experience, if you, as a human, are not able to read a text clearly, you shouldn't expect tesseract to read it either.

First of all, good preprocessing is a must for better results. I strongly recommend that anyone working with tesseract check their documentation on improving output quality.

[image]

Now, even if the DPI is sufficient, you lose accuracy and pick up noise.

Note: Higher resolution does not necessarily mean better results either. Please check here.

Note: If you really need to keep working with these types of images, you may want to have a look here. First upscale to a higher resolution, then apply a deblurring operation; this may help.
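
The upscale-then-deblur idea could be sketched roughly as follows. This is only a minimal sketch under some assumptions: the helper names `upscale_for_ocr` and `read_digits` are made up for illustration, the scale factor of 8 and the unsharp-mask settings are guesses to tune, and unsharp masking is a simple stand-in for a proper deblurring step. The digit whitelist is also an assumption about the input, and it may be ignored by some tesseract 4.0 builds.

```python
from PIL import Image, ImageFilter, ImageOps


def upscale_for_ocr(img, scale=8):
    """Enlarge a tiny image, then sharpen it to counter the blur from resizing."""
    gray = ImageOps.grayscale(img)
    # LANCZOS resampling interpolates smoothly; scale=8 is a guess, not a rule.
    big = gray.resize((gray.width * scale, gray.height * scale), Image.LANCZOS)
    # Unsharp masking: a simple stand-in for a real deblurring operation.
    sharp = big.filter(ImageFilter.UnsharpMask(radius=2, percent=150, threshold=3))
    # Stretch the contrast so the digits stand out from the background.
    return ImageOps.autocontrast(sharp)


def read_digits(img):
    """Run tesseract on the prepared image, assuming it holds one line of digits."""
    import pytesseract  # imported here so the preprocessing works without it

    # --psm 7 treats the image as a single text line;
    # the whitelist restricts the output to digits (an assumption about the input).
    config = "--psm 7 -c tessedit_char_whitelist=0123456789"
    return pytesseract.image_to_string(img, config=config).strip()
```

With the image from the question this would be called as `read_digits(upscale_for_ocr(Image.open('foo.PNG')))`, after setting `tesseract_cmd` as in the question's code. There is no guarantee this recovers the digits: at 20x10 pixels there may simply not be enough information left in the image.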
