I am trying to read numbers from an image with 20x10 resolution. I know this question might be a duplicate. I've gone through most of the questions here on Stack Overflow, but none of the answers seem to work for me.
Here is the image I am trying to read text from:
Here is my current code:
import pytesseract as pt
from PIL import Image
pt.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"
img = Image.open('foo.PNG')
text = pt.image_to_string(img)
print(text)
?
I am new to pytesseract and image processing. Any suggestion or help will be greatly appreciated.
CodePudding user response:
Actually, I have to say that tesseract is very touchy to work with. In my experience, if you, as a human, cannot read a text clearly, you shouldn't expect tesseract to read it either.
First of all, good preprocessing is a must if you want better results. I strongly recommend anyone dealing with tesseract to check their
Even if the DPI requirement is satisfied, you are still losing accuracy and picking up noise.
Note: Higher resolution does not automatically mean better results either. Please check here.
Note: If you really need to keep working with these types of images, you may want to have a look here. First upscale the image to a higher resolution, then apply a deblurring operation; this may help.
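As a rough illustration of the upscale-then-deblur idea, here is a minimal sketch using Pillow. It is only one way to do it and is not guaranteed to make a 20x10 image readable. The function names `preprocess_for_ocr` and `read_digits`, the scale factor of 10, the unsharp-mask settings, and the threshold of 128 are all assumptions chosen for illustration, and unsharp masking is a simple stand-in for a proper deblurring step. The `--psm 7` option (treat the image as a single text line) and the `tessedit_char_whitelist` setting are standard tesseract options, but the whitelist may be ignored by some tesseract 4.0 builds.

```python
from PIL import Image, ImageFilter, ImageOps


def preprocess_for_ocr(img, scale=10):
    """Upscale, sharpen, and binarize a tiny image before OCR (illustrative values)."""
    gray = img.convert("L")
    # Upscale first: a 20x10 image gives tesseract very few pixels per glyph.
    big = gray.resize((gray.width * scale, gray.height * scale), Image.LANCZOS)
    # Unsharp masking as a simple stand-in for a deblurring step.
    sharp = big.filter(ImageFilter.UnsharpMask(radius=2, percent=150, threshold=3))
    # Stretch the contrast, then threshold to pure black and white.
    sharp = ImageOps.autocontrast(sharp)
    return sharp.point(lambda p: 255 if p > 128 else 0)


def read_digits(path):
    """Run tesseract on the preprocessed image, restricted to digits."""
    import pytesseract as pt  # imported here so preprocessing works without tesseract

    img = preprocess_for_ocr(Image.open(path))
    # Assumption: the image holds a single line of digits.
    config = "--psm 7 -c tessedit_char_whitelist=0123456789"
    return pt.image_to_string(img, config=config).strip()
```

You would call it as `print(read_digits('foo.PNG'))` after setting `tesseract_cmd` as in the question. It is worth saving the preprocessed image and looking at it yourself: if you still cannot read the digits, tesseract most likely cannot either.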