I am using pytesseract for OCR. It works fine for JPG, JPEG and some PNG files, but it crashes on certain PNG files that are mobile screenshots. Here is my code:
img = cv2.imread('test.png',cv2.COLOR_BGR2GRAY)
custom_config = r'--oem 3 --psm 6'
data=pytesseract.image_to_string(img, config=custom_config)
print(data)
The error generated is:
Traceback (most recent call last):
  File "/home/hkc/Documents/work/opencv/cv/lib/python3.10/site-packages/PIL/Image.py", line 2992, in fromarray
    mode, rawmode = _fromarray_typemap[typekey]
KeyError: ((1, 1, 3), '<u2')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/hkc/Documents/work/opencv/test.py", line 13, in <module>
    data=pytesseract.image_to_string(img, config=custom_config)
  File "/home/hkc/Documents/work/opencv/cv/lib/python3.10/site-packages/pytesseract/pytesseract.py", line 423, in image_to_string
    return {
  File "/home/hkc/Documents/work/opencv/cv/lib/python3.10/site-packages/pytesseract/pytesseract.py", line 426, in <lambda>
    Output.STRING: lambda: run_and_get_output(*args),
  File "/home/hkc/Documents/work/opencv/cv/lib/python3.10/site-packages/pytesseract/pytesseract.py", line 277, in run_and_get_output
    with save(image) as (temp_name, input_filename):
  File "/usr/lib/python3.10/contextlib.py", line 135, in __enter__
    return next(self.gen)
  File "/home/hkc/Documents/work/opencv/cv/lib/python3.10/site-packages/pytesseract/pytesseract.py", line 197, in save
    image, extension = prepare(image)
  File "/home/hkc/Documents/work/opencv/cv/lib/python3.10/site-packages/pytesseract/pytesseract.py", line 171, in prepare
    image = Image.fromarray(image)
  File "/home/hkc/Documents/work/opencv/cv/lib/python3.10/site-packages/PIL/Image.py", line 2994, in fromarray
    raise TypeError("Cannot handle this data type: %s, %s" % typekey) from e
TypeError: Cannot handle this data type: (1, 1, 3), <u2
CodePudding user response:
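You can't use cv2.COLOR_BGR2GRAY with cv2.imread(), because all the constants starting with cv2.COLOR_XXX are for use with cv2.cvtColor().
You need to use the ones starting with cv2.IMREAD_XXX with cv2.imread().
This most likely also explains why only some PNGs fail: cv2.COLOR_BGR2GRAY has the integer value 6, which imread() reads as cv2.IMREAD_ANYDEPTH | cv2.IMREAD_ANYCOLOR, so a 16-bit PNG comes back as a 3-channel uint16 array, the ((1, 1, 3), '<u2') type that PIL cannot handle.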
So, I guess you want:
img = cv2.imread('test.png',cv2.IMREAD_GRAYSCALE)