Pass PIL Image to google cloud vision without saving and reading


Is there a way to pass a PIL Image to google cloud vision?

I tried io.BytesIO, io.StringIO and Image.tobytes(), but I always get:

Traceback (most recent call last):
  "C:\Users\...\vision_api.py", line 20, in get_text
    image = vision.Image(content)
  File "C:\...\venv\lib\site-packages\proto\message.py", line 494, in __init__
    raise TypeError(
TypeError: Invalid constructor input for Image:b'Ma\x81Ma\x81La\x81Ma\x81Ma\x81Ma\x81Ma\x81Ma\x81Ma\x81Ma\x81Ma\x81La\x81Ma\x81Ma\x81Ma\x81Ma\x80Ma\x81La\x81Ma\x81Ma\x81Ma\x80Ma\x81Ma\x81Ma\x81Ma\x8 ...

or this if I pass the PIL-Image directly:

TypeError: Invalid constructor input for Image: <PIL.Image.Image image mode=RGB size=480x300 at 0x1D707131DC0>

This is my code:

from PIL import Image
from google.cloud import vision

image = Image.open(path).convert('RGB')             # Opening the saved image
cropped_image = image.crop((30, 900, 510, 1200))    # Cropping the image

vision_image = vision.Image(...)                    # Here I passed the different options; I need to pass the image, but I don't know how
client = vision.ImageAnnotatorClient()
response = client.text_detection(image=vision_image)   # Text detection using google-vision-api

FOR CLARITY:

I want Google text detection to analyse only a certain part of an image saved on my disk. My idea was to crop the image using PIL and then pass the cropped image to google-vision. But it is not possible to pass a PIL Image to vision.Image, as I get the errors above.

The documentation from Google.

This can be found in the vision.Image class:

Attributes:
        content (bytes):
            Image content, represented as a stream of bytes. Note: As
            with all ``bytes`` fields, protobuffers use a pure binary
            representation, whereas JSON representations use base64.

            Currently, this field only works for BatchAnnotateImages
            requests. It does not work for AsyncBatchAnnotateImages
            requests.

A working option is to save the PIL-Image as a PNG/JPG on my disk and load it using:

with io.open(file_name, 'rb') as image_file:
    content = image_file.read()

vision_image = vision.Image(content=content)

But this is slow and seems unnecessary, and the whole point of using the google-vision-api for me is its speed compared to OpenCV.

CodePudding user response:

It would be good to have the whole error stack and a more accurate code snippet. But from the presented information, this seems to be a confusion of two different "Images". It is probably a copy/paste error, as the tutorials have exactly the same line:

response = client.text_detection(image=image)

But in the mentioned tutorials, the image is created by vision.Image(), so I think in the presented code this should be:

response = client.text_detection(image=vision_image)

As far as I understand the code snippet, image is a PIL Image, while vision_image is the Vision Image that should be passed to the text_detection method. So whatever is done in vision.Image() has no effect on the error message.
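
To illustrate the distinction this answer is drawing, here is a minimal sketch (path and data are placeholders, not values from the original post):

from PIL import Image
from google.cloud import vision

pil_image = Image.open(path)                 # PIL Image: used for cropping, resizing, etc.
vision_image = vision.Image(content=data)    # Vision Image: a protobuf message wrapping raw image bytes

client = vision.ImageAnnotatorClient()
response = client.text_detection(image=vision_image)   # text_detection expects the Vision Image, not the PIL one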

CodePudding user response:

As far as I can tell, you start off with a PIL Image and you want to obtain a PNG image in memory without going to disk. So you need this:

#!/usr/bin/env python3

from PIL import Image
from io import BytesIO

# Create PIL Image like you have - filled with red
im = Image.new('RGB', (320,240), (255,0,0))

# Create in-memory PNG - like you want for Google Cloud Vision
buffer = BytesIO()
im.save(buffer, format="PNG")

# Look at first few bytes
PNG = buffer.getvalue()
print(PNG[:20])

It prints this, which is exactly what you would get if you wrote the image to disk as a PNG and then read it back as binary - except this does it in memory without going to disk:

b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR\x00\x00\x01@'
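
From there it is only a small step to the original goal: pass buffer.getvalue() as the content of a vision.Image and run text detection on it. A minimal end-to-end sketch, assuming the google-cloud-vision client library is installed and credentials are configured (the file name page.png is a placeholder; the crop box is the one from the question):

from io import BytesIO

from PIL import Image
from google.cloud import vision

# Crop the region of interest with PIL
image = Image.open('page.png').convert('RGB')
cropped_image = image.crop((30, 900, 510, 1200))

# Encode the cropped image as PNG in memory, without touching the disk
buffer = BytesIO()
cropped_image.save(buffer, format='PNG')

# Pass the raw bytes to the Vision API
client = vision.ImageAnnotatorClient()
vision_image = vision.Image(content=buffer.getvalue())
response = client.text_detection(image=vision_image)

print(response.text_annotations[0].description if response.text_annotations else '')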