How to pivot / reshape this array in Python while still enabling it to work with PIL's Image.fr

Okay, I've been beating my head against the wall on this one and hoping someone can point out the dumb thing that I'm doing.

So the code here pulls out the ASCII value from the characters in a string of text and then pieces them together into RGB values to feed into PIL to output a PNG with the pixels. The code below works, but it only outputs a single column of pixels and ideally I'd be able to restructure the output to have multiple rows. I've tried np.reshape() and that ended up stripping out the structure of three integers in an array of arrays.

I've got a loop to produce a finalArray, but when I run that - I get this error from PIL:

Exception has occurred: TypeError
Cannot handle this data type: (1, 1, 1, 3), |u1
During handling of the above exception, another exception occurred:
new_image = Image.fromarray(array)

Here's my code - certainly not the best-written code ever, would very much appreciate any guidance or advice someone can provide. Thank you!

from PIL import Image
import numpy as np

strInput = 'Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed laoreet egestas urna, sit amet laoreet risus faucibus at. Integer efficitur nulla lacus, id egestas massa tempus nec.'
masterArray = []
i = 0

for c in list(strInput):
    ascii_value = ord(c)
    i += 1
    
    if i % 3 == 1:
        first_val = ascii_value
    if i % 3 == 2:
        second_val = ascii_value
    if i % 3 == 0:
        third_val = ascii_value
        x = [(first_val, second_val, third_val)]
        masterArray.append(x)
    
tmpArray = []
finalArray = []

i = 0

for a in masterArray:
    i += 1
    tmpArray.append(a)
    if i % 4 == 0:
        finalArray.append(np.array(tmpArray, dtype=np.uint8))
        #clear the buffer
        tmpArray = []
        
array = np.array(masterArray, dtype=np.uint8)
#array = np.array(finalArray, dtype=np.uint8)

new_image = Image.fromarray(array)

new_image.save('new_2.png')

CodePudding user response:

The way you construct finalArray makes it a 4D array, which is not a structure used for 2D RGB images. 3 axes suffice; one for rows, one for columns and one for the colour channels.
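To see where the extra axes come from, here is a minimal sketch that rebuilds the same nesting on a short, made-up input (the 26-letter string and the variable names are only for illustration, not taken from the question) and prints the resulting shapes.

```python
import numpy as np

text = "abcdefghijklmnopqrstuvwxyz"  # hypothetical short input, 26 characters
codes = [ord(c) for c in text[: len(text) // 3 * 3]]  # 24 values -> 8 pixels

# Question-style nesting: each pixel is wrapped in its own one-element list.
nested = [[tuple(codes[i:i + 3])] for i in range(0, len(codes), 3)]
print(np.array(nested, dtype=np.uint8).shape)   # (8, 1, 3): 8 rows, 1 column

# Grouping those wrapped pixels four at a time adds a fourth axis.
grouped = [nested[i:i + 4] for i in range(0, len(nested), 4)]
print(np.array(grouped, dtype=np.uint8).shape)  # (2, 4, 1, 3)

# Without the extra wrapper, the same grouping gives rows x columns x channels.
flat_pixels = [tuple(codes[i:i + 3]) for i in range(0, len(codes), 3)]
rows = [flat_pixels[i:i + 4] for i in range(0, len(flat_pixels), 4)]
image_array = np.array(rows, dtype=np.uint8)
print(image_array.shape)                        # (2, 4, 3)
```

The (8, 1, 3) case is why the working version produces a single column of pixels, and the 4D case is the kind of shape that Image.fromarray() rejects in the error above.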

Let's say you list all the pixel values in one flat sequence, from left to right, then from top to bottom. The numbers would then be the RGB values for the (0, 0) pixel, RGB values for the (0, 1) pixel, ..., RGB values for the (0, width - 1) pixel, RGB values for the (1, 0) pixel, etc. You can get that flat sequence simply with

values = np.array(list(map(ord, strInput)), dtype=np.uint8)

From here, if you reshape it into triplets of values, you get the equivalent of masterArray. Of course, if the length of values is not divisible by 3, you need to trim it.

if tail := len(values) % 3:
    values = values[:-tail]
master_array = values.reshape((-1, 3))

This gives you 59 RGB pixels. You could do the same to get a 2D image by reshaping it like values.reshape((rows, columns, 3)), but for that to work, the number of pixels must equal rows x columns, so more trimming (or padding with some value) is needed. 59 is a prime number, so it can only be decomposed to 1x59 or 59x1.
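For the trimming route, a minimal sketch could look like the following. The helper name text_to_rgb_array, its columns parameter and the sample sentence are made up for illustration, and the input is assumed to be ASCII-only so that every character fits in one uint8.

```python
import numpy as np

def text_to_rgb_array(text, columns):
    """Hypothetical helper: keep only whole rows of `columns` RGB pixels."""
    # Assumes ASCII input; encode() raises UnicodeEncodeError otherwise.
    values = np.frombuffer(text.encode("ascii"), dtype=np.uint8)
    per_row = columns * 3              # values needed for one full row
    rows = len(values) // per_row      # whole rows available
    return values[: rows * per_row].reshape((rows, columns, 3))

pixels = text_to_rgb_array("The quick brown fox jumps over the lazy dog.", columns=4)
print(pixels.shape)  # (3, 4, 3): 36 of the 44 characters are kept
```

Trimming discards the tail of the text, so padding is the better choice when every character has to end up in the image.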

Example with padding

rows = 4
values = np.array(list(map(ord, strInput)), dtype=np.uint8)
# Make it a round 180 values, so we can have 60 pixels for a 4x15 image
if remainder := len(values) % (rows * 3):
    values = np.concatenate([values, np.zeros((rows * 3 - remainder,), dtype=np.uint8)])
img = Image.fromarray(values.reshape((rows, -1, 3)))
img.save('out.png')
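As a sanity check on the ordering described earlier, this sketch pads a short made-up sentence the same way and confirms that pixel (r, c) holds the three characters starting at index 3 * (r * columns + c). The sentence and variable names are assumptions for illustration, and ASCII input is assumed.

```python
import numpy as np

text = "The quick brown fox jumps over the lazy dog."  # hypothetical input, 44 chars
rows = 4

values = np.frombuffer(text.encode("ascii"), dtype=np.uint8)
pad = -len(values) % (rows * 3)          # zeros needed to fill the last pixels
values = np.concatenate([values, np.zeros(pad, dtype=np.uint8)])
pixels = values.reshape((rows, -1, 3))   # (4, 4, 3) for this input

# Row-major order: pixel (r, c) holds characters 3 * (r * columns + c) onward.
columns = pixels.shape[1]
r, c = 1, 2
start = 3 * (r * columns + c)
print(pixels[r, c].tobytes().decode("ascii") == text[start:start + 3])

# Flattening and stripping the zero padding gives the text back.
recovered = pixels.tobytes().rstrip(b"\0").decode("ascii")
print(recovered == text)
```

Stripping trailing zero bytes only works here because the text itself contains no NUL characters.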

CodePudding user response:

If what you want is to interpret the text as RGB pixels, you don't need the NumPy step at all; Image.frombytes() is enough, as long as width x height x 3 matches the number of bytes you pass in. (If you wanted a square image, you could get fancy with math.sqrt().)

from PIL import Image

text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed laoreet egestas urna, sit amet laoreet risus faucibus at. Integer efficitur nulla lacus, id egestas massa tempus nec."
data = text.encode("utf-8")
data += b"\0" * (-len(data) % 3)  # Pad with zero bytes to a whole number of RGB pixels
width = 10
height = len(data) // 3 // width  # Compute height based on width (pixel count must be a multiple of width)
img = Image.frombytes("RGB", (width, height), data)
print(img)

This prints out e.g.

<PIL.Image.Image image mode=RGB size=10x6>

and the image (severely zoomed in) looks like this: [image not included]

Saving the image in an uncompressed format such as BMP lets you see that your text is indeed in there (though BMPs, such as the one here, store the rows bottom-up and each pixel's bytes in blue-green-red order, so the text looks scrambled in the dump).

~ $ xxd output.bmp
00000000: 424d f600 0000 0000 0000 3600 0000 2800  BM........6...(.
00000010: 0000 0a00 0000 0600 0000 0100 1800 0000  ................
00000020: 0000 c000 0000 c40e 0000 c40e 0000 0000  ................
00000030: 0000 0000 0000 2064 6965 6765 6174 736d  ...... diegeatsm
00000040: 2073 7373 6174 2061 706d 6520 7375 6365   sssat apme suce
00000050: 6e00 002e 0000 6574 6e72 6567 6665 2063  n.....etnregfe c
00000060: 6966 7574 696e 2072 6c6c 756c 2061 7563  ifutin rllul auc
00000070: 6120 2c73 0000 6c20 7472 6f61 7465 6569  a ,s..l troateei
00000080: 7220 7375 7361 6620 6963 7573 7562 7461  r susaf icusubta
00000090: 2049 202e 0000 616c 2065 726f 2074 6565   I ...al ero tee
000000a0: 6765 6174 7375 2073 616e 7273 202c 2074  geatsu sanrs , t
000000b0: 6965 6d61 0000 6573 6e65 7463 7275 7464  iema..esnetcrutd
000000c0: 6120 6970 6969 6373 2067 6e69 6c65 202e  a ipiics gnile .
000000d0: 7464 6553 0000 726f 4c20 6d65 7370 6920  tdeS..roL mespi
000000e0: 6d75 6c6f 6420 726f 7469 736d 6120 2c74  mulod rotisma ,t
000000f0: 656f 6320 0000                           eoc ..
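The scrambled look of the dump follows from how an uncompressed 24-bit BMP lays out pixels: rows are stored bottom-up, each pixel is written as blue, green, red, and every row is padded to a multiple of 4 bytes. The sketch below reproduces that layout in plain Python; bmp_pixel_rows is a hypothetical helper written for illustration, not part of PIL, and it covers only the pixel payload, not the 54-byte header.

```python
def bmp_pixel_rows(rgb_data, width):
    """Hypothetical helper: arrange raw RGB bytes like a 24-bit BMP payload."""
    stride = width * 3
    rows = [rgb_data[i:i + stride] for i in range(0, len(rgb_data), stride)]
    out = bytearray()
    for row in reversed(rows):              # the bottom row is stored first
        for i in range(0, len(row), 3):
            out += row[i:i + 3][::-1]       # RGB -> BGR
        out += b"\0" * (-stride % 4)        # pad each row to a multiple of 4 bytes
    return bytes(out)

text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Sed laoreet egestas urna, sit amet laoreet risus faucibus at. Integer efficitur nulla lacus, id egestas massa tempus nec."
data = text.encode("utf-8")
data += b"\0" * (-len(data) % 3)            # same zero padding as above
payload = bmp_pixel_rows(data, width=10)
print(payload[:12])                         # b' diegeatsm s'
```

Those first bytes match the dump from offset 0x36 onward: "id egestas m" from the last row of the image, with each triplet reversed.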