This is just a part of the project I'm currently working on. I am trying to convert a picture into text, and then convert the text back into the picture without any loss or extra size. First, I open the picture, read the pixels, and write them down. The pictures are N x N in size.
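from PIL import Image
import sys
import zlib

def rgb_to_hex(rgb):
    # Format an (R, G, B) tuple as six hex digits, e.g. (255, 0, 128) -> 'ff0080'
    return '%02x%02x%02x' % rgb

im = Image.open(r"path\\pic.png")
N = im.width  # read the size only after the image has been opened
px = im.load()
read_pixels = ""
for i in range(N):
    for j in range(N):
        # Append each pixel so the string holds the whole image
        read_pixels += rgb_to_hex(px[j, i])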
Then, transform the string into bytes.
data = bytes.fromhex(read_pixels)
img = Image.frombytes("RGB", (N,N), data)
img.save("path\\new.png",quality = 92)
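For comparison, here is a compact sketch of the same pixel-to-text round trip. It assumes an RGB image and uses a tiny synthetic picture instead of a file on disk; the helper names image_to_hex and hex_to_image are made up for illustration.

```python
from PIL import Image


def image_to_hex(im):
    """Return the RGB pixel data of an image as one long hex string."""
    return im.convert("RGB").tobytes().hex()


def hex_to_image(hex_text, size):
    """Rebuild an RGB image of the given (width, height) from a hex string."""
    return Image.frombytes("RGB", size, bytes.fromhex(hex_text))


# Hypothetical 4x4 test picture standing in for the real file.
original = Image.new("RGB", (4, 4), (18, 52, 86))
original.putpixel((1, 2), (255, 0, 128))

text = image_to_hex(original)
restored = hex_to_image(text, original.size)

# The pixel data should survive the trip through text unchanged.
print(restored.tobytes() == original.tobytes())
```

This only checks the in-memory pixels. What ends up on disk still depends on the file format and the options passed to save.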
According to the official Pillow documentation, quality ranges from 0 to 100, and values over 95 should be avoided. If nothing is set, the default value is 75.
For example, I used this picture. The original photo takes up 917 KB when downloaded. When the picture is converted by the program, the new picture takes up 911 KB. If I then take the new picture (911 KB) and run it through the same program, I get back the same size, 911 KB. This one did not shrink by a few KB, and I do not know why. Why does this odd behavior happen only when I feed in the original 917 KB picture? Is there a way to get 100% of the original quality?
I also tried this on a random 512x512 .jpg picture. The original size of that picture is 67.4 KB, the next "generation" of that picture is 67.1 KB, and the one after that is 66.8 KB. Also, if I change quality to 93 or above (when using .jpg), the size goes up by a lot (at quality = 100, size > 135 KB). I was 'playing' around with the quality value and found that the closest to the same size is 92 (<93 puts some extra KB for .jpg).
So with quality 92, the .PNG size stays the same after the first "generation", but with .jpg the size (and potentially the quality) goes down.
Is there something I am missing in my code? My best guess is that .PNG stores some extra information about the picture which is lost in the conversion, but I am not sure why the .jpg pictures decrease in size every generation. I tried using a quality of 92.5, but the function does not accept decimal numbers as parameters.
Answer:
Quick takeaways from the following explanations...
- The quality parameter for PIL.Image.save isn't used when saving PNGs.
- JPEG is generationally lossy, so as you keep re-saving images, they will likely degrade in quality because the algorithm will introduce more artifacting (among other things).
- PNG is lossless, and the file size differences you're seeing are due to PIL stripping metadata when you re-save your image.
Let's look at your PNG file first. PNG is a lossless format - the image data you give it will not suffer generational loss if you were to open it and re-save it as PNG over and over again.
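As a rough check of the lossless claim, the sketch below re-saves a small synthetic image as PNG five times in memory and compares the decoded pixels with the original. The image contents and the helper name resave_png are made up for illustration.

```python
import io

from PIL import Image


def resave_png(im):
    """Write an image to an in-memory PNG and decode it again."""
    buf = io.BytesIO()
    im.save(buf, format="PNG")
    buf.seek(0)
    reopened = Image.open(buf)
    reopened.load()  # decode fully before the buffer goes out of scope
    return reopened


# Hypothetical 32x32 test pattern standing in for a real picture.
width, height = 32, 32
pixels = bytes(
    (x * y + channel * 37) % 256
    for y in range(height)
    for x in range(width)
    for channel in range(3)
)
original = Image.frombytes("RGB", (width, height), pixels)

current = original
for _ in range(5):  # five PNG "generations"
    current = resave_png(current)

# Pixel data should be identical after any number of PNG generations.
print(current.tobytes() == original.tobytes())
```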
The quality parameter isn't even recognized by the PNG plugin to PIL - if you look at the PngImagePlugin.py/PngStream._save method it is never referenced in there.
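One way to convince yourself of this is to encode the same image as PNG with different quality values and compare the resulting bytes. This is a minimal sketch on a synthetic gradient (not your sample image), and png_bytes is a hypothetical helper.

```python
import io

from PIL import Image


def png_bytes(im, **save_options):
    """Encode an image as PNG in memory and return the raw file bytes."""
    buf = io.BytesIO()
    im.save(buf, format="PNG", **save_options)
    return buf.getvalue()


# Hypothetical stand-in for sample.png: a small synthetic gradient.
width, height = 64, 64
pixels = bytes(
    value
    for y in range(height)
    for x in range(width)
    for value in (x * 4 % 256, y * 4 % 256, (x + y) * 2 % 256)
)
im = Image.frombytes("RGB", (width, height), pixels)

no_quality = png_bytes(im)
low_quality = png_bytes(im, quality=1)
high_quality = png_bytes(im, quality=95)

# If quality is ignored for PNG, all three encodings are byte-for-byte equal.
print(no_quality == low_quality == high_quality)
```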
What's happening with your specific sample image is that Pillow is dropping some metadata when you re-save it in your code.
On my test system, I have your PNG saved as sample.png, and I did a simple load-and-save with the following code, saving the result as output.png (inside ipython):
In [1]: from PIL import Image
In [2]: img = Image.open("sample.png")
In [3]: img.save("output.png")
Now let's look at the differences between their metadata with ImageMagick:
#> diff <(magick identify -verbose output.png) <(magick identify -verbose sample.png)
7c7,9
< Units: Undefined
---
> Resolution: 94.48x94.48
> Print size: 10.8383x10.8383
> Units: PixelsPerCentimeter
74c76,78
< Orientation: Undefined
---
> Orientation: TopLeft
> Profiles:
> Profile-exif: 5218 bytes
76,77c80,81
< date:create: 2022-08-12T21:27:13+00:00
< date:modify: 2022-08-12T21:27:13+00:00
---
> date:create: 2022-08-12T21:23:42+00:00
> date:modify: 2022-08-12T21:23:31+00:00
78a83,85
> exif:ImageDescription: IMGP5493_seamless_2.jpg
> exif:ImageLength: 1024
> exif:ImageWidth: 1024
84a92
> png:pHYs: x_res=9448, y_res=9448, units=1
85a94,95
> png:text: 1 tEXt/zTXt/iTXt chunks were found
> png:text-encoded profiles: 1 were found
86a97
> unknown: nomacs - Image Lounge 3.14
90c101
< Filesize: 933730B
---
> Filesize: 939469B
93c104
< Pixels per second: 42.9936MP
---
> Pixels per second: 43.7861MP
You can see there are metadata differences - PIL didn't retain some of the information when re-saving the image, especially some exif properties (you can see this PNG was actually converted from a JPG and the EXIF metadata was preserved in the conversion).
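The same effect can be reproduced without ImageMagick. The sketch below builds a small PNG that carries a text chunk, re-saves it with a plain save call, and compares what Pillow reports in info before and after. The "Software" value and the png_bytes helper are made up for illustration.

```python
import io

from PIL import Image
from PIL.PngImagePlugin import PngInfo


def png_bytes(im, **save_options):
    """Encode an image as PNG in memory and return the raw file bytes."""
    buf = io.BytesIO()
    im.save(buf, format="PNG", **save_options)
    return buf.getvalue()


# Build a hypothetical PNG that carries one text chunk.
meta = PngInfo()
meta.add_text("Software", "example editor 1.0")
with_meta = png_bytes(Image.new("RGB", (32, 32), (200, 100, 50)), pnginfo=meta)

# Open it and re-save it the plain way, as in the load-and-save above.
reopened = Image.open(io.BytesIO(with_meta))
reopened.load()
resaved = png_bytes(reopened)

after = Image.open(io.BytesIO(resaved))
after.load()

print(reopened.info.get("Software"))  # text chunk present in the source
print(after.info.get("Software"))     # missing after the plain re-save
print(len(with_meta), len(resaved))   # the re-saved file is a little smaller
print(after.tobytes() == reopened.tobytes())  # pixels are untouched
```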
However, if you re-save the image with the original image's info data...
In [1]: from PIL import Image
In [2]: img = Image.open("sample.png")
In [3]: img.save("output-with-info.png", info=img.info)
You'll see that the two re-saved files are exactly the same, which suggests that passing info did not change what was written:
❯ sha256sum output.png output-with-info.png
37ad78a7b7000c9430f40d63aa2f0afd2b59ffeeb93285b12bbba9c7c3dec4a2 output.png
37ad78a7b7000c9430f40d63aa2f0afd2b59ffeeb93285b12bbba9c7c3dec4a2 output-with-info.png
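If the goal is to actually carry text metadata over into the new file, then as far as I know Pillow's PNG writer takes text chunks from a pnginfo argument. The sketch below copies the text chunks of an opened PNG into a PngInfo object and passes it on save. The source image and the helper name copy_text_chunks are hypothetical, and other metadata (EXIF, resolution) is not covered here.

```python
import io

from PIL import Image
from PIL.PngImagePlugin import PngInfo


def copy_text_chunks(im):
    """Collect the text chunks of an opened PNG into a PngInfo object."""
    meta = PngInfo()
    for key, value in getattr(im, "text", {}).items():
        meta.add_text(key, value)
    return meta


# Hypothetical source PNG with one text chunk, built in memory.
source_meta = PngInfo()
source_meta.add_text("Description", "hypothetical example")
source_buf = io.BytesIO()
Image.new("RGB", (16, 16), (10, 20, 30)).save(
    source_buf, format="PNG", pnginfo=source_meta
)

source_buf.seek(0)
img = Image.open(source_buf)
img.load()

# Re-save while explicitly passing the text chunks along.
out_buf = io.BytesIO()
img.save(out_buf, format="PNG", pnginfo=copy_text_chunks(img))

out_buf.seek(0)
resaved = Image.open(out_buf)
resaved.load()
print(resaved.text)
```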
Maybe Reducing PNG File Size
While lossless, the PNG format does allow for reducing the size of the image by specifying how aggressive the compression is (there are also more advanced things you could do like specifying a compression dictionary).
PIL exposes these options as optimize and compress_level under PNG options.
optimize
    If present and true, instructs the PNG writer to make the output file as small as possible. This includes extra processing in order to find optimal encoder settings.
compress_level
    ZLIB compression level, a number between 0 and 9: 1 gives best speed, 9 gives best compression, 0 gives no compression at all. Default is 6. When optimize option is True compress_level has no effect (it is set to 9 regardless of a value passed).
And seeing it in action...
from PIL import Image
img = Image.open("sample.png")
img.save("optimized.png", optimize=True)
The resulting image I get is about 75K smaller than the original.
❯ ls -lh optimized.png sample.png
-rw-r--r-- 1 wkl staff 843K Aug 12 18:10 optimized.png
-rw-r--r-- 1 wkl staff 918K Aug 12 17:23 sample.png
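To get a feel for how much these options matter, here is a small sketch that encodes one synthetic gradient with several settings and compares the sizes in memory. The image and the helper name png_size are made up, and the exact numbers will differ for real photos.

```python
import io

from PIL import Image


def png_size(im, **save_options):
    """Return the size in bytes of the image encoded as PNG."""
    buf = io.BytesIO()
    im.save(buf, format="PNG", **save_options)
    return len(buf.getvalue())


# Hypothetical 128x128 gradient standing in for sample.png.
width, height = 128, 128
pixels = bytes(
    value
    for y in range(height)
    for x in range(width)
    for value in (x * 2 % 256, y * 2 % 256, (x + y) % 256)
)
gradient = Image.frombytes("RGB", (width, height), pixels)

sizes = {
    "compress_level=0": png_size(gradient, compress_level=0),
    "default": png_size(gradient),
    "compress_level=9": png_size(gradient, compress_level=9),
    "optimize=True": png_size(gradient, optimize=True),
}
for label, size in sizes.items():
    print(f"{label:>18}: {size} bytes")
```

In every case the decoded pixels are the same. Only the effort spent on compression changes.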
JPEG File
Now, JPEG is a generationally-lossy image format. As you save it over and over, you will keep losing quality. It doesn't matter if your subsequent generations are saved at even higher qualities than the previous ones, because the data was already lost in the previous saves.
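The sketch below illustrates this on a synthetic noise image, which is a deliberately harsh case for JPEG. It saves one generation at quality 92, then a second at quality 100, and measures how far each is from the original. The helper names and the test image are hypothetical.

```python
import io
import random

from PIL import Image, ImageChops


def make_noise_image(size=64, seed=0):
    """Build a deterministic pseudo-random RGB image (hypothetical test data)."""
    rng = random.Random(seed)
    data = bytes(rng.randrange(256) for _ in range(size * size * 3))
    return Image.frombytes("RGB", (size, size), data)


def resave_jpeg(im, quality):
    """Encode to JPEG in memory and return the decoded result."""
    buf = io.BytesIO()
    im.save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    decoded = Image.open(buf)
    decoded.load()
    return decoded


def max_channel_diff(a, b):
    """Largest absolute per-channel difference between two images."""
    extrema = ImageChops.difference(a, b).getextrema()
    return max(high for _low, high in extrema)


original = make_noise_image()
gen1 = resave_jpeg(original, quality=92)
gen2 = resave_jpeg(gen1, quality=100)  # higher quality, but saved from gen1

diff_gen1 = max_channel_diff(original, gen1)
diff_gen2 = max_channel_diff(original, gen2)
print(diff_gen1, diff_gen2)  # both non-zero: the original is not recovered
```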
Note that the likely reason you saw file sizes balloon with quality=100 is that libjpeg/libjpeg-turbo (the underlying libraries used by PIL for JPEG) do not do certain things when the quality is set that high. I think quantization is effectively skipped, and that step is important in determining how many bits are needed to compress the image.
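You can look at this yourself by comparing file sizes and quantization tables at a few quality settings. This is a rough sketch on synthetic noise; jpeg_stats is a hypothetical helper, and it relies on the quantization attribute that Pillow exposes on opened JPEG files. I would expect the table entries to shrink as quality goes up, so less data is discarded and the file grows.

```python
import io
import random

from PIL import Image


def jpeg_stats(im, quality):
    """Return (file size in bytes, largest quantization table entry)."""
    buf = io.BytesIO()
    im.save(buf, format="JPEG", quality=quality)
    data = buf.getvalue()
    encoded = Image.open(io.BytesIO(data))
    largest_entry = max(max(table) for table in encoded.quantization.values())
    return len(data), largest_entry


# Hypothetical 64x64 noise image as test data.
rng = random.Random(0)
noise = Image.frombytes(
    "RGB", (64, 64), bytes(rng.randrange(256) for _ in range(64 * 64 * 3))
)

stats = {quality: jpeg_stats(noise, quality) for quality in (75, 92, 100)}
for quality, (size, largest_entry) in stats.items():
    print(f"quality={quality}: {size} bytes, largest table entry {largest_entry}")
```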