zlib error code -3 while using zlib to decompress PDF FlateDecode stream

Time:12-19

I am trying to extract some information from a PDF file. One 12-byte stream, compressed with FlateDecode, fails to decompress, although the other streams in the document decompress readily with the same Python 3.9 program.

The stream comes from a US Government PDF document (an FAA instrument procedures plate) that opens without issue in Adobe Acrobat.

The excellent RUPS program for inspecting PDFs, written by the author of iText, also appears to have difficulty decoding this stream: it shows only a single character from the 12-byte stream.

import zlib

hexDigits = "78 9c e3 2a e4 e5 02 20 01 a3 20 93"
stripWhitespace = hexDigits.replace(" ", "")

myByteArray = bytearray.fromhex(stripWhitespace)
data = zlib.decompress(myByteArray) # Here I get Error -3 while decompressing data: incorrect data check
print(data)

CodePudding user response:

You may be extracting or decoding the Flate data incorrectly. There appear to be spaces (0x20) where there should be nulls (0x00). If I change both 0x20 bytes to 0x00, the zlib stream is valid.
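To check this yourself, here is a minimal sketch. It assumes the only damage is the two 0x20 bytes; the variable names are mine, and the blanket `replace` is for illustration only, not a general repair (a real stream may contain legitimate 0x20 bytes). The second half inflates the raw deflate body without the zlib wrapper and compares the stored Adler-32 trailer with the one computed from the output, which is the comparison behind "incorrect data check".

```python
import zlib

# Bytes exactly as posted in the question (bytes.fromhex skips the spaces).
original = bytes.fromhex("78 9c e3 2a e4 e5 02 20 01 a3 20 93")

# Assumption: the two 0x20 bytes were originally 0x00.
# Replacing every 0x20 is only safe here because no other byte is 0x20.
repaired = original.replace(b"\x20", b"\x00")

data = zlib.decompress(repaired)
print(data)  # b'\nq\r\n'

# Diagnostic on the unmodified bytes: skip the 2-byte zlib header and the
# 4-byte Adler-32 trailer, and inflate the raw deflate body (negative wbits).
raw = zlib.decompressobj(wbits=-zlib.MAX_WBITS).decompress(original[2:-4])

stored = int.from_bytes(original[-4:], "big")  # checksum stored in the stream
computed = zlib.adler32(raw)                   # checksum of the inflated data
print(raw, hex(stored), hex(computed))
```

If the stored and computed checksums differ only where a 0x00 became 0x20, the bytes were most likely altered by the extraction step (for example, by treating the binary stream as text) rather than by the PDF itself.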
