Original ZIP size limits 2^32 vs 2^32 - 1. Is there an off by 1 error in Wikipedia?

Time:01-03

According to https://en.wikipedia.org/wiki/ZIP_(file_format)#ZIP64

The original .ZIP format had a 4 GB (2^32 bytes) limit on various things (uncompressed size of a file, compressed size of a file, and total size of the archive), as well as a limit of 65,535 (2^16-1) entries in a ZIP archive.

Is the 2^32 value correct? By my understanding, the limit should be the maximum value that can be held in a 32-bit unsigned integer, which is 2^32-1.

I know that 2^32-1 does have particular meaning according to the ZIP spec at https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT (usually mentioned as 0xFFFFFFFF), so I don't want to assume anything.
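To make the two candidate readings concrete, here is a minimal Python sketch. The function names are hypothetical and not taken from stream-zip or stream-unzip. The conservative variant assumes that 0xFFFFFFFF should be avoided as a real size because the spec uses it as a marker in ZIP64 archives; whether a writer needs to be that strict is a judgment call, not something this sketch settles.

```python
# Hypothetical helper names; only the constants come from the question.
MAX_UINT32 = 2**32 - 1  # 4,294,967,295, i.e. 0xFFFFFFFF
MAX_UINT16 = 2**16 - 1  # 65,535, i.e. 0xFFFF


def fits_strict(size: int) -> bool:
    """Strict reading: any value a 32-bit unsigned field can hold."""
    return 0 <= size <= MAX_UINT32


def fits_conservative(size: int) -> bool:
    """Conservative reading (an assumption): treat 0xFFFFFFFF itself as
    reserved, since it doubles as a marker value in ZIP64 archives."""
    return 0 <= size < MAX_UINT32
```

Under either reading, 2^32 itself does not fit; the two only disagree about the single value 2^32-1.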

The 2^16-1 limit for the number of entries does seem right to me, as it is the maximum value that can be stored in a 16-bit unsigned integer.

Context: I'm writing code to write ZIP files in a streaming way in Python https://github.com/uktrade/stream-zip, as well as code to open ZIP files in a streaming way https://github.com/uktrade/stream-unzip, and I want both to handle the various limits correctly. Or, if there is no single "correct" behaviour, as well as is reasonable.

CodePudding user response:

They mix up a few things in that sentence, but the limits were 2^32-1 compressed bytes as well as 2^32-1 uncompressed bytes in a single entry, and a start-of-central-directory offset of 2^32-1. And, as stated, 2^16-1 entries.

Note that the limit on the central-directory offset permits a zip file larger than 4GB, but not much larger. So the "total size of the archive" limit mentioned in the Wikipedia page is neither 4GB nor 4GB-1. The sentence would need to be broken up to provide exactly correct information.
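As a rough illustration of that point, the arithmetic can be sketched in Python. The function name, the entry count, and the file name length used below are made up for illustration, and the sketch assumes no extra fields and no comments, so the central directory consists only of the 46-byte fixed part of each central header plus the file name, followed by the 22-byte fixed part of the end of central directory record.

```python
MAX_UINT32 = 2**32 - 1      # largest start-of-central-directory offset
CENTRAL_HEADER_FIXED = 46   # fixed part of one central directory header
EOCD_FIXED = 22             # fixed part of the end of central directory record


def max_archive_size(entries: int, name_len: int) -> int:
    """Largest non-ZIP64 archive under the assumptions above (hypothetical helper)."""
    # Assumption: every entry has a name of name_len bytes, no extra field, no comment.
    central_dir_size = entries * (CENTRAL_HEADER_FIXED + name_len)
    # The central directory may start as late as offset 2^32-1 and is
    # followed only by the end of central directory record.
    return MAX_UINT32 + central_dir_size + EOCD_FIXED


# Illustrative values: the maximum entry count with 100-byte names.
size = max_archive_size(entries=2**16 - 1, name_len=100)
print(size, size - 2**32)
```

With these illustrative values the archive ends up a few megabytes past 2^32 bytes, which is the sense in which the total can exceed 4 GB but not by much.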

CodePudding user response:

Elsewhere, the same Wikipedia page does include the minus 1: https://en.wikipedia.org/wiki/ZIP_(file_format)#Limits

The maximum size for both the archive file and the individual files inside it is 4,294,967,295 bytes (2^32−1 bytes, or 4 GB minus 1 byte) for standard ZIP.

Also in the ZIP specification at https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT, there are various mentions of the size fields being 4 bytes, e.g.

4.4.8 compressed size: (4 bytes)
4.4.9 uncompressed size: (4 bytes)

and the largest values that can be stored in these are 2^32−1
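A quick way to check this is with Python's struct module, packing the value as a little-endian 4-byte unsigned integer the way the size fields are laid out. This is only a sketch of the field width, not of how any particular ZIP library writes its headers, and the variable names are made up.

```python
import struct

MAX_UINT32 = 2**32 - 1

# "<I" is a little-endian, 4-byte unsigned integer.
packed = struct.pack("<I", MAX_UINT32)
print(packed)  # four 0xFF bytes

# One more than the maximum no longer fits in 4 bytes.
try:
    struct.pack("<I", 2**32)
    overflowed = False
except struct.error:
    overflowed = True
print(overflowed)
```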

And in the same spec, it says:

Maximum .ZIP segment size = 4,294,967,295 bytes 

which is 2^32-1 (I guess applicable for single-segment ZIP files?)

So I think yes, there is an off-by-one error in the ZIP64 section, and the maximum size is 2^32-1.
