According to https://en.wikipedia.org/wiki/ZIP_(file_format)#ZIP64
The original .ZIP format had a 4 GB (2^32 bytes) limit on various things (uncompressed size of a file, compressed size of a file, and total size of the archive), as well as a limit of 65,535 (2^16-1) entries in a ZIP archive.
Is the 2^32 value correct? By my understanding, the limit should be the maximum value that can be held in a 32-bit unsigned integer, which is 2^32-1.
I know that 2^32-1 does have particular meaning according to the ZIP spec at https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT (usually mentioned as 0xFFFFFFFF), so I don't want to assume anything.
The 2^16-1 limit for number of entries does seem right to me, as the maximum value that can be stored in a 16 bit unsigned integer.
Context: I'm writing code to write ZIP files in a streaming way in Python https://github.com/uktrade/stream-zip, as well as code to open ZIP files in a streaming way https://github.com/uktrade/stream-unzip, and I want both to handle the various limits correctly. Or, if there is no "correctly", as well as is reasonable.
CodePudding user response:
They mix up a few things in that sentence, but the limits were 2^32-1 compressed bytes as well as 2^32-1 uncompressed bytes in a single entry, and a start-of-central-directory offset of 2^32-1. And, as stated, 2^16-1 entries.
Note that the limit on the central-directory offset permits a zip file larger than 4GB, but not much larger. So the "total size of the archive" limit mentioned in the Wikipedia page is neither 4GB nor 4GB-1. The sentence would need to be broken up to provide exactly correct information.
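As a rough illustration of the limits described above, here is a minimal Python sketch. The names `fits_original_zip`, `archive_size` and `EOCD_SIZE` are hypothetical, not taken from stream-zip or any library, and the 22-byte end-of-central-directory size assumes an archive with no comment.

```python
# Limits of the original (non-ZIP64) format, as stated in the answer above.
MAX_UINT32 = 2**32 - 1  # 4,294,967,295, i.e. 0xFFFFFFFF
MAX_UINT16 = 2**16 - 1  # 65,535, i.e. 0xFFFF

# Fixed size of the end-of-central-directory record with no archive comment
# (assumption based on the record layout in APPNOTE.TXT).
EOCD_SIZE = 22


def fits_original_zip(compressed_size, uncompressed_size,
                      central_directory_offset, num_entries):
    """Hypothetical helper: True if every value fits its original-format field."""
    return (
        compressed_size <= MAX_UINT32
        and uncompressed_size <= MAX_UINT32
        and central_directory_offset <= MAX_UINT32
        and num_entries <= MAX_UINT16
    )


def archive_size(central_directory_offset, central_directory_size):
    """Total bytes: data before the central directory, the directory, then the end record."""
    return central_directory_offset + central_directory_size + EOCD_SIZE


# The offset limit does not cap the archive at 4 GB: the central directory
# and the end record sit after that offset. The 1000-byte directory size is
# an arbitrary example value.
largest_offset_archive = archive_size(MAX_UINT32, central_directory_size=1000)
```

The question points out that 0xFFFFFFFF has a particular meaning in the spec, so a cautious writer might prefer a strict `<` comparison and switch to ZIP64 one byte earlier. That choice is not settled here.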
CodePudding user response:
Elsewhere, the same Wikipedia page does include the minus 1: https://en.wikipedia.org/wiki/ZIP_(file_format)#Limits
The maximum size for both the archive file and the individual files inside it is 4,294,967,295 bytes (2^32−1 bytes, or 4 GB minus 1 byte) for standard ZIP.
Also in the ZIP specification at https://pkware.cachefly.net/webdocs/casestudies/APPNOTE.TXT, there are various mentions of the size fields being 4 bytes, e.g.
4.4.8 compressed size: (4 bytes)
4.4.9 uncompressed size: (4 bytes)
and the largest value that can be stored in each of these is 2^32−1.
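A quick way to check this in Python is with the standard `struct` module, packing a 4-byte little-endian unsigned integer (the byte order ZIP uses). This is only a sketch, and the variable names are illustrative:

```python
import struct

# '<I' is a 4-byte little-endian unsigned integer.
largest = 2**32 - 1
packed = struct.pack('<I', largest)  # all four bytes are 0xFF

# One more than that no longer fits in 4 bytes.
try:
    struct.pack('<I', 2**32)
    overflowed = False
except struct.error:
    overflowed = True
```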
And in the same spec, it says:
Maximum .ZIP segment size = 4,294,967,295 bytes
which is 2^32-1 (I guess this applies to single-segment ZIP files?)
So I think yes, there is an off-by-one error in the ZIP64 section, and the maximum size is 2^32-1.