Does Swift suffer from Java's 2 GB max serialization size problem?

Time:12-02

In Java and other JVM-based languages there's often a need to serialize things into an array of bytes. For example, when you want to send something over the network, you first serialize it into an array of bytes.

Do people do that in Swift? Or how is data usually serialized when you want to send it over the network?

The problem is that byte[] and other arrays are indexed using ints, and an array's length is also given as an int, for example: byte[] a = new byte[your int goes here]. You can't create an array using a long (64-bit integer), so the maximum array size is limited by the maximum int, which is 2,147,483,647 (in practice the limit is slightly lower and depends on the JVM). The biggest array of type byte[] can therefore only store about 2 GB of data.

When I use Spark (a distributed computing library) and create a Dataset, each element has to occupy no more than 2 GB of RAM, because it gets serialized internally when data is sent from one node of the cluster to another. When I work with huge objects, I am forced to split one big object into smaller ones to avoid serialization exceptions.

I think C# and many other languages have the same issue (see the question "Is the 32-bit .NET max byte array size < 2GB?").

Am I right in saying that Swift doesn't suffer from this issue, since arrays are indexed using Int (which is 64 bits on a 64-bit system), and byte arrays can be of size min(Int.max, maximum_number_available_by_the_os)?

CodePudding user response:

Yes, you are correct. Swift's Int type, the preferred type for integer bounds, is word-sized, i.e., 64 bits on a 64-bit machine and 32 bits on a 32-bit machine. This means that when indexing into an Array on a 64-bit machine, you can go well beyond the 2^31-1 limit.
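If you want to check this on your own machine, here is a minimal sketch. The variable names are only illustrative, and it deliberately uses a tiny array so it runs anywhere; it shows the types involved rather than allocating more than 2 GB.

```swift
// Int is word-sized: 64 bits on a 64-bit platform, 32 bits on a 32-bit one.
let bits = Int.bitWidth
print("Int is \(bits) bits wide; Int.max = \(Int.max)")

// Array is counted and indexed with Int, not with a fixed 32-bit type.
let bytes = [UInt8](repeating: 0, count: 16)
let count: Int = bytes.count
let lastIndex: Int = bytes.endIndex - 1   // Array<UInt8>.Index is Int
print(count, lastIndex)

// Java's int maximum for comparison: Int32.max is 2^31 - 1.
let javaLimit = Int(Int32.max)
print("Java int max:", javaLimit)
```

Whether an allocation larger than that actually succeeds still depends on how much memory the OS is willing to give the process.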

And while higher-level types like Foundation's Data or NIOCore.ByteBuffer from swift-nio are idiomatically preferred as "bag of bytes" types over [UInt8] (the Swift equivalent of byte[]), these types are also indexed using Int, and so are also not limited to 2 GiB in size.
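As a rough illustration for Foundation's Data (the names payload, size and range are made up for this sketch, and the contents are arbitrary), the count and indices are plain Int values, just like with Array:

```swift
import Foundation

// Data is Foundation's general-purpose byte buffer.
var payload = Data([0x01, 0x02, 0x03])
payload.append(contentsOf: [0x04, 0x05])

// count, startIndex and endIndex are all Int, the same word-sized type Array uses.
let size: Int = payload.count
let range: Range<Int> = payload.startIndex..<payload.endIndex
print(size, range)

// Converting back to [UInt8] when a plain array is needed.
let asArray = [UInt8](payload)
print(asArray)
```

This only covers Data; ByteBuffer needs the swift-nio package, so it is left out of the sketch.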
