- What binary encodings are used by databases like SQLite 3, MySQL, PostgreSQL, etc., to persist data on disk?
- If Google Protobuf is good, why not use that to encode and store data on disk?
- Are Google Protobuf / Facebook Thrift only helpful for network communication, such as one microservice talking to another?
- Why do we not use these third party binary encoding libraries for storing data on disks?
[This blog] talks about how Protobuf beats JSON in response times. I am confused: are binary encoding libraries like Protobuf, Avro, and Thrift used only for network communication and not for storing data on disk?
CodePudding user response:
"Good" is contextual. Protobuf is "good" for cross-platform, general-purpose data representation where the data needs to support things like round-tripping unexpected fields; these are not the same problems that databases face. Databases usually use custom data/memory layouts based on row layouts, and the fastest way of serializing that is to use the exact same layout on disk, perhaps via mmap. Protobuf is not a good candidate for that, since its encoding is variable-size; databases usually want (roughly) constant-size rows so that row positions within pages are deterministic. Add to that transactionality, etc.
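To make the fixed-size versus variable-size point concrete, here is a minimal Python sketch. The row layout (`id`, `age`, padded `name`) and the helper names are made up for illustration and do not correspond to any real database's on-disk format. With fixed-width rows, row `i` starts at `i * ROW.size`, so it can be read without scanning the rows before it. The base-128 varint shown next to it is the scheme protobuf uses for integers, where the encoded width depends on the value.

```python
import struct

# Hypothetical fixed-width row: 8-byte id, 4-byte age, 16-byte zero-padded name.
ROW = struct.Struct("<qi16s")

def write_rows(rows):
    """Pack rows back to back; every row occupies exactly ROW.size bytes."""
    return b"".join(
        ROW.pack(rid, age, name.encode("utf-8")) for rid, age, name in rows
    )

def read_row(buf, index):
    """Jump straight to row `index` by arithmetic, without scanning earlier rows."""
    rid, age, name = ROW.unpack_from(buf, index * ROW.size)
    return rid, age, name.rstrip(b"\x00").decode("utf-8")

def encode_varint(n):
    """Base-128 varint for a non-negative integer (protobuf-style)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

page = write_rows([(1, 30, "ada"), (2, 41, "grace"), (3, 25, "linus")])
print(ROW.size)               # every row has the same width
print(read_row(page, 2))      # direct access to the third row
print(len(encode_varint(1)), len(encode_varint(300)))  # widths differ by value
```

With the varint encoding, finding the n-th record means decoding everything before it (or keeping a separate index), which is the extra work a page-oriented row store tries to avoid.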
Also: protobuf can be, and is, used for storing data on disk; just not usually as part of an RDBMS.
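As a rough illustration of what storing serialized messages on disk can look like outside an RDBMS, here is a sketch of a length-prefixed record file in Python. The payloads are treated as opaque bytes; in practice each would be the output of a serializer, for example a protobuf message's `SerializeToString()`. The 4-byte length prefix and the helper names are assumptions of this sketch, not a protobuf convention.

```python
import io
import struct

# 4-byte little-endian length prefix (a choice made for this sketch).
LEN = struct.Struct("<I")

def append_record(f, payload):
    """Write one record: its length, then the serialized bytes."""
    f.write(LEN.pack(len(payload)))
    f.write(payload)

def read_records(f):
    """Yield payloads in order until end of file."""
    while True:
        header = f.read(LEN.size)
        if not header:
            return  # clean end of file
        if len(header) < LEN.size:
            raise ValueError("truncated length prefix")
        (size,) = LEN.unpack(header)
        payload = f.read(size)
        if len(payload) < size:
            raise ValueError("truncated record")
        yield payload

# An in-memory buffer stands in for a file opened in binary mode.
buf = io.BytesIO()
for payload in (b"first", b"", b"a longer third record"):
    append_record(buf, payload)

buf.seek(0)
records = list(read_records(buf))
print(records)
```

A file like this is fine for append-and-scan workloads such as logs or exports, but it offers none of the in-place updates, indexing, or transactional guarantees that a database's page format is designed around.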