Do databases use their own binary encodings to store data on disk? Why not use Google Protobuf?

Time:05-13

  1. What binary encodings do databases like SQLite 3, MySQL, PostgreSQL, etc. use to persist data on disk?
  2. If Google Protobuf is good, why not use it to encode and store data on disk?
  3. Are Google Protobuf / Facebook Thrift only helpful for network communication, such as one microservice talking to another?
  4. Why do we not use these third-party binary encoding libraries for storing data on disk?

[This blog] talks about how Protobuf beats JSON in response times. I am unsure whether binary encoding libraries such as Protobuf, Avro, and Thrift are used only for network communication and not for storing data on disk.

CodePudding user response:

"Good" is contextual. Protobuf is "good" for cross-platform, general-purpose data representation where the data needs to support things like round-tripping of unexpected fields. Those are not the problems that databases face. Databases usually use custom data/memory layouts based on row layouts, and the fastest way of serializing such a layout is to use the exact same layout on disk, perhaps via mmap. Protobuf is not a good candidate for that, since its encoding is variable-size; databases usually want (roughly) constant-size records so that row positions within a page are deterministic. Add to that transactional requirements, and so on.

Also: protobuf can be, and is, used for storing data on disk; just not usually as part of an RDBMS.
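
To make the fixed-size versus variable-size point concrete, here is a rough sketch in Python. The row layout (`int64` id, `int32` age, 16-byte name), the 4096-byte page size, and all function names are made up for illustration; no real database uses exactly this format. The `encode_varint` function follows the base-128 varint scheme that protobuf uses for unsigned integers.

```python
import struct

# Hypothetical fixed-width row: int64 id, int32 age, 16-byte name (null-padded).
# "<" means little-endian with no alignment padding, so the size is 8 + 4 + 16.
ROW = struct.Struct("<qi16s")
PAGE_SIZE = 4096  # assumed page size, chosen only for this example
ROWS_PER_PAGE = PAGE_SIZE // ROW.size


def row_offset(slot: int) -> int:
    # With fixed-size rows, a row's position is simple arithmetic.
    return slot * ROW.size


def write_row(page: bytearray, slot: int, id_: int, age: int, name: str) -> None:
    # struct pads (or truncates) the name to exactly 16 bytes.
    ROW.pack_into(page, row_offset(slot), id_, age, name.encode("utf-8"))


def read_row(page: bytearray, slot: int):
    id_, age, raw = ROW.unpack_from(page, row_offset(slot))
    return id_, age, raw.rstrip(b"\x00").decode("utf-8")


def encode_varint(n: int) -> bytes:
    # Base-128 varint: 7 payload bits per byte, high bit set on all but the last.
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


page = bytearray(PAGE_SIZE)
write_row(page, 3, 42, 30, "alice")
print(read_row(page, 3))       # row 3 is found by offset alone
print(len(encode_varint(1)))   # small values take few bytes...
print(len(encode_varint(2**40)))  # ...larger values take more
```

With the fixed layout, row 3 always starts at `3 * ROW.size`, so a reader can jump straight to it. With varint-style encoding, the size of each record depends on its values, so finding the Nth record means either scanning the preceding ones or maintaining a separate offset index.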
