Which distributed SQL databases put data from different tables with the same tenant in the same node-CodePudding

I’m designing a SQL data schema with many tables that have a compound primary key of (customer_id, id). The application will frequently need to run JOIN queries to assemble data. However, it should never run cross customer joins. (Strictly disallowing cross customer joins could be a useful security feature, but business requirements might change.)

I’m looking at distributed SQL databases. Ideally I’d like to know that data across tables for the same customer_id lives on the same shard so joins don’t need to be executed cross-shard.

I’ve read through the documentation of Vitess, YugabyteDB, and CockroachDB. Of the three I only feel confident that Vitess will store data from the same customer on the same node.

Am I missing a feature of YugabyteDB/CockroachDB?

CodePudding user response：

CockroachDB does not support collocated tables as requested here. This was previously supported in the form of interleaved tables but was deprecated recently as the implementation was hard to maintain and did not yield significant performance wins as expected. The performance aspects are discussed at length on this GH thread https://github.com/cockroachdb/cockroach/issues/52009 if you are interested in learning more.

CodePudding user response：

there are two aspects in that: security and performance. For security (tenant isolation) you don't need to dedicate nodes if you have Row Level Security. Here is an example on YugabyteDB: https://dev.to/yugabyte/hyper-scale-multi-tenant-for-saas-an-example-with-pgbench-33a3 For performance (avoid cross-node transactions) you can use declarative partitioning on top of it. The partitions, like created in the blog post, can be tagged to specific nodes through cloud/region/zone topology