Home > other >  Acheiving horizontal scaling through federation?
Acheiving horizontal scaling through federation?

Time:08-20

Does federation lead to horizontal scaling? We know that relational databases usually are scaled vertically as horizontal scaling is not feasible and can disrupt the entire databases. One of the main reasons why it is not possible as we tend to have join's in the queries and hence all the tables need to be present in the same server. However I came across the topic federation under relational database, it states that federation is the process of splitting a database by functions. For example, instead of a single, monolithic database, you could have three databases: forums, users, and products, resulting in less read and write traffic to each database and therefore less replication lag.

So my question is "Is federation a means of achieving horizontal scaling in RDMS"?

CodePudding user response:

Federation as you describe would split data over more servers, but probably not evenly. That is, in your example of forums, users, and products, I expect there are more users than products, and more forum messages than users. If federation is your only means of horizontal scaling, then it's only a matter of time (very short time) before one of those exceeds the capacity of its server.

Horizontal scaling by sharding is more common, because it gives you the opportunity to split data in a manner that is closer to equal portions per server.

You correctly point out that joins make sharding pretty hard.

You can still join on a single database server if the rows that are related reside on the same server. For example, if you had a table orders and a table lineitems that has a one-to-many relationship with orders, you could colocate the lineitems on the same server with its parent order record.

In a relational database, the relationships will get more complex very quickly. In your users and forums example, it's a many-to-many relationship. A given forum has messages from many users, and a given user may post messages on many forums. So how do you ensure the related rows are colocated? You can't.

So some types of relationships (notably many-to-many relationships) can't be sharded in a way that allows joins on the same server.

You can then choose one of these difficult options:

  • Bear with the fact that some joins must be resolved by fan-out queries that fetch rows from multiple shards, and resolve the join in the application.

  • Prohibit many-to-many relationships in your data model.

  • Related