The commonly recommended approach is for each microservice's database to be private to that service, following a shared-nothing design.
I recently came across a cloud-based system that uses PostgreSQL, MongoDB, and Elasticsearch. It follows the database-per-service pattern to the extent that each service runs its own storage engine cluster, which hosts only one database.
For instance, service A and service B both use Elasticsearch (ES) with 3 indices each, and each service has its own ES cluster hosting those 3 indices. The same goes for the services that use MongoDB or PostgreSQL: 1-2 collections are hosted on a dedicated MongoDB cluster, or 7-8 tables are managed by a dedicated PostgreSQL cluster.
I'm confused by the way they have followed this design. To my understanding, a separate database does not mean a separate database engine instance. Any modern database engine can host multiple databases very easily. A service owns its database, but not the DB engine instance. Running one engine per service not only causes a maintenance nightmare but also severely under-utilizes each database engine. Is my understanding correct?
CodePudding user response:
The value of a database per service is that it lets a service scale independently of the other services. Having the services share infrastructure like this compromises that, but in a pretty trivial way.
Think about it this way: if one of these services had 100x the load, how hard would it be to provision its own instance and scale it separately from the rest of the services? As long as the collections/databases aren't talking to each other, this is pretty straightforward.
There WILL be a temptation to get these databases talking to each other on the backend. Having multiple databases/collections in a single instance makes reporting in particular much easier. Resisting the urge to do the simple thing from a reporting standpoint will go a long way toward keeping the separation in place.
TL;DR: sharing a database engine is not a problem, as long as you have the discipline not to couple the databases together.
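To make the "provision its own instance" step concrete, here is a minimal sketch of keeping each service's database location in configuration, so that moving one service off the shared engine is a change to one entry. The service names, host names, and the `DatabaseTarget` helper are all hypothetical, and the DSN is a PostgreSQL-style URI used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DatabaseTarget:
    """Where one service's private database lives (all values hypothetical)."""
    host: str
    port: int
    database: str

    def dsn(self, user: str) -> str:
        # PostgreSQL-style connection URI; credentials omitted for brevity.
        return f"postgresql://{user}@{self.host}:{self.port}/{self.database}"


# Both services start out on one shared engine, each with its own database.
targets = {
    "orders": DatabaseTarget("shared-pg.internal", 5432, "orders_db"),
    "billing": DatabaseTarget("shared-pg.internal", 5432, "billing_db"),
}


def move_to_dedicated_instance(targets, service, new_host):
    """Return a new mapping in which `service` points at its own engine."""
    old = targets[service]
    updated = dict(targets)  # leave the original mapping untouched
    updated[service] = DatabaseTarget(new_host, old.port, old.database)
    return updated


# The "orders" service gets 100x the load, so it moves to its own instance.
scaled = move_to_dedicated_instance(targets, "orders", "orders-pg.internal")
print(scaled["orders"].dsn("orders_svc"))
print(scaled["billing"].dsn("billing_svc"))
```

This only works this cleanly if nothing in `billing_db` reaches into `orders_db` directly. The data itself would still need to be migrated, which the sketch does not cover.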
CodePudding user response:
The first question is what is a real service that should have a separate logical schema, and what are in fact multiple aspects of the same service that should share a schema to begin with.
Once you have decided that something (say, a "microservice") needs its own separate logical schema, the second set of things to think about is the operational concerns: cost, efficiency, management complexity, and the actual scale needed. These concerns, not worries about what your developers might do, should guide whether you set up separate or shared engines.
You can use other mechanisms to handle the problem of developers connecting things. Even with separate engines, developers can still connect to the separate databases and pull data directly from them rather than going through APIs. So you should either trust your developers, do code reviews, or provide different connections and roles that prevent a connection to one schema from seeing another. In any event, this is a governance concern and shouldn't be mixed with an operational one.
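As a rough illustration of the "different connections and roles" option on a shared PostgreSQL engine, the sketch below generates per-service statements that give each service its own login role and database, and remove the default right of every other role to connect to it. The `<service>_svc` and `<service>_db` naming scheme and the `isolation_statements` helper are assumptions made for this example. The script only builds SQL strings and does not execute anything.

```python
from typing import List


def isolation_statements(service: str) -> List[str]:
    """Build PostgreSQL statements that isolate one service's database.

    The naming scheme (<service>_svc role, <service>_db database) is a
    hypothetical convention chosen for this sketch.
    """
    # Only accept plain identifiers, since the name is spliced into SQL.
    if not service.isidentifier():
        raise ValueError(f"unsafe service name: {service!r}")

    role = f"{service}_svc"
    database = f"{service}_db"
    return [
        # A dedicated login role for the service.
        f"CREATE ROLE {role} LOGIN;",
        # The service's private database, owned by that role.
        f"CREATE DATABASE {database} OWNER {role};",
        # By default every role may connect; take that away so that
        # other services' roles cannot open a session on this database.
        f"REVOKE CONNECT ON DATABASE {database} FROM PUBLIC;",
    ]


for name in ("orders", "billing"):
    for statement in isolation_statements(name):
        print(statement)
```

With something like this in place, the billing service's credentials cannot connect to `orders_db` even though both databases live on the same engine, so the isolation is enforced by the database rather than by convention alone. Superusers can still connect everywhere.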