I was reading a blog post on "What about the code? Why does the code need to change when it has to run on multiple machines?" and came across one line that I don't understand. Can anyone please help me understand it with a simple example?
"There should be no static instances in the class. Static instances hold application data and when a particular server goes down, all the static data/state is lost. The app is left in an inconsistent state."
CodePudding user response:
Assumption: a static instance is an instance that exists at most once per process or context. In Java, for example, there is at most one copy of a class's static fields per JVM (strictly, per class loader), along with all the data (state) those fields hold.
So the memory model for static state in a single node/JVM/process is very simple. Since there is a single copy of the data, it is quite straightforward to reason about. For example, one writer may update the data and every subsequent reader will see the updated value. This gets a bit more complicated in multithreaded programs, but it is still straightforward compared to distributed systems.
In a distributed system, every node has its own copy of the static state. So if a system contains several nodes, there are several copies of the data.
Having several copies is a problem. Every node may hold some unique data, and the data may differ from node to node. It is very hard to reason about such a system: how are the copies synced? Do you favour availability or consistency?
For example, take a simple counter. In a single-node system, a static instance may keep the score. If one writer increases the counter, the next reader will see the increased value (assuming the multithreaded part is implemented correctly, which is not that complicated).
The same counter in a distributed system is much more complicated. A writer may write to one node, but a reader may read from another and see a stale value.
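The counter example can be sketched in Java roughly as follows. This is only a simulation under stated assumptions: the Node class is hypothetical, and its instance field stands in for a static field, since each real server process would hold its own copy of the static state.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical stand-in for one server process. In a real deployment the
// counter would be a static field and each JVM would hold its own copy;
// one instance field per Node simulates that inside a single JVM.
class Node {
    private final AtomicLong counter = new AtomicLong();

    long increment() { return counter.incrementAndGet(); }

    long read() { return counter.get(); }
}

class SplitCounterDemo {
    public static void main(String[] args) {
        Node nodeA = new Node();
        Node nodeB = new Node();

        // The write happens to be routed to node A.
        nodeA.increment();

        // A later read may be routed to either node.
        System.out.println("read from A: " + nodeA.read()); // prints "read from A: 1"
        System.out.println("read from B: " + nodeB.read()); // prints "read from B: 0"
    }
}
```

Both nodes run the same code, yet they disagree about the score, because nothing connects the two copies of the counter.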
Basically, keeping state on the nodes is a hard problem to solve. This is the primary reason to use a distributed storage layer, e.g. HBase, Cassandra, or AWS DynamoDB. These storage systems have predictable behaviour, which helps to reason about the correctness of programs.
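A minimal Java sketch of that idea, assuming the nodes keep no state of their own and send every read and write to one shared store. SharedStore and StatelessNode are hypothetical names, and the in-process map only stands in for a real storage system; it does not use any real database client API.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Hypothetical stand-in for an external storage layer. A real system would
// call a database client here instead of an in-process map.
class SharedStore {
    private final ConcurrentMap<String, Long> data = new ConcurrentHashMap<>();

    // merge() applies the update atomically for the given key.
    long increment(String key) { return data.merge(key, 1L, Long::sum); }

    long read(String key) { return data.getOrDefault(key, 0L); }
}

// A node that holds no counter of its own; every request goes to the store.
class StatelessNode {
    private final SharedStore store;

    StatelessNode(SharedStore store) { this.store = store; }

    long increment() { return store.increment("score"); }

    long read() { return store.read("score"); }
}

class SharedStoreDemo {
    public static void main(String[] args) {
        SharedStore store = new SharedStore();
        StatelessNode nodeA = new StatelessNode(store);
        StatelessNode nodeB = new StatelessNode(store);

        nodeA.increment();                                  // write via node A
        System.out.println("read from B: " + nodeB.read()); // prints "read from B: 1"
    }
}
```

With the state moved out of the nodes, it no longer matters which node serves a request, and losing a node does not lose the score.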
CodePudding user response:
For example, suppose there are just two servers that accept payments from clients. Then somebody decides to create a static class so the data is easy to share between threads:
    // Lives only in the memory of the server process that created it.
    public static class Payment
    {
        public static decimal Amount;
        public static bool IsMoneyReceived;
        public static string UserName;
    }
Then a client, let's call him John, decides to buy something in the shop. John sends the money, and the static class now holds the data about this purchase. A service is about to write the data from the Payment class to the database, but the power goes out before it can. The load balancer sees that the server is not responding and redirects John's requests to the other server, which knows nothing about the data in the Payment class. The money was sent, but the only record of it lived in the memory of the server that went down, so it is lost.
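The scenario can be simulated with a short sketch, written in Java here rather than C#. Everything in it is hypothetical: PaymentServer, PaymentDatabase, and the crash() method are illustration-only names, and the pending fields play the role of the static Payment fields on one server.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the database that both servers can reach.
class PaymentDatabase {
    final List<String> rows = new ArrayList<>();
}

// Hypothetical payment server. The pending fields live only in this
// server's memory, like the static Payment fields in the example.
class PaymentServer {
    private final PaymentDatabase db;
    private String pendingUser;
    private BigDecimal pendingAmount;

    PaymentServer(PaymentDatabase db) { this.db = db; }

    void receive(String user, BigDecimal amount) {
        pendingUser = user;
        pendingAmount = amount;
    }

    // Writes the in-memory payment to the database, if there is one.
    void flush() {
        if (pendingUser != null) {
            db.rows.add(pendingUser + ":" + pendingAmount);
            pendingUser = null;
            pendingAmount = null;
        }
    }

    // Simulates a power loss: everything held in memory is gone.
    void crash() {
        pendingUser = null;
        pendingAmount = null;
    }

    // True if this server can find the payment in memory or in the database.
    boolean knowsAbout(String user) {
        return user.equals(pendingUser)
            || db.rows.stream().anyMatch(row -> row.startsWith(user + ":"));
    }
}

class LostPaymentDemo {
    public static void main(String[] args) {
        PaymentDatabase db = new PaymentDatabase();
        PaymentServer server1 = new PaymentServer(db);
        PaymentServer server2 = new PaymentServer(db);

        server1.receive("John", new BigDecimal("100.00"));
        server1.crash(); // power goes out before flush() runs

        System.out.println(server2.knowsAbout("John")); // prints "false"
    }
}
```

If server1 had written the payment to the database as soon as it arrived, instead of parking it in memory, server2 would have found it after the failover.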