Can there be a communication failure between availability zones in an AWS region?

Time:12-21

Is a communication failure between availability zones (AZs) in an AWS region possible, assuming both AZs are up and only the network communication between them fails?

If there is a communication failure, is it considered a region failure?

CodePudding user response:

Is a communication failure between 2 AZs possible?
Yes.

It depends on how you define 'communication', but yes, it is possible. Nothing within AWS is 100% guaranteed.


In general, are availability zone failures possible?
Yes.

No system is without fault. While AWS provides some of the best availability guarantees in the cloud market, there is still a small possibility of failures or disruptions in one or more availability zones (AZs), and not only communication failures (whether you define those as loss of internet connectivity, a specific service being unavailable in a region, loss of data access in an AZ, or anything else).


What's a region failure?
A region failure is loosely defined as all of the AZs within a region becoming unavailable or otherwise unable to provide the expected level of service.


Are communication failures between AZs in a region classed as a region failure?
No and yes.

A region failure means every AZ in the region becoming unavailable: at least three AZs and, in some cases, as many as six. Even if two AZs fully fail and go out of service, the region can still respond to requests (though at a degraded level).

If the 'communication failure' means none of the AZs can respond to requests, it would be classed as a region failure.
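
As a rough illustration of that distinction, here is a minimal Python sketch. The function name, the status labels and the AZ names are made up for the example and are not part of any AWS API; the only rule it encodes is the loose definition used in this answer (no AZ responding means a region failure, some responding means degradation).

```python
from enum import Enum


class RegionStatus(Enum):
    HEALTHY = "healthy"
    DEGRADED = "degraded"
    REGION_FAILURE = "region failure"


def classify_region(az_responding: dict[str, bool]) -> RegionStatus:
    """Classify a region from per-AZ health (hypothetical helper, not an AWS API).

    az_responding maps an AZ name to True if that AZ still answers requests.
    """
    if not az_responding:
        raise ValueError("at least one AZ is required")
    responding = sum(az_responding.values())  # True counts as 1
    if responding == 0:
        # No AZ can respond: classed as a region failure.
        return RegionStatus.REGION_FAILURE
    if responding < len(az_responding):
        # At least one AZ still responds: degraded service, not a region failure.
        return RegionStatus.DEGRADED
    return RegionStatus.HEALTHY


# Illustrative AZ names; two of three AZs are out of service.
print(classify_region({"az-a": True, "az-b": False, "az-c": False}).value)
```

Note that a network partition where both AZs are up and still serving their own requests would count as neither case here; how to class it depends on how you define 'communication', as said above.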


Is an AWS region failure likely?
Not really.

AWS region outages are quite rare. Still, technical failures such as power or internet outages will inevitably happen, and natural disasters such as earthquakes, floods or tornadoes are never fully within Amazon's control.

As mentioned, a region failure requires at least three AZs, and in some cases as many as six, to be unable to provide service at the same time. If at least one AZ is still responding to requests, it's not classed as a region failure but as a service degradation.

Since each AZ itself consists of one or more discrete data centers, with some AZs containing up to five, the risk of a full region failure is very low (but not impossible).
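
To get a feel for why the risk shrinks as the number of AZs grows, here is a back-of-the-envelope Python sketch. It assumes AZ failures are independent, which is a simplification: a regional disaster like the ones mentioned above can hit several AZs at once. The 0.1% per-AZ figure is a hypothetical number chosen for illustration, not something AWS publishes.

```python
def all_azs_down_probability(p_az_down: float, az_count: int) -> float:
    """Probability that every AZ is down at once, ASSUMING independent failures.

    Hypothetical helper for illustration; real AZ failures can be correlated,
    so treat the result as a lower bound on intuition, not a prediction.
    """
    if not 0.0 <= p_az_down <= 1.0:
        raise ValueError("p_az_down must be between 0 and 1")
    if az_count < 1:
        raise ValueError("az_count must be at least 1")
    return p_az_down ** az_count


# Hypothetical per-AZ unavailability of 0.1%; not an AWS figure.
for az_count in (3, 6):
    print(az_count, all_azs_down_probability(0.001, az_count))
```

Under these assumptions the probability drops by orders of magnitude with each additional AZ, which matches the "very low but not impossible" conclusion.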
