Can communication between Availability Zones fail while each Availability Zone can still reach the internet?
If AZ1 cannot reach AZ2 but both of them are up, perhaps because of some issue in AZ1, would AZ1 still be able to communicate with the internet?
CodePudding user response:
Theoretically, it can happen. Each AZ is an isolated location with its own infrastructure, and therefore has separate and redundant connections to the internet. It's possible for the network links between AZs to fail, which would affect inter-AZ communication, but the resources in each AZ would still have internet connectivity, provided the NAT gateways, NACLs, etc. in your VPC are configured properly.
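If you want to tell these two failure modes apart from an instance in AZ1, a minimal sketch could look like the following. It assumes a plain TCP connect is a good enough reachability signal; the function names, the peer address, and the ports are hypothetical and not part of any AWS API.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS resolution failures.
        return False


def classify(peer_ok: bool, internet_ok: bool) -> str:
    """Map the two probe results to a coarse description of the failure."""
    if peer_ok and internet_ok:
        return "healthy"
    if internet_ok:
        return "inter-AZ link failure, internet still reachable"
    if peer_ok:
        return "internet path failure, inter-AZ link still up"
    return "isolated"


def probe(peer_host: str, peer_port: int,
          internet_host: str, internet_port: int) -> str:
    """Probe a peer in another AZ and an internet endpoint, then classify."""
    return classify(
        can_connect(peer_host, peer_port),
        can_connect(internet_host, internet_port),
    )


# Hypothetical usage from an instance in AZ1 (addresses are placeholders):
# print(probe("10.0.2.10", 22, "example.com", 443))
```

A single probe only describes the path from that one instance, so in practice you would run it from several hosts before concluding anything about the AZ as a whole.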
CodePudding user response:
Not likely. Availability Zones are more than single physical datacentres. One Availability Zone is made up of multiple physical locations, which mesh with each other. The Availability Zones are also connected to each other through multiple connections. To reach the internet, they transit through the region's transit centers. AWS doesn't provide any information on the topic, but I'd reasonably assume that AZs are capable of communicating with each other over the internet if the region were completely fractured and isolated across transit centers. All of this assumes everything works as expected, which is never guaranteed.
The AWS Fault Isolation Boundaries whitepaper goes into really interesting details on this topic, with the AWS infrastructure section having a great visualization that explains this in more depth.
The deeper question is what scenario you're considering for the architecture, and how much work it would take to guard against it. In terms of availability, most things can be worked around with Multi-AZ functionality in services and Auto Scaling groups for EC2 instances, subject to any constraints in your software. If the concern is a split-brain cluster, requiring a quorum is still a solid way to prevent broken consensus, among other options.
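To illustrate the quorum idea, here is a minimal sketch of a strict-majority check. It assumes one vote per node and a fixed, known cluster size; the function and node names are made up for the example, and real consensus systems do considerably more than this.

```python
def has_quorum(reachable_nodes: int, cluster_size: int) -> bool:
    """Return True if this side of a partition holds a strict majority.

    reachable_nodes counts the local node plus every peer it can reach.
    """
    if cluster_size <= 0:
        raise ValueError("cluster_size must be positive")
    if not 0 <= reachable_nodes <= cluster_size:
        raise ValueError("reachable_nodes must be between 0 and cluster_size")
    return reachable_nodes > cluster_size // 2


def writable_partitions(partitions: list[set[str]],
                        cluster: set[str]) -> list[set[str]]:
    """Return the partitions that are allowed to keep accepting writes."""
    return [p for p in partitions if has_quorum(len(p), len(cluster))]


# Three nodes, one per AZ. The inter-AZ link to AZ1 fails, so the cluster
# splits into {az1} and {az2, az3}.
cluster = {"az1", "az2", "az3"}
split = [{"az1"}, {"az2", "az3"}]
survivors = writable_partitions(split, cluster)
```

Because a strict majority is required, at most one side of any split can keep writing. That is also why an even-sized cluster split exactly in half stops entirely, and why odd node counts spread over three AZs are the usual recommendation.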