I'm trying to better understand SSH tunnels as they are used for relaying communications between a developer's local machine and a database running in the cloud (AWS, GCP, etc.).
I've been seeing lots of examples where you do something like:
ssh -NL 1433:some-db.us-east-1.rds.amazonaws.com:1433 \
[email protected] -i ~/myapp.pem
And then locally, connect to my some-db, specifying a host of 127.0.0.1.
I assume this is to connect to cloud DBs (RDS, etc.) that are living inside a VPC, but I'd like to really understand what's going on here. For instance:
- The ec2-some-ip.compute-1.amazonaws.com machine appears to be some type of bastion or jumpbox, but I'm unsure of what role it is playing in this equation
- I can't figure out what's going on with the ports (1433 is the DBs port but SSH is over 22)
- I assume that doing this creates a port forwarding of some type that forwards all local traffic going to/from port 1433 to port 22, which is probably how we're able to connect to the cloud DB at 127.0.0.1, but... what does the roundtrip traffic look like here? Local/1433 -> SSH/22 -> Cloud/1433? I'm not understanding that
If anyone could take the time to ELI5 and help me understand what's happening here and how it all works I'd be greatly appreciative!
CodePudding user response:
I will try to answer the questions you listed, and I highly recommend reading the SSH Tunneling Explained post suggested by @jarmod in the comments for a more thorough understanding.
The ec2-some-ip.compute-1.amazonaws.com machine appears to be some type of bastion or jumpbox, but I'm unsure of what role it is playing in this equation
Yes, the purpose of this machine is to forward the traffic to the cloud DB. This machine cannot be a random machine out there on the internet. This machine is specifically chosen and must meet the following criteria:
- It has SSH server installed.
- It is accessible by your local dev machine.
- It can connect to the cloud DB.
Criterion 1 makes sure that the bastion host (the ec2-some-ip.compute-1.amazonaws.com machine) can speak the SSH protocol with your local dev machine. Criterion 2 makes sure that your local dev machine can actually reach the bastion host over SSH, so that when you connect to 127.0.0.1:1433, the traffic can be forwarded to port 22 of the bastion host (ec2-some-ip.compute-1.amazonaws.com:22). Criterion 3 makes sure that the traffic can get from the bastion host to its final destination, the cloud DB.
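If you want to sanity-check criteria 2 and 3 before setting up the tunnel, a plain TCP connection test is usually enough. Below is a minimal sketch in Python; the helper name can_connect and the 3-second default timeout are my own choices for illustration, not part of any SSH tooling. It only tells you whether the port is reachable, not whether SSH authentication or the DB login would succeed.

```python
import socket


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS resolution failures.
        return False
```

Run from your local dev machine, can_connect("ec2-some-ip.compute-1.amazonaws.com", 22) checks criterion 2. Run on the bastion host itself, can_connect("some-db.us-east-1.rds.amazonaws.com", 1433) checks criterion 3. The second check is expected to fail from your local machine if the DB lives inside a VPC, which is exactly why the bastion is needed.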
I can't figure out what's going on with the ports (1433 is the DBs port but SSH is over 22)
In order to understand the traffic flow, you need to know 3 ports that are involved in this process.
- Port 1433 on your local dev machine
- Port 22 on the bastion host (ec2-some-ip.compute-1.amazonaws.com machine)
- Port 1433 on the cloud DB machine
I will use the notations LOCAL:1433, BASTION:22 and CLOUDDB:1433 to represent 1, 2 and 3 respectively.
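To see where each of these ports comes from in the command, it may help to pull the -L argument apart. The sketch below is only an illustration: parse_local_forward and LocalForward are hypothetical names, and it handles just the three-part form used in the question (ssh also accepts an optional bind address in front, which is ignored here).

```python
from typing import NamedTuple


class LocalForward(NamedTuple):
    local_port: int   # port ssh listens on, on your own machine (LOCAL)
    target_host: str  # host the bastion connects to (CLOUDDB)
    target_port: int  # port on that host


def parse_local_forward(spec: str) -> LocalForward:
    """Split 'local_port:target_host:target_port' into its three parts."""
    local_port, target_host, target_port = spec.split(":")
    return LocalForward(int(local_port), target_host, int(target_port))


fwd = parse_local_forward("1433:some-db.us-east-1.rds.amazonaws.com:1433")
print(f"LOCAL:{fwd.local_port} -> BASTION:22 -> {fwd.target_host}:{fwd.target_port}")
# prints: LOCAL:1433 -> BASTION:22 -> some-db.us-east-1.rds.amazonaws.com:1433
```

Note that BASTION:22 never appears inside the -L argument. It comes from the SSH destination given separately on the command line, with 22 being SSH's default port. Also note that the target host name is resolved and connected to from the bastion, not from your local machine.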
... what does the roundtrip traffic look like here? ...
Forward pass: Local DB client (arbitrary port) --connects to--> LOCAL:1433 --SSH encrypts and forwards the traffic to--> BASTION:22 --SSH decrypts and forwards the traffic to--> CLOUDDB:1433
Backward pass: CLOUDDB:1433 -> BASTION:22 --SSH encrypts and forwards the traffic to--> LOCAL:1433 --SSH decrypts and forwards the traffic to--> Local DB client (arbitrary port)
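Note: This is by no means an exhaustive list of ports used in the SSH forwarding process. There are intermediate ports that are not illustrated in the diagram above, for example the ephemeral source ports that the SSH client and the bastion's SSH server use for their outgoing connections.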
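If it helps to see the "listen locally, relay elsewhere" idea in code, here is a rough Python sketch of a plain TCP port forwarder. To be clear about the assumptions: this is not how ssh is implemented, it does no encryption, and it collapses the SSH client and the bastion's SSH server into a single process. The function names (pipe, handle, start_forwarder) are made up for this example. What it does show is the core mechanic: something listens on a local port, and every connection made to that port is copied byte for byte, in both directions, to a target host and port.

```python
import socket
import threading
from typing import Tuple


def pipe(src: socket.socket, dst: socket.socket) -> None:
    """Copy bytes from src to dst until src reaches end of stream."""
    try:
        while True:
            data = src.recv(4096)
            if not data:
                break
            dst.sendall(data)
    except OSError:
        pass
    finally:
        # Tell the other side that no more data is coming in this direction.
        try:
            dst.shutdown(socket.SHUT_WR)
        except OSError:
            pass


def handle(client: socket.socket, target: Tuple[str, int]) -> None:
    """Relay one client connection to target in both directions."""
    try:
        upstream = socket.create_connection(target)
    except OSError:
        client.close()
        return
    with client, upstream:
        # Backward pass (target -> client) runs in its own thread.
        back = threading.Thread(target=pipe, args=(upstream, client), daemon=True)
        back.start()
        # Forward pass (client -> target) runs in this thread.
        pipe(client, upstream)
        back.join()


def start_forwarder(listen_port: int, target: Tuple[str, int]) -> socket.socket:
    """Listen on 127.0.0.1:listen_port and relay each connection to target."""
    server = socket.create_server(("127.0.0.1", listen_port))

    def accept_loop() -> None:
        while True:
            try:
                client, _ = server.accept()
            except OSError:
                break  # the listening socket was closed
            threading.Thread(target=handle, args=(client, target), daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return server
```

In terms of the notation above, start_forwarder(1433, ("some-db.us-east-1.rds.amazonaws.com", 1433)) plays the role of LOCAL:1433, and the target tuple plays the role of CLOUDDB:1433. With a real SSH tunnel, the difference is that the bytes travel encrypted over the existing SSH connection to BASTION:22, and it is the bastion that opens the connection to the DB.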