Subnets, Public vs. Private
Within a VPC, you may define sub-regions of the network. These are subnets. Subnets also use CIDRs to define the range of IP addresses available to anything deployed within them. Using the example from above, a VPC with a CIDR of 10.0.0.0/16 would need subnets whose CIDRs fall inside that range, such as 10.0.0.0/24, 10.0.1.0/24, or 10.0.16.0/20.
These are just examples…there are many, many more. The mask at the end of a CIDR (e.g., /8, /16, /22, /24) determines the range of your subnet’s IP addresses and ultimately how many resources may fit into a subnet. CIDRs are really just bit math, which I’ll skip.
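If you’d rather not do the bit math by hand, Python’s standard `ipaddress` module can do it for you. This is only a small sketch; the /24 and /20 sizes are arbitrary choices for illustration, not a recommendation.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")

# A /16 leaves 16 bits for addresses, so the range holds 2**16 of them.
print(vpc.num_addresses)  # 65536

# Carve the VPC range into /24 subnets (256 addresses each).
subnets_24 = list(vpc.subnets(new_prefix=24))
print(len(subnets_24))               # 256
print(subnets_24[0], subnets_24[1])  # 10.0.0.0/24 10.0.1.0/24

# Check whether a candidate subnet CIDR fits inside the VPC's CIDR.
candidate = ipaddress.ip_network("10.0.16.0/20")
outside = ipaddress.ip_network("10.1.0.0/24")
print(candidate.subnet_of(vpc))  # True
print(outside.subnet_of(vpc))    # False
```

Note that AWS reserves a handful of addresses in every subnet, so the usable count is slightly lower than `num_addresses` suggests.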
There are two flavors of subnets, private and public. Nothing about a subnet’s IP range makes it public or private…it’s entirely up to you to configure these. So, what exactly are the differences? These definitions come straight from the AWS docs about VPCs and Subnets.
If a subnet’s traffic is routed to an internet gateway, the subnet is known as a public subnet.
If a subnet doesn’t have a route to the internet gateway, the subnet is known as a private subnet.
Let’s drill into that.
Public Subnet Routing
A public subnet routes public outbound traffic through an Internet Gateway, which is a system that AWS manages for you. Any traffic with a destination of 10.0.0.0/16 will be routed locally, using VPC-internal routing. For traffic with any other destination, routing will go through the Internet Gateway. This routing is transparent to you and allows your resource to contact the public internet.
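If we were to look at the route table for a public subnet, it would have two entries: destination 10.0.0.0/16 with a target of local, and destination 0.0.0.0/0 with the Internet Gateway as its target.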
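To make the routing decision concrete, here is a rough Python sketch of how a route table picks a target: the most specific matching route wins. The `pick_target` helper, the `igw-example` ID, and the external address 203.0.113.10 are all made up for illustration; this models the idea, it is not AWS code.

```python
import ipaddress

# Hypothetical route table for a public subnet.
# "igw-example" is a placeholder, not a real Internet Gateway ID.
PUBLIC_ROUTES = [
    ("10.0.0.0/16", "local"),
    ("0.0.0.0/0", "igw-example"),
]

def pick_target(routes, destination_ip):
    """Return the target of the most specific route matching destination_ip, or None."""
    ip = ipaddress.ip_address(destination_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in routes
        if ip in ipaddress.ip_network(cidr)
    ]
    if not matches:
        return None
    # The route with the longest prefix (most specific CIDR) wins.
    return max(matches, key=lambda match: match[0].prefixlen)[1]

print(pick_target(PUBLIC_ROUTES, "10.0.1.23"))     # local
print(pick_target(PUBLIC_ROUTES, "203.0.113.10"))  # igw-example
```

Both routes match 10.0.1.23, but the /16 is more specific than the /0, so that traffic stays local.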
One important thing to note here is that a resource in a public subnet can reach external resources only if it has a public IP address. This is noted right alongside the docs referenced above:
If you want your instance in a public subnet to communicate with the internet over IPv4, it must have a public IPv4 address or an Elastic IP address (IPv4).
So, if you launch an EC2 instance in a public subnet, and that instance doesn’t have a public IP address, it won’t be able to communicate with the public internet.
Private Subnet Routing
A private subnet doesn’t route traffic through an internet gateway, which means that it cannot connect to any external resources. It’s entirely possible for you to create a subnet which routes 10.0.0.0/16 traffic within your VPC without any problems, using the internal networking. However, if a system on this subnet wanted to connect to the outside world, there would be no route for it to get out.
The route table for a private subnet without internet access would contain only the local route: destination 10.0.0.0/16 with a target of local.
Just as a private subnet resource can’t get out, nothing from the outside world can get in. Worried about something hacking into your Postgres RDS instance or your Redis cluster? A best practice is to put systems like this in your private subnets. Now, you needn’t worry (as much) about someone hacking directly into them from the outside world. From a networking perspective, there is no route from the public internet to systems in a private subnet.
So, given the case that we have an RDS instance on private subnets and a Lambda function which needs to communicate with it, what do we do? My previous post discusses how to set that up. But what do we do when our Lambda function needs to communicate with RDS and the public internet? That’s the whole point of this post. It took a while to get here, but the background story is necessary!
The answer is to change the route table to route outbound traffic through a NAT Gateway.
A NAT gateway is a resource managed by AWS which does Network Address Translation (NAT) for us, and also provides a public Elastic IP address. Once we have a NAT gateway for a given AZ, we need to set up our private subnets to use it. This change consists of adding an entry to our private subnet’s route table which routes any non-internal traffic (destination 0.0.0.0/0) through the NAT.
With this small change, anything in our private subnets can:
- Communicate with resources in our VPC’s private or public subnets over the local network using private IPs
- Communicate with the outside world via the NAT gateway
For example, something on this private subnet needs to talk to the host with a private IP of 10.0.1.23. Looking at the route table, we can see that it can go directly to that host since it has a direct connection via local networking. Next, the same system needs to go out and make an API call to github.com. Since github.com is not on the 10.0.0.0/16 network, the packets are routed to the NAT gateway. On our behalf, the NAT will route our packets to github.com, and when GitHub responds, it will route the response packets back to us. Many details make this work, but they aren’t important for this discussion. Just know what the NAT gateway is doing for you and you’ll be good.
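The before-and-after of that route table change can be sketched in a few lines of Python. Everything named here is an assumption for illustration: `pick_target` is a toy lookup, `nat-example` is a placeholder ID, and 203.0.113.10 simply stands in for some external host such as github.com.

```python
import ipaddress

def pick_target(routes, destination_ip):
    """Toy route lookup: the most specific matching CIDR wins, None if nothing matches."""
    ip = ipaddress.ip_address(destination_ip)
    best = None
    for cidr, target in routes.items():
        network = ipaddress.ip_network(cidr)
        if ip in network and (best is None or network.prefixlen > best[0].prefixlen):
            best = (network, target)
    return best[1] if best else None

# Private subnet with only the local route: no way out of the VPC.
private_routes = {"10.0.0.0/16": "local"}
before = pick_target(private_routes, "203.0.113.10")
print(before)  # None

# Add the default route pointing at the (placeholder) NAT gateway.
private_routes["0.0.0.0/0"] = "nat-example"
internal = pick_target(private_routes, "10.0.1.23")
external = pick_target(private_routes, "203.0.113.10")
print(internal)  # local
print(external)  # nat-example
```

Internal traffic is unaffected by the new entry; only destinations outside 10.0.0.0/16 fall through to the NAT gateway.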
Another interesting result of this change is that our Lambda functions will have a fixed source IP address when connecting to external resources. That IP address will be the IP of our NAT gateway, which typically is an Elastic IP. This has the added benefit of giving our Lambda functions (mostly) static IPs if you ever face the situation where IPs need to be whitelisted.
Note: I’m working on a project now where we need to talk to the Salesforce API. Every IP we connect from needs to be whitelisted. NAT gateways are our solution to this when talking to Salesforce via Lambda functions.
VPC and Subnet Design
With any sort of system where you care about uptime, it’s crucial to set your network up across availability zones. Rather than deploying all of your systems into, say, us-west-2a, you would need to deploy across AZs, including at least one more of us-west-2b or us-west-2c. Why? Remember back when I told you that AZs are independent physical systems (buildings, I presume) that AWS manages for you, and that if one AZ goes down, the others should still function? Well, if you’ve deployed your entire infrastructure to us-west-2a and that AZ goes down, so too does your entire system.
A standard practice for real systems is to create public/private subnets across multiple AZs and then deploy resources across these AZs. My nifty little diagram attempts to illustrate this. This diagram shows a single VPC with a CIDR of 10.0.0.0/16 deployed in the Oregon region, which is us-west-2. This VPC contains three pairs of public/private subnets, one in each of the three availability zones.
In this scenario, a single RDS instance is deployed across all three of the private subnets. Honestly, I don’t know the details on how this is handled, but deploying RDS across three AZs ensures our database will stay up even when an AZ goes down.
Alongside RDS is a Lambda function, which is also deployed across three AZs. In the public subnets are three EC2 instances which we’ll assume are all doing the same thing. We deploy three of them to cover ourselves in the case that an AZ goes down.
In order to allow our Lambda functions outbound internet access, we need to create a NAT gateway in each AZ. That is important to remember…VPCs may span AZs, but NAT gateways do not. Each NAT gateway lives in a single AZ (in one of its public subnets), so the private subnets in an AZ should route through the NAT gateway in that same AZ.
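As a planning aid, the layout described above can be sketched as plain Python data. This is only one possible arrangement: the /20 subnet size, the order in which ranges are handed out, and the `nat-<az>` names are all assumptions of mine, not values from a real VPC.

```python
import ipaddress

vpc = ipaddress.ip_network("10.0.0.0/16")
azs = ["us-west-2a", "us-west-2b", "us-west-2c"]

# Assumption: /20 subnets, handed out in order as (public, private) per AZ.
blocks = vpc.subnets(new_prefix=20)

layout = {}
for az in azs:
    public = next(blocks)
    private = next(blocks)
    nat = f"nat-{az}"  # placeholder name for the NAT gateway in this AZ's public subnet
    layout[az] = {
        "public_subnet": str(public),
        "private_subnet": str(private),
        "nat_gateway": nat,
        # The private subnet's default route points at the NAT gateway in the same AZ.
        "private_routes": {str(vpc): "local", "0.0.0.0/0": nat},
    }

for az, info in layout.items():
    print(az, info["public_subnet"], info["private_subnet"], info["nat_gateway"])
# us-west-2a 10.0.0.0/20 10.0.16.0/20 nat-us-west-2a
# us-west-2b 10.0.32.0/20 10.0.48.0/20 nat-us-west-2b
# us-west-2c 10.0.64.0/20 10.0.80.0/20 nat-us-west-2c
```

Keeping each private subnet’s default route in its own AZ means that losing one AZ doesn’t take away internet access for the others.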
Phew…that’s a lot. I can promise you that if you intend to stick with AWS architecture, knowledge of VPCs is a must. What I covered above will get you a very long way, especially when dealing with serverless systems and/or a typical SaaS application.
Once armed with this knowledge, the next hurdle is learning exactly how to create and manage all of these pieces. Each piece on its own isn’t terribly complex, but when it’s time to create a VPC from scratch there are a lot of pieces to the puzzle, all of which must fit together properly in order for your system to work.
In my next post I’d like to cover Stacker, which is a Python application which helps manage CloudFormation scripts. Stacker has several built-in “blueprints” for common systems like VPCs. Creating a new VPC from scratch isn’t for the faint of heart if you’re new to it, but Stacker can do pretty much all of the hard work.