Disclaimer: this blog is based on my own view; it is not a company's or anyone else's view. It is also not intended to show which solution is better; instead, it is intended to show the different networking architectures of each cloud provider that a cloud or network architect needs to take into consideration, especially when designing a hybrid model with an enterprise-managed SD-WAN such as Cisco SD-WAN.
At the time of writing, this may be one of very few public blogs, if not the only one, with a comprehensive analysis and visualization of this topic.
The previous blog, “WAN routing in the cloud era”, focused on WAN routing to and from the cloud, and on how Cisco SD-WAN can help facilitate this routing.
Nevertheless, in order to interconnect your on-prem DC/branches using the described Cisco SD-WAN architecture, you ideally need a good understanding of networking in the cloud.
Therefore, this blog focuses on the routing design considerations of two of the largest cloud solution providers: Amazon Web Services (AWS) and Google Cloud Platform (GCP).
Although this post is not a deep dive into each of the networking components discussed here, it assumes that you have basic to intermediate knowledge of cloud networking concepts.
First of all, the following illustrations summarize the core networking components of both AWS and GCP:
AWS Network Architecture
GCP Network Architecture
From the above illustrations of the AWS and GCP network architectures, there are a few obvious architectural differences:
VPC: in AWS, a VPC spans multiple Availability Zones (AZs) but only within a region, while in GCP a VPC can span multiple zones across different regions.
Virtual Network/Subnet: a private network/subnet in AWS cannot be extended beyond an AZ, while in GCP a private network/subnet can span zones, but only within a region.
Load balancing: in AWS, an Elastic Load Balancer (ELB) is regional (or zonal); it can send traffic to instances across multiple AZs within the same region, or to a single AZ. An ELB can be either an Application or a Network load balancer. With GCP, a Cloud Load Balancer can be external and multi-regional (two types: application LB and network LB), or internal (network LB) and regional or zonal (within a region or zone).
With GCP's global LB, you can deploy HTTP(S), SSL proxy, and TCP proxy load balancers. This is helpful in a multi-region global architecture where you need a single anycast IPv4/IPv6 address for one load balancer that maps to application instances running across multiple regions. As illustrated in the figure below, if you are using IPv6 the DNS server holds a single AAAA record, so you don't need to load balance among multiple IPv6 addresses. Even the caching of AAAA records by clients' machines/browsers won't be an issue, as there is only a single address to cache. In this case, clients' requests to the IPv6 LB VIP address are automatically load balanced to the closest healthy instance with available capacity.
AWS addresses the same scenario by relying on global DNS (Route 53); a load balancer then comes into the picture once the traffic lands in a given VPC/region.
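To make the anycast behavior concrete, the selection logic can be sketched as a toy model in Python. This is a deliberate simplification with hypothetical region names, latencies, and capacities; the real GCP global LB uses its own internal proximity and serving-capacity signals:

```python
# Toy model of GCP global LB backend selection: one anycast VIP, and traffic
# is sent to the closest healthy backend that still has capacity.
# All region names, RTTs, and capacities below are illustrative only.

def pick_backend(backends, client_rtt_ms):
    """Return the region whose backend should serve the request."""
    candidates = [
        b for b in backends
        if b["healthy"] and b["in_flight"] < b["max_capacity"]
    ]
    # Prefer the region with the lowest RTT from the client.
    return min(candidates, key=lambda b: client_rtt_ms[b["region"]])["region"]

backends = [
    {"region": "us-east1",     "healthy": True,  "in_flight": 80, "max_capacity": 100},
    {"region": "europe-west1", "healthy": True,  "in_flight": 10, "max_capacity": 100},
    {"region": "asia-east1",   "healthy": False, "in_flight": 0,  "max_capacity": 100},
]

# A client in Frankfurt: europe-west1 is closest and healthy.
rtt_from_frankfurt = {"us-east1": 95, "europe-west1": 12, "asia-east1": 210}
print(pick_backend(backends, rtt_from_frankfurt))  # europe-west1
```

If the closest region becomes unhealthy or runs out of capacity, the same single VIP transparently steers new requests to the next-closest healthy region, which is what removes the need for per-region DNS records.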
From a security/firewalling point of view:
AWS uses two key network security elements:
Security groups: a hypervisor-level virtual firewall that performs stateful packet filtering per VM (AWS EC2 instance). Therefore, each instance in a subnet in a VPC can be assigned a different set of security groups. If an instance is not assigned to any group when created, it is automatically assigned to the default security group of that VPC.
The default security group allows inbound traffic from instances assigned to the same security group, as well as all outbound IPv4 (and IPv6, if used) traffic.
According to AWS, a “Network Access Control list (ACL) is an optional layer of security for your VPC that acts as a firewall for controlling traffic in and out of one or more subnets. You might set up network ACLs with rules similar to your security groups in order to add an additional layer of security to your VPC.” An ACL is stateless, which means you need to allow or deny the traffic in both directions; it also applies at the subnet level.
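The practical difference between a stateful security group and a stateless network ACL can be illustrated with a small Python sketch. The rule sets and flow below are hypothetical, not actual AWS API objects:

```python
# Stateful vs. stateless filtering, simplified.
# A security group tracks connections: if the outbound request was allowed,
# the return traffic is allowed automatically. A network ACL evaluates each
# direction independently, so return traffic needs its own explicit rule.

def security_group_allows(outbound_rules, flow, is_return_traffic, conn_table):
    if is_return_traffic:
        # Stateful: return traffic of a tracked connection is allowed.
        return flow["conn_id"] in conn_table
    if flow["dst_port"] in outbound_rules:
        conn_table.add(flow["conn_id"])  # remember the connection
        return True
    return False

def network_acl_allows(inbound_rules, outbound_rules, flow, is_return_traffic):
    # Stateless: the reply arrives destined to the client's ephemeral source
    # port, so that port must be explicitly allowed inbound.
    if is_return_traffic:
        return flow["src_port"] in inbound_rules
    return flow["dst_port"] in outbound_rules

conns = set()
flow = {"conn_id": 1, "src_port": 54321, "dst_port": 443}

# Security group: allowing outbound 443 is enough for the reply to get back in.
print(security_group_allows({443}, flow, False, conns))  # True (request)
print(security_group_allows({443}, flow, True, conns))   # True (tracked reply)

# Network ACL: outbound 443 is allowed, but the reply is dropped unless the
# ephemeral port range is also explicitly allowed inbound.
print(network_acl_allows(set(), {443}, flow, True))      # False
```

This is why, with network ACLs, AWS recommends allowing the ephemeral port range inbound for return traffic, whereas security groups need no such rule.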
GCP VPC Firewall:
In GCP, the firewall is VPC-wide; however, it is also stateful and performs the filtering at the instance level, which means you can allow or deny traffic between VMs even within the same subnet.
One of the flexible aspects of the GCP firewall is the ability to perform filtering (the FW rule action) based on instance tags or on a service account in use by the source or target instance, which means you don't need to build rules based on IPs. This can be helpful when you have container engine nodes where each node cluster can use a different tag.
In GCP, the firewall by default has implied (hidden) rules: allow any outbound (with least priority) and deny any inbound (with least priority). This means an instance is allowed to establish outbound connections to the Internet, but any traffic destined to an instance will be denied unless an explicit allow rule is added.
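The rule evaluation just described can be sketched as a simplified Python model. Real GCP rules also match on source ranges, protocols, and service accounts; the rules below are illustrative only:

```python
# Simplified GCP-style firewall evaluation: rules match by direction and
# target tag, the lowest priority number wins, and the implied rules
# (allow egress / deny ingress, both at least priority 65535) apply last.
# All rule contents below are illustrative only.

IMPLIED = [
    {"priority": 65535, "direction": "egress",  "action": "allow", "tags": None},
    {"priority": 65535, "direction": "ingress", "action": "deny",  "tags": None},
]

def evaluate(rules, direction, instance_tags):
    applicable = [
        r for r in rules + IMPLIED
        if r["direction"] == direction
        and (r["tags"] is None or set(r["tags"]) & set(instance_tags))
    ]
    # Lowest numeric priority wins.
    return min(applicable, key=lambda r: r["priority"])["action"]

rules = [
    {"priority": 1000, "direction": "ingress", "action": "allow", "tags": ["web"]},
]

print(evaluate(rules, "ingress", ["web"]))  # allow: explicit rule matches tag
print(evaluate(rules, "ingress", ["db"]))   # deny: only the implied ingress rule
print(evaluate(rules, "egress",  ["db"]))   # allow: the implied egress rule
```

Note how an instance without a matching tag falls through to the implied deny-ingress rule, which is exactly the "denied unless an explicit allow rule is added" behavior described above.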
Like AWS, the default VPC network comes with some GCP firewall rules that allow inbound HTTP, HTTPS, SSH, and RDP.
GCP firewall and AWS security group rules are important from a networking point of view because, without proper configuration, communication to, from, and between instances and users will break. For example, traffic sourced from the load balancer must be explicitly allowed to reach the intended instances; the same applies to health check probes, etc.
Routes: in AWS you have the ability to create different routing tables. You can have a route table associated with publicly accessible instances, such as web-tier EC2 instances, and a route table with no Internet gateway attached, whose instances are reachable only via private IPs within the VPC network, such as the DB tier. If these EC2 instances need to reach the Internet for updates etc., you may need to use a NAT gateway, which allows only connections initiated from these instances toward the Internet.
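The difference between the two route tables can be shown with a small Python sketch of longest-prefix route lookup. The route entries and gateway IDs are illustrative only, not actual AWS resources:

```python
# Simplified AWS route-table lookup: longest-prefix match decides the target.
# The public route table points 0.0.0.0/0 at an Internet gateway, while the
# private route table points it at a NAT gateway, so private (DB-tier)
# instances can initiate outbound connections but are not reachable from
# outside. All route entries and IDs below are illustrative only.
import ipaddress

def lookup(route_table, dst_ip):
    """Return the target of the most specific route matching dst_ip."""
    dst = ipaddress.ip_address(dst_ip)
    matches = [
        (ipaddress.ip_network(cidr), target)
        for cidr, target in route_table.items()
        if dst in ipaddress.ip_network(cidr)
    ]
    # Longest prefix wins, as in a real route table.
    return max(matches, key=lambda m: m[0].prefixlen)[1]

public_rt  = {"10.0.0.0/16": "local", "0.0.0.0/0": "igw-1"}
private_rt = {"10.0.0.0/16": "local", "0.0.0.0/0": "nat-gw-1"}

print(lookup(public_rt,  "93.184.216.34"))  # igw-1: web tier goes straight out
print(lookup(private_rt, "93.184.216.34"))  # nat-gw-1: DB tier egresses via NAT
print(lookup(private_rt, "10.0.1.10"))      # local: intra-VPC stays private
```

The "local" route covering the VPC CIDR is always more specific than the default route, which is why intra-VPC traffic never hits the Internet or NAT gateway.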
In GCP, there is one routing table per VPC network; however, GCP gives you the ability to add routes and associate them with a tag, so that a route is visible only to the instances carrying that tag. You can also set a priority to make a route more or less preferred. This capability offers great control over routing design, where you can steer traffic to prefer local gateways per region. Any route without an assigned tag applies to all instances in the network. For example, in the figure below, GCE-2 has the tag “VPN” and the route injected by the VPN gateway is tagged “VPN” as well; therefore, only GCE-2 can see this route. The Internet route has no tag, so all instances can see and use it.
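The tag-and-priority behavior from the figure can be modeled in a few lines of Python. The route entries mirror the figure's example (a VPN-tagged route plus an untagged Internet route) and are otherwise illustrative:

```python
# Simplified GCP route selection: a route applies to an instance only if the
# route has no tag or the instance carries a matching tag; among applicable
# routes to the same destination, the lowest priority number wins.
# Route names and priorities below are illustrative only.

def routes_for(instance_tags, routes):
    """Routes visible to an instance with the given tags."""
    return [r for r in routes
            if r["tag"] is None or r["tag"] in instance_tags]

def next_hop(instance_tags, routes, dest):
    candidates = [r for r in routes_for(instance_tags, routes)
                  if r["dest"] == dest]
    return min(candidates, key=lambda r: r["priority"])["next_hop"]

routes = [
    {"dest": "0.0.0.0/0", "next_hop": "default-internet-gw", "priority": 1000, "tag": None},
    {"dest": "0.0.0.0/0", "next_hop": "vpn-gateway-1",       "priority": 500,  "tag": "VPN"},
]

# GCE-2 carries the "VPN" tag, so it sees and prefers the VPN route.
print(next_hop({"VPN"}, routes, "0.0.0.0/0"))  # vpn-gateway-1
# GCE-1 has no tag: it only sees the untagged Internet route.
print(next_hop(set(), routes, "0.0.0.0/0"))    # default-internet-gw
```

Setting a lower priority number on the tagged route is what lets tagged instances prefer the VPN gateway while everything else keeps using the default Internet route.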
Also, it is clear that with GCP, instances across different regions that belong to the same VPC can communicate using private IPs. In AWS, because a VPC cannot span regions, communication has to use either public IPs or one of the VPC interconnection options described below.
Inter-VPC routing
Both AWS and GCP provide the ability to interconnect different projects/accounts and their VPCs using VPC peering (internal communication, including AWS inter-region VPC peering) or over a VPN gateway (treated as external VPN communication). Alternatively, traffic may go through a direct connection to on-prem from one VPC and come back into another VPC (this has some limitations and caveats).
Also, AWS supports PrivateLink through the setup and use of VPC endpoints to facilitate access to services made available by others. An organization (e.g., a SaaS provider) can offer services for sale to other AWS customers, who can access them over a private connection. Technically, it is a service that accepts TCP traffic and is hosted behind a load balancer. The SaaS provider can publish it directly or through the AWS Marketplace.
The SaaS provider and the service consumer typically operate in different VPCs and AWS accounts and communicate only over the endpoint, with all traffic flowing across Amazon's private network. If the customer has a direct interconnect, or is running Cisco SD-WAN to on-prem DC/branches, they can extend this reachability to the DC/branches etc.
On the other hand, GCP offers the ability to build multi-tier applications managed by separate teams, where each resides in its own VPC and project (for better isolation, security, billing, etc.), while at the same time those teams/services can share the same virtual network using what is called a “Shared VPC”.
According to GCP, “Shared VPC allows creation of a VPC network of RFC1918 IP spaces that associated projects can then use. Admins in associated projects can create virtual machine (VM) instances in the shared VPC network spaces. Network and security admins can create VPNs and firewall rules usable by all the projects in the VPC network. Consistent policies can be applied and enforced easily across a Cloud Organization.” (A project is an organizational unit that can contain multiple VPCs.)
The scenario depicted in the figure below is for a Cloud Organization with a two-tier web service: an externally published Tier-1 load balancer VIP and a private internal load balancer for the Tier-2 app. In this sample scenario, each load balancer and its associated instances are managed by a different team.
The next part of this blog will focus on the cloud Direct Connect/Interconnect options, and how they may influence the enterprise SD-WAN design.