Disclaimer: Although the design approaches discussed in this blog come from real migration projects and are not theoretical, this is not an official Cisco validated design document. Therefore, all the designs and recommendations provided in this blog must be verified and tested before considering them for a production environment.
First of all, let’s agree that if any SDWAN vendor tells you “my SDWAN solution will make your life easy, it’s plug and play to migrate your existing WAN”, simply don’t do it : )
Suppose you have a large enterprise network, possibly a regional, national or global WAN with different connectivity models, like the ones illustrated below.
The goal is to move to an SDWAN-enabled WAN that is transport independent and application aware, as shown below.
The difference looks very simple: just add the SDWAN overlay. That is true, but in fact it is only this simple in diagrams and PowerPoint slides!
The reality is that, if we take a quick look at the end-to-end architecture in this example, we will find multiple routing protocols as well as multiple entry and exit points (potential routing information loops), as summarized below.
That’s why you must review the WAN routing architecture and protocols, as well as the routing design of the DC and the LAN side of the remote sites, and look at the end-to-end integration and traffic flow. Then work out how the SDWAN overlay will glue all of these PINs (places in the network) and routing protocols together without introducing routing loops or suboptimal routing.
Yes, SDWAN zero-touch provisioning, centralized control plane and management, etc. will make things easier. But without proper planning and deep analysis of the end-to-end routing architecture, the risk is going to be high.
The good news is that with Cisco SDWAN (Viptela) life is much easier and there is minimal complexity to deal with IF, and I am stressing the “IF” here, you follow a planned approach. “If You Fail to Plan, You Plan to Fail”.
What do I mean by a ‘planned approach’? It means fully understanding where to start, what the traffic flow will look like during and after the migration phase, and how to fall back in case of an issue at any of the sites.
In fact, these are generic migration planning and design considerations.
Let’s divide it into three categories or stages:
Note: this blog will refer to the SDWAN edge device as an SDWAN capable/enabled router/node, without specifying the type, because it could be a Cisco ISR4K/1K, Cisco ASR1K, vEdge or virtual appliance/VM. The decision about which of these to use is outside the scope of this blog and has nothing to do with the migration approach and design, which apply to any type of Cisco SDWAN capable node.
First of all, you need to identify the goal of the migration: whether it is a proof of concept, a partial migration (in case a customer would like to enable the solution first in a certain region or set of sites for some time), or an actual end-to-end rollout.
In general, as a rule of thumb, the starting point should always be the Data Center or the hub site, which is where most, if not all, of the traffic flows terminate. It could be a regional hub if you are dealing with a global network that has multiple hub sites, or if you are doing a partial migration.
The reason this should always be the starting point is simply that this site will almost always be connected to the remote sites, whatever the topology (hub and spoke, star, full mesh, etc.). In addition to providing access to DC resources, security standards/compliance in many organizations mandate that traffic terminate at the hub/DC site for security inspection. Also, from a design point of view, following a structured approach is key; therefore, in such a migration you must start with the hub/DC site and make sure it is ready to receive and forward traffic.
Typically, when it comes to introducing SDWAN (overlay) to the existing DC/hub site, you don’t want to introduce any interruption to the existing traffic flows and production environment. Therefore, you should add the SDWAN node (vEdge/cEdge) ‘off-path’ of the current data path, either by connecting it directly to the existing WAN edge node (assuming it is not SP managed) or by connecting it to the WAN aggregation layer in the DC/hub site, as illustrated below. This should cause zero interruption or downtime to the existing traffic.
The question here is: what if someone decides to use the same hub router at the WAN/Internet edge as the SDWAN hub node? Is it possible to reconfigure the same edge node to do so? The simple answer is yes. But keep in mind that reconfiguring the production edge router itself has some implications, such as:
If this is the only option, you can consider the following to mitigate the risk:
To cover the most common connectivity models, let’s consider the network/WAN architecture illustrated below.
In this design, there are three types of remote sites
We will start from the simplest (Type C) to the more complex one (Type B).
Remote Site Type C
There are two options here. The first option is as simple as replacing or reconfiguring the existing router with one that supports SDWAN (assuming the router is managed by the enterprise). Still, the considerations highlighted earlier about reconfiguring the existing production router apply here.
The second option is to connect the SDWAN enabled router to the existing remote site network as illustrated below.
Since this SDWAN router will be the active FHRP gateway, all LAN traffic will be sent to the SDWAN enabled router and then to the SDWAN overlay. This approach can be used in any of the scenarios below:
Remote Site Type A
Similar to the Type C site, here we can still replace/reconfigure the existing WAN router to be SDWAN enabled, and all the implications and considerations covered earlier in this blog apply.
The second option is the phased approach, as illustrated in the figure below.
Phase-1: integrate the SDWAN capable router and direct users’ traffic to it through FHRP/IGP metric tuning.
You could use two interfaces/sub-interfaces and tie each one to a separate transport/TLOC (you need to ensure each link uses one of the transports over the existing routers, using VRFs, static routes, etc.).
Phase-2: migrate one of the WAN transports to the new SDWAN capable router
Phase-3: move the second transport link to the SDWAN capable router.
Remote Site Type B
Here the approach is a bit different because there are two routers and two WAN transports. Even though you can still replace both routers with SDWAN capable routers, these types of sites are sometimes large or critical, and downtime is not acceptable. Therefore, the phased approach is the most common and recommended approach here.
As illustrated in the figure below, you will first need to integrate an SDWAN capable router and connect it to the service/LAN side as well as to the existing WAN routers using separate interfaces (these can be physical interfaces or sub-interfaces going through a common L2 switch, depending on the site LAN design, interface availability, etc.).
In this migration design approach, you need to consider:
In all of the above scenarios, failover should be automatic (testing is required).
The second phase is moving one of the WAN transport links to the SDWAN capable router, while the third phase is replacing the other existing WAN router with a second SDWAN capable router.
Target topology vs. traffic flow during the migration
Before starting the migration, you should understand the target traffic flow between the different sites, like the one illustrated below. This will give you a good understanding of how the migration phases may impact the desired traffic flow.
Using the migration approaches discussed earlier in this blog, traffic flows between the migrated and non-migrated sites will pass through the hub site.
For instance, traffic from a migrated Type-B site to a non-migrated Type-C site will traverse the hub/DC site first. For VoIP or video traffic flows, this is typically not the most efficient way to send traffic over the WAN.
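To make the interim behaviour concrete, here is a minimal Python sketch of the path a flow takes while only some sites are migrated. It is a simplified model and not anything taken from the Cisco SDWAN software: the function name `interim_path`, the site names and the `DC-hub` label are all hypothetical, and it assumes that two sites on the same side (both on the overlay, or both still on the legacy WAN) can reach each other directly.

```python
def interim_path(src, dst, migrated_sites, hub="DC-hub"):
    """Return the list of sites a flow crosses while the migration is in progress."""
    src_migrated = src in migrated_sites
    dst_migrated = dst in migrated_sites
    if src_migrated != dst_migrated:
        # Mixed case from the text: one site is on the overlay and the other is
        # not, so the hub/DC site stitches the overlay and legacy WAN together.
        return [src, hub, dst]
    # Assumption: sites on the same side (both overlay or both legacy) reach
    # each other directly; the real path depends on the topology in use.
    return [src, dst]


migrated = {"site-B1"}  # hypothetical: only one Type-B site migrated so far
print(interim_path("site-B1", "site-C1", migrated))  # crosses the hub
print(interim_path("site-A1", "site-C1", migrated))  # legacy path unchanged
```

Running a model like this against the list of sites planned for each migration wave is a quick way to spot which VoIP or video flows will be hairpinned through the hub at each stage.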
Even though this hairpinning through the hub might be acceptable to many companies as an interim solution, it may not be acceptable to some. In that case, you need to provide a migration approach that keeps the traffic between the sites direct, whether they are migrated or not.
In order to achieve this, you will need some additional interfaces and routing sessions.
As illustrated below, in site Type B, two additional interfaces (service/LAN side) are added to the SDWAN node, along with BGP peering with the existing edge routers.
These edge routers will advertise the remote sites’ prefixes as well as the local LAN network(s). Traffic from the LAN going to a non-migrated site will first hit the SDWAN router (based on FHRP or IGP tuning on the LAN side), and the SDWAN router will then use the existing WAN router, via the BGP session over the newly added interface, to reach that site directly (non-overlay). On the other hand, to prevent migrated sites from being reached over the existing legacy path, BGP filtering (prefix list) can be used: add each migrated site’s subnet to the list and filter these prefixes out so they are not installed in the SDWAN node’s BGP table. The only path to these sites will then be over the SDWAN overlay. You can try to be creative and use some BGP tricks, such as conditional route advertisement, but these tricks definitely need to be tested.
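The prefix-list logic described above can be sketched in Python using the standard `ipaddress` module. This is only an illustration of the filtering idea, not router configuration: the function name `filter_legacy_routes` and all the prefixes are hypothetical, and it assumes that any prefix equal to, or more specific than, a migrated site’s subnet should be dropped from what is learned over the legacy BGP session.

```python
import ipaddress


def filter_legacy_routes(learned_prefixes, migrated_subnets):
    """Drop legacy-learned prefixes that belong to already-migrated sites."""
    blocked = [ipaddress.ip_network(s) for s in migrated_subnets]
    kept = []
    for prefix in learned_prefixes:
        net = ipaddress.ip_network(prefix)
        # subnet_of() is True for an exact match or a more specific prefix;
        # the version check avoids comparing IPv4 with IPv6 networks.
        if any(net.version == b.version and net.subnet_of(b) for b in blocked):
            continue  # migrated site: reachable over the SDWAN overlay only
        kept.append(prefix)
    return kept


# Hypothetical prefixes advertised by the existing WAN router over BGP.
learned = ["10.1.0.0/24", "10.1.0.128/25", "10.2.0.0/24"]
migrated = ["10.1.0.0/24"]  # subnet of a site that has already been migrated
print(filter_legacy_routes(learned, migrated))
```

The list of migrated subnets has to be updated every time a site is cut over, which is one reason to keep the migration plan and the filter in the same change record.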
Still, you need to consider the impact of bi-directional redistribution between OMP and OSPF/BGP in this type of site. With Cisco SDWAN, BGP AS propagation, BGP SoO (Site of Origin) and the OSPF DN-bit help to detect and prevent such loops.
Last but not least, to make things a bit more complex, consider the scenario below, where a local site route is injected into OMP, carried all the way to the DC, and redistributed into the DC BGP domain. At the existing DC edge, the existing (non-SDWAN) WAN routers will then advertise these routes over BGP back to the WAN. Here we have a potential routing information loop!
To avoid this issue, you can either filter remote site prefixes from being re-advertised back to the WAN using a prefix list, BGP communities, etc., or you can assign an AS to OMP, enable AS propagation, and simply filter any route carrying the AS number assigned to OMP from being sent back to the WAN over BGP. Moreover, with Cisco SDWAN, OSPF external routes are not injected into OMP by default, unless this is manually enabled, to avoid loops. This scenario and more were covered in greater detail by Khalid Raza in the Cisco Live session: SD-WAN Routing Migrations – BRKRST-2095.
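The AS-based option can be pictured with a short Python sketch. It models only the decision the DC edge router makes, namely that any route whose AS path contains the AS assigned to OMP is not sent back to the WAN. The function name, the route dictionaries and the AS numbers (65010 for OMP, 65020 for a remote site) are hypothetical and chosen purely for illustration.

```python
def routes_to_advertise_to_wan(routes, omp_as):
    """Return the routes a DC edge router may send back to the WAN."""
    allowed = []
    for route in routes:
        if omp_as in route["as_path"]:
            # The route came through the overlay (OMP AS present), so sending
            # it back to the WAN could create a routing information loop.
            continue
        allowed.append(route)
    return allowed


OMP_AS = 65010  # hypothetical private AS number assigned to OMP

candidate_routes = [
    {"prefix": "10.20.0.0/24", "as_path": [65010]},         # remote site via OMP
    {"prefix": "10.0.0.0/16", "as_path": []},               # local DC prefix
    {"prefix": "10.30.0.0/24", "as_path": [65010, 65020]},  # via OMP, site AS 65020
]
for route in routes_to_advertise_to_wan(candidate_routes, OMP_AS):
    print(route["prefix"])
```

Compared with a prefix list, matching on the OMP AS does not need to be updated as more sites are migrated, which is the main appeal of this option.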
Plan, plan, plan, test, then enjoy Cisco SDWAN advanced routing features and analytics!
As a side note, if you have considered the new CCDE publication CCDE Practice Design Scenarios, expect some migration to SDWAN design questions to be added soon 🙂
Further learning about Cisco SDWAN
Network Evolution for the Cloud and Digital Era – SD-WAN Training Videos at the Cisco Learning Network