Do Azure VPN Gateways that leverage BGP support BFD?

Introduction

Anyone doing redundant, high-available VPN gateways leveraging BGP (Border Gateway Protocol) has encountered BFD (Bi-Directional Forwarding Detection). That said, BFD is not limited to BGP but also works with OSPF and OSPF6. But before we answer whether Azure VPN Gateways that leverage BGP support BFD, l briefly discuss what BFD does.

Bi-Directional Forwarding Detection

The BFD (Bi-Directional Forwarding Detection) protocol provides high-speed and efficient detection for link failures. It works even when the physical link has no failure detection support itself. As such, it helps routing protocols, such as BGP, failover much quicker than they could achieve if left to their own devices.

BFD control packets are transmitted via UDP from source ports between 49152-65535 to destination port 3784 (single-hop, RFC 5880, RFC 5881, and RFC 5882) or 4784 (multi-hop, RFC 5883). It can be IPv4 as well as IPv6. See Bidirectional Forwarding Detection on Wikipedia for more information. Note that this works between directly connected routers (single-hop) or (multi-hop).

Azure VPN Gateways that leverage BGP support BFD

Currently, OPNsense and pfSense, with the FRR (Flexible Rigid Routing) plugin, support BFD integration with BGP, Open Shortest Path First (OSPF), and Open Shortest Path First version 6 (OSPF6). Naturally, most vendors support this, but I mention OPNsense and pfSense because they offer free, fully functional products that are very handy for demos and lab testing.

Do Azure VPN Gateways that leverage BGP support BFD?

TL;DR: No.

You do not find much information when you search for BFD information about Microsoft Azure networking. Only for Azure ExpressRoute does Microsoft clearly state that it is supported and provides information.

But what about Azure VPN gateways with BFD? Well, no, that is not supported at all. You can try to set it up, but your VPN Gateways on-premises is shouting into a void. The session status with the peers will always be “down.” It just won’t work.

Essential to my Modern Datacenter Lab: Azure Site 2 Site VPN with a DELL SonicWALL Firewall

If you’re a serious operator in the part of IT that is considered the tip of the spear, i.e. you’re the one getting things done, you need a lab. I have had one (well I upgraded it a couple of times) for a long time. When you’re dealing with cloud as an IT Pro, mostly Microsoft Azure in my case, that need has not changed. It enables you gain the knowledge and insights that you can only acquire by experimenting and hands on work, there is no substitute. Sometimes people ask me how I learn. A lab and lots of hands on experimenting is a major component of my self education and training. I put in a lot of time and some money, yes.

Perhaps you have a lab at work, perhaps not, but you do need one. A lab is a highly valuable investment in education for both your employers and yourself. It takes a lot of time, effort and it cost a bit of money. The benefits however are huge and I encourage any employer who has IT staff to sponsor this at the ROI is huge for a relative small TCO.

I love the fact that in a lab you have (and want) complete control over the entire stack so you experiment at will and learn about the solutions you build end to end. You do need to deal with it all but that’s all good, you learn even more, even when at times it’s tough going. Note that a home lab, even with the associated costs, has the added benefit of still being available to you even if you move between employers or between clients.

You can set up a site to site VPN using Windows Server 2012 R2 RRAS (see Site-to-Site VPN in Azure Virtual Network using Windows Server 2012 Routing and Remote Access Service (RRAS) that works. But for for long term lab work and real life implementations you’ll be using other devices. In the SOHO lab I run everything virtualized & I need internet access for other uses cases than the on premises lab. I also like to minimize the  hosts/VMs/appliances I need to have running to save on electricity costs. For enterprise grade solutions you leverage solutions form CISCO, JUNIPER, CheckPoint etc. There is no need for “enterprise grade” solutions in a SOHO or small branch office environment.Those are out of budget & overkill, so I needed something else. There are some options out there but I’m using a DELL SonicWALL NSA 220. This is a quality product for one and I could get my hands on one in a very budget friendly manner. UTMs & the like are not exactly cheap, even without all the subscription, but they don’t exactly cater to the home user normally. You can go higher or lower but I would not go below a TZ-205 (Wireless) which is great value for money and more than up for the task of providing you with the capabilities you need in a home lab.

SonicWALL NSA 220 Wireless-N Appliance

I consider this minimum level as I want 1Gbps (no I do not buy 100Mbps equipment in 2015) and I want wireless to make sure I don’t need to have too many hardware devices in the lab. As said, the benefit over the RRAS solution are that it serves other purposes (UTM) and it can remain running cheaply so you can connect to the lab remotely to fire up your hosts and VMs which you normally power down to safe power.

Microsoft only dynamic routing with a limited number of vendors/devices but that doesn’t mean all others are off limits. You can use them but you’ll have to research the configurations that work instead of downloading the configuration manual or templates from the Microsoft web site, which is still very useful to look at an example configuration, even if it’s another product than you use.

Getting it to work is a multistep proccess:

    1. Set up your Azure virtual network.
    2. Configure your S2S VPN on the SonicWall
    3. Test connectivity between a on premises VM and one in the cloud
    4. Build out your hybrid or public cloud

Here’s a reference to get you started Tutorial: Create a Cross-Premises Virtual Network for Site-to-Site Connectivity I will be sharing my setup for the SonicWall in a later blog post so you can use it as a reference. For now, here’s a schematic overview of my home lab setup to Azure (the IP addresses are fakes). At home I use VDSL and it’s a dynamic IP address so every now and then I need to deal with it changing. I’d love to have a couple of static IP address to play with but that’s not within my budget. I wrote a little Azure scheduled run book that takes care of updating the dynamic IP address in my Azure site-to-site VPN setup. It’s also published on the TechNet Gallery

image

You can build this with WIndows RRAS, any UTM, Firewall etc … device that is a bit more capable than a consumer grade (wireless) router. The nice things is that I’ve had multiple subnets on premises and the 10 tunnels in a standard Azure  site-to-site VPN accommodate that nicely. The subnets I don’t want to see in a tunnel to azure I just leave out of the configuration.

Tip to save money in your Azure lab for newbies, shut down everything you can when your done. Automate it with PowerShell. I just make sure my hybrid infra is online & the VPN active enough to make sure we don’t run into out of sync issues with AD etc.

DELL SonicWALL Site-to-Site VPN Options With Azure Networking

The DELL SonicWALL product range supports both policy based and route based VPN configurations. Specifically for Azure they have a configuration guide out there that will help you configure either.

Technically, networking people prefer to use route based configuration. It’s more flexible to maintain in the long run. As life is not perfect and we do not control the universe, policy based is also used a lot. SonicWALL used to be on the supported list for both a Static and Dynamically route Azure VPN connections. According to this thread it was taken off because some people had reliability issues with performance. I hope this gets fixed soon in a firmware release. Having that support is good for DELL as a lot of people watch that list to consider what they buy and there are not to many vendors on it in the more budget friendly range as it is. The reference in that thread to DELL stating that Route-Based VPN using Tunnel Interface is not supported for third party devices, is true but a bit silly as that’s a blanket statement in the VPN industry where there is a non written rule that you use route based when the devices are of the same brand and you control both points. But when that isn’t the case, you go a policy based VPN, even if that’s less flexible.

My advise is that you should test what works for you, make your choice and accept the consequences. In the end it determines only who’s going to have to fix the problem when it goes wrong. I’m also calling on DELL to sort this out fast & good.

A lot of people get confused when starting out with VPNs. Add Azure into the equation, where we also get confused whilst climbing the learning curve, and things get mixed up. So here a small recap of the state of Azure VPN options:

  • There are two to create a Site-to-Site VPN VPN between an Azure virtual network (and all the subnets it contains) and your on premises network (and the subnets it contains).
    1. Static Routing: this is the one that will work with just about any device that supports policy based VPNs in any reasonable way, which includes a VPN with Windows RRAS.
    2. Dynamic Routing: This one is supported with a lot less vendors, but that doesn’t mean it won’t work. Do your due diligence. This also works with Windows RRAS

Note: Microsoft now has added a a 3rd option to it’s Azure VPN Gateway offerings, the High Performance VPN gateway, for all practical purposes it’s dynamic routing, but a more scalable version. Note that this does NOT support static routing.

The confusion is partially due to Microsoft Azure, network industry and vendor terminology differing from each other. So here’s the translation table for DELL SonicWALL & Azure

Dynamic Routing in Azure Speak is a Route-Based VPN in SonicWALL terminology and is called and is called Tunnel Interface in the policy type settings for a VPN.

image

Static Routing in Azure Speak is a Policy-Based VPN in SonicWALL terminology and is called Site-To-Site in the “Policy Type” settings for a VPN.

image

  • You can only use one. So you need to make sure you won’t mix the two on both sites as that won’t work for sure.
  • Only a Pre-Shared Key (PSK) is currently supported for authentication. There is no support yet for certificate based authentication at the time of writing).

Also note that you can have 10 tunnels in a standard Azure site-to-site VPN which should give you enough wiggling room for some interesting scenarios. If not scale up to the high performance Azure site-to site VPN or move to Express Route. In the screenshot below you can see I have 3 tunnels to Azure from my home lab.

image
I hope this clears out any confusion around that subject!