I made an Azure Virtual WAN custom route tables video to demonstrate and explain their principles. At the moment I am investigating everything that is going on around Azure Virtual WAN. One of the enhanced capabilities being rolled out right now, at the time of writing, is custom route tables. This was supposed to be done by the week of August the 3rd but apparently there are some delays.
That does not diminish the fact that this is a very useful and powerful addition to the capabilities of Azure Virtual WAN hub routing. I advise you to look into this. Stay in the know as Azure Virtual WAN is taking off. In my opinion it is the future of Azure networking and it has me really excited about the capabilities.
To help with my own understanding of Azure Virtual WAN and custom route tables I created some slide decks to present about it and learn about it whilst doing so. I also created a video.
Azure Virtual WAN hub routing
The official documentation explains the concept and the principles by which custom route tables work. It also provides some scenarios to help you understand it all. I have taken some of these scenarios and used them to create step-by-step walkthroughs to help explain how it works. It also helped me wrap my head around it whilst learning about it.
I hope you enjoy the video and will find it useful. As said, interesting times ahead for Azure networking with Azure Virtual WAN moving ahead and becoming a richer and better tool. Take a look and see how you like it.
Until next time, thank you for reading. reach out in the comments or s
When it comes to Azure Virtual WAN, you might have the impression it is only useful for huge, international entities. Entities like the big Fortune 500 companies, with a significant, distributed global presence.
I can understand why. That is where the attention is going, and it makes for excellent examples to showcase. Also, the emphasis with SD-WAN has too often been about such cases. SD-WAN also enables economically feasible, reliable, and redundant connectivity for smaller locations and companies than ever before. My take is that Azure Virtual WAN is for everyone!
Azure Virtual WAN is for everyone
I would also like to emphasize that Azure Virtual WAN is so much more than just SD-WAN. That does not detract from SD-WAN’s value. SD-WAN is a crucial aspect of it in terms of connectivity to and from your Azure environment. I would even say that the ability to leverage Microsoft’s global network via Azure Virtual WAN is the most significant force multiplier that SD-WAN has gotten in the past year.
Network appliance vendors are signing on to integrate with Azure Virtual WAN for a good reason. It makes sense to leverage one of the biggest, best, and fastest global networks in the world to provide connectivity for your customers.
One extreme use case would be to use Azure Virtual WAN only as an SD-WAN carrier just to connect your sites without using anything in Azure. An example of this would be a business that is still on-prem but wants to move to Azure. That is a good start. It modernizes connectivity between the locations while becoming ready to move workloads to Azure, where the landing zone is integrated into Azure Virtual WAN when it is time to do so.
A Medium Enterprise example
But let’s step back a minute. The benefits of Azure Virtual WAN go beyond SD-WAN deployments for multinational companies spanning the globe. Make no mistake about this. SD-WAN is also very interesting for Small and Medium Enterprises (SME), and the benefits of Azure Virtual WAN go beyond on-premises to Azure connectivity. It extends to connecting any location to any location.
On-premises connectivity is more than a data center, a corporate HQ, and branch offices with ExpressRoute and/or Site-to-Site VPN (S2S). It is also a user via a Point-to-Site VPN (P2S). All of these can be anywhere in the world but also distributed across your city, country, or continent. Think about what that means for “remote work by default” shops. Every individual, whether working with you as an employee, partner, customer, consultant, or contractor, can be connected to your Azure Virtual WAN and your on-premises locations thanks to the any-to-any connectivity.
Some people might have an NGFW at home, depending on their role and needs. Many others will be fine with a point-to-site VPN, which serves both work-from-home profiles as well as road warriors.
People, if this Coronavirus global pandemic has not awakened you to the importance and possibility of remote work, I do not know what to tell you. Drink a lot more coffee?
For example, a national retailer, a school, or a medical provider with lots of small local presences can all benefit from Azure Virtual WAN. When they merge with others, within or across borders, Azure Virtual WAN with SD-WAN puts them in a great position to extend and integrate their network.
There is more to Azure Virtual WAN than SD-WAN
We have not touched on the other benefits Azure Virtual WAN brings. These benefits are there, even if you have no on-premises locations to connect. That would be another extreme: Azure Virtual WAN without any SD-WAN deployment. While the on-premises deployment of apps goes down over time, it will not go away 100% for everyone. Also, even in a 100% cloud-native environment, having other connectivity options than over the internet and public services can help with security, speed, and cost reduction.
The any-to-any capabilities and the ease of use, leading to operational cost savings, are game-changing. Combined with the integration with Azure Firewall Manager to create a secure virtual hub and custom routing, it makes for a very flexible way of securing and managing network access.
Don’t think that SMEs will only have 2 to 5 subscriptions, or even less if they are just consumers of cloud services outsourced to a service provider, with one or a couple of vNETs.
If you do not have many subscriptions, you can still have a lot of vNETs. You create vNETs per application, business unit, etc. On top of that, in many cases, you will have development, testing, acceptance, and production environments for these applications.
You might very well do what we do, and what we see more of again: lots of subscriptions. You can create subscriptions for every application environment, business unit, etc. The benefits are a clear and easy-to-measure distinction in ownership, responsibilities, costs, and security. That means a company can have dozens to hundreds of subscriptions that way. These can all have multiple vNETs. When an SME wants to protect itself against downtime, two regions come into play. That is where the hub-to-hub transitive nature excels.
Now, managing VNET peering, transit vNETs, network gateways, firewalls, and route tables becomes a chore fast when the environment grows. Rolling all that work into a convenient, centralized virtual global service makes sense to reduce complexity, reduce operational costs, and simplify your network architecture and design.
Going cloud first and cloud native
In a later stage, your organization can reduce its on-premises footprint and go for an all cloud-based approach. Be realistic, there might very well be needs for some on-premises solutions, but Azure Stack has you covered there. You can leverage Azure Stack HCI, Edge, or even Hub for those needs but still integrate deployment, management, operations, and monitoring into Azure.
Global Transit Architecture with Azure Virtual WAN
I still need to drive the capabilities and benefits of the Global Transit Architecture with Azure Virtual WAN home for you. For one, it is any-to-any by default. You can control and limit this where needed, but it works automagically for you out of the box. Second, this is true for ExpressRoute, S2S VPN, P2S VPN, VNET peers, and virtual hubs in all directions.
ExpressRoute Global Reach and Virtual WAN
This means that a user with a P2S VPN connected to a virtual hub has access to a datacenter that connects to that same hub or another one within the same Virtual WAN. You can go crisscross all over the place. I love it. Remember that we can secure this, control this.
Think about that for a moment. When I am on the road connected via a P2S VPN to an Azure virtual hub, I can reach my datacenter (ExpressRoute), my office, store, factory, and potentially even my home office (S2S VPN). Next to that, I can reach all my vNETs. It is the same deal when I am working from home or in the office, store, or factory. That is impressive. The default is any-to-any, automagically done for you. But you can restrict and secure this to your needs with custom routing and a secure virtual hub (Azure Firewall Manager).
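To make the difference between the any-to-any default and custom routing a bit more tangible, here is a minimal Python sketch of how I think about it. It is a toy model, not the Azure API. It assumes the hub routing principle that every connection is associated with one route table and propagates its routes to one or more route tables. The connection names, the address prefixes, and the custom route table name RT_VNET are all made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Connection:
    """A connection to a virtual hub (vNET, S2S/P2S VPN, or ExpressRoute)."""
    name: str
    prefix: str                          # address space it advertises (hypothetical)
    associated: str = "Default"          # route table used for traffic it sends
    propagates_to: tuple = ("Default",)  # route tables that learn its prefix


def build_route_tables(connections):
    """Return {route table: {prefix: connection name}} from the propagation settings."""
    tables = {}
    for conn in connections:
        tables.setdefault(conn.associated, {})
        for table in conn.propagates_to:
            tables.setdefault(table, {})[conn.prefix] = conn.name
    return tables


def can_reach(src, dst, tables):
    """src has a route to dst when dst's prefix is in src's associated table."""
    return dst.prefix in tables[src.associated]


def can_talk(a, b, tables):
    """Two-way traffic needs a route in both directions."""
    return can_reach(a, b, tables) and can_reach(b, a, tables)


def any_to_any():
    """Default behaviour: everything uses and feeds the Default route table."""
    return [
        Connection("p2s-users", "172.16.0.0/24"),
        Connection("datacenter-er", "10.100.0.0/16"),
        Connection("vnet-app1", "10.1.0.0/16"),
        Connection("vnet-app2", "10.2.0.0/16"),
    ]


def isolated_vnets():
    """Custom routing: vNETs reach the branches but no longer reach each other."""
    branch_targets = ("Default", "RT_VNET")  # RT_VNET is a hypothetical custom table
    return [
        Connection("p2s-users", "172.16.0.0/24", "Default", branch_targets),
        Connection("datacenter-er", "10.100.0.0/16", "Default", branch_targets),
        Connection("vnet-app1", "10.1.0.0/16", "RT_VNET", ("Default",)),
        Connection("vnet-app2", "10.2.0.0/16", "RT_VNET", ("Default",)),
    ]
```

With any_to_any() every pair can talk. With isolated_vnets() the P2S user still reaches the datacenter and both vNETs, but vnet-app1 and vnet-app2 have no route to each other because neither propagates into the table the other one uses. A secure virtual hub adds firewall policy on top of this, which the sketch does not model.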
The benefits of Azure Virtual WAN are plenty, for many scenarios in large, medium and small enterprises. So, I invite you to take a better look. I did. As a result, I have been investing time in diving into its possibilities and potential. I will be presenting on this topic to share my insights into what, to me, is the future of Azure networking. Do not think this is only for the biggest corporations or organizations.
Yesterday I received the email informing me that I received the Microsoft MVP Award 2020. July 1st is that time of the year that as a Microsoft MVP you find out if you are renewed for the new fiscal year. I was. That was cause for a celebration. But I had to wait a bit to shout out my happiness as I was in a Teams meeting on Veeam Backup for Azure.
I enjoy being a Microsoft MVP
I am thrilled to be awarded as an MVP again. To this day I remain in the Cloud & Datacenter category, which is a very good fit for me, as I do indeed work across both worlds. That’s where I help fill the gap to ensure digital transformations go smoothly and you don’t lose out wherever you run your solutions. There are many creative solutions to be designed in hybrid scenarios and at the edge. Places where you can investigate, research, and find opportunities to build those creative solutions.
People who follow me know I don’t just copy/paste “best practices.” I research what works best and come up with ways to leverage technologies. I apply out of the box thinking to deliver excellent value for money efficiently and effectively. Then, I share my experiences and what I learn by writing, blogging, and speaking. That includes my successes and failures, as we learn from both.
What do I do?
I like to work end to end. The full stack. No silos. There is no hiding behind another team or blocking another team. You could describe me as a multi-pronged T. Various prongs go deeper based on need or interest. But there are many, and the T is wide, so we can act and work without needing too much help to get something going. For one, this also enables me to give feedback with enough real-world knowledge to be valuable. Secondly, it keeps me honest. I do not just do design, I deploy it and support it. It has to work. I dislike support or consulting with tunnel vision, or consulting that designs only for maximum profit instead of for the need at hand. My approach leaves money for better solutions and saves money in the long run. What I learn and see I take back to Microsoft in feedback, in discussions and interactions with the program managers. That is valuable for me too, as I learn a lot from them. In the end, it leads to better products and experiences for all of the community and customers.
I enjoy being a Microsoft MVP for the opportunities it gives me to learn and share with like-minded people from all over the globe. While it takes a village to raise a child, the child needs to get out of the village into the world to evolve and keep learning. Today that is easier than ever before thanks to technology, which eliminates many boundaries.
2020 is a bit of a special year
Talking about the globe at the time of writing: in a time of Corona and COVID-19 running amok in the world, it is our technology that still makes this possible while we do not travel and limit ourselves for the better good of all. I am proud to say that our technologies were in place to go into lockdown immediately without having to scramble for solutions. Telecommuting is something we already did routinely, and technologies could scale up and out, both on-premises and in the cloud, each in the areas where they excel.
That, combined with living in a country where we have universal healthcare and social benefits (taxes for the better good of all) helped ease the blow we all received. We all have shortcomings. But as a nation, businesses, and people we were ready, willing, and able to do what needed to be done.
All this means that this year and next year we do not have an in-person MVP Summit.
That saddens me. The face-to-face discussions from breakfast till literally in the hotel hallways before we go to sleep are priceless. Those chats with our peers and Microsoft employees are very insightful, rewarding, and educational. That experience and intensity are hard to recreate in a virtual event. We are all eager to get past lockdown, social distancing, and travel restrictions. We can only achieve this by self-discipline and acting responsibly at our own personal and local level. That and relentless efforts to find a vaccine, which, hopefully, will grant us back some of the privileges we enjoy.
Good intentions for 2020-2021
While I am very happy to receive the Microsoft MVP Award 2020, I want to make sure all recipients feel appreciated and are able to be themselves in our community. I plan to take extra care to make sure that diversity, inclusiveness, and equity are always on the radar. That is my extra effort to keep the community a welcoming and safe place for all.
A small gift
As a special gift on the 1st of July, Microsoft made Azure Firewall Manager generally available. I have been working with this in preview the last couple of months. Today I am very pleased I can start using it in production!
Recently I got to diagnose a really interesting Veeam Backup & Replication symptom. Imagine you have a backup environment that runs smoothly all week long, but then, suddenly, running backup jobs stall. New jobs that start do not make an ounce of progress. It is as if the state of every job is frozen in time. Let’s investigate and dive into troubleshooting 100% stalled Veeam backup jobs.
Troubleshooting 100% stalled Veeam backup jobs
When looking at the stalled jobs, nothing in the Veeam GUI indicates an error. Looking at the Windows event logs we see no warning, error, or critical messages. All seems fine. As this Veeam environment uses ReFS on Storage Spaces, we are a bit wary. While the bugs that caused slowdowns have been fixed, we are still alert to potential issues. The difference with the known (fixed) ReFS issues is that this is no slowdown. No sir, the Veeam backup jobs have literally frozen in time, but everything seems to be functional otherwise.
Another symptom of this issue is that the synthetic full backups complete perfectly well, but they finish with an error message nonetheless due to a timeout. This has no effect on the synthetic backup result (they are usable) but it is disconcerting to see an issue with this.
On top of that, data copies into the ReFS volumes work just fine and at an excellent speed. Via performance monitor, we can see that the rotation of full regions from mirror to parity is also working well once the mirror tier has reached a specified capacity level.
Time to dive into the Veeam logs I would say.
Veeam backup job log
So the next stop is the Veeam logs themselves. While those can seem a little intimidating, they are very useful to scroll through. And sure enough, we find the following in the backup log of one of the stalled jobs.
For hours on end … it goes on that way.
Virtual Machine task logs 1
When we look at the task log of a virtual machine that is still at 0% we see the same reflected there. Note that nothing happens between 22:465 and 05:30. That’s when I disabled and enabled the vNIC of the preferred networks in the VBR virtual machine and it all sprang back to life.
So it is clear we have a network issue of some sort. We checked the repository servers and the Hyper-V cluster but there everything is just fine. So where is it?
Virtual Machine task logs 2
We dive into the task log of one of the virtual machines that is being backed up and is hanging at 88%. There we see one reconnect after another to the repository IP (over the preferred network as defined in VBR). That also happens all night long until we reset the VBR virtual machine’s preferred network vNICs. In the log snippet below notice the following:
Error A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond (System.Net.Sockets.SocketException)
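If you want to know how often this error shows up in a task log without scrolling through it all, a few lines of Python will do. This is only a sketch. It searches for the two pieces of text from the error quoted above and assumes nothing about how the rest of a Veeam log line is laid out. The function name is my own.

```python
# Substrings taken from the error message quoted above. The layout of the
# rest of each log line (timestamps, thread IDs, ...) is not assumed here.
MARKERS = (
    "A connection attempt failed because the connected party did not properly respond",
    "System.Net.Sockets.SocketException",
)


def count_connection_failures(lines):
    """Count the log lines that contain at least one of the failure markers."""
    return sum(1 for line in lines if any(marker in line for marker in MARKERS))


# Usage, with a path you supply yourself:
#   with open(path_to_task_log, encoding="utf-8", errors="replace") as log:
#       print(count_connection_failures(log))
```

A count that keeps climbing while the job shows no progress fits what we saw here: the job is not dead, it just keeps retrying a connection that never gets an answer.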
From the logs we deduce that the network error appears to be on the VBR virtual machine itself. This is confirmed by the fact that bouncing the vNICs of the preferred network (10.10.110.x is the preferred network subnet) on the VBR virtual machine kicks the jobs back into action. So what is the issue? We start checking the network configurations and settings: the switch ports, pNICs, vNICs, vSwitches, etc. to find out what’s going on. As it seems to work for days or a week before the issue shows up, we suspect a jumbo frame issue, so we start there.
While checking the configuration we make sure jumbo frames are enabled on the vNIC and the pNICs of the vSwitch’s NIC team. That’s when we notice the jumbo frame settings are missing from those pNICs. So we set those again.
From the VBR virtual machine we run some ping tests. The default works fine.
When we test with jumbo frames, however, we notice something. With the “do not fragment” option set, the ping tests do not complain that the jumbo frames are too large: there is no “Packet needs to be fragmented but DF set.” message. It just says “Request timed out.” This indicates an issue right here: jumbo frames are set but they do not work.
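For reference, this is the arithmetic and the reasoning behind such a jumbo frame ping test, as a small Python sketch. It assumes an IPv4 MTU of 9000 bytes, the Windows ping options -f (do not fragment), -l (payload size), and -n (count), and English ping output. The helper names and the example address in the 10.10.110.x subnet are mine, purely for illustration.

```python
IP_HEADER = 20    # bytes, IPv4 header without options
ICMP_HEADER = 8   # bytes, ICMP echo header


def icmp_payload_for_mtu(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IP_HEADER - ICMP_HEADER


def windows_ping_command(host, mtu, count=4):
    """Build a Windows ping command with 'do not fragment' and a full-size payload."""
    return ["ping", "-f", "-l", str(icmp_payload_for_mtu(mtu)), "-n", str(count), host]


def interpret(ping_output):
    """Classify (English) Windows ping output for a do-not-fragment test."""
    text = ping_output.lower()
    if "needs to be fragmented but df set" in text:
        # The packet is reported as too big for the MTU: a settings issue.
        return "mtu-too-small"
    if "request timed out" in text:
        # Nothing complains about the size, the packets just vanish.
        return "silent-drop"
    if "reply from" in text and "bytes=" in text:
        return "ok"
    return "unknown"


# Example (hypothetical address):
#   windows_ping_command("10.10.110.10", 9000)
#   gives ['ping', '-f', '-l', '8972', '-n', '4', '10.10.110.10']
```

The "silent-drop" case is the one we hit. A plain MTU mismatch would normally give you the fragmentation message, so a timeout with large packets while the default ping works points to something below the settings, such as drivers or firmware.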
As the requests time out and the ping test does not complain about the jumbo frames, we have another issue here than just the jumbo frame settings. It smells of a firmware and/or driver issue. So we dive a bit further. That’s when I notice the driver for the relevant pNICs (Broadcom) is the inbox Windows driver. That’s no good. The inbox drivers only exist to be able to go out and fetch the vendor’s driver and firmware when needed, as a courtesy so to speak. We copy the vendor’s driver and firmware to the hosts that require an update, in this case the nodes where the VBR virtual machine can run. The firmware update requires a reboot. When the host is up and the VBR virtual machine is running, I test again.
Bingo, now a ping test succeeds.
So did we really forget to update the drivers? Did we walk out of the offices to go in lockdown for the Corona crisis and forget about it? In the end, it turned out they did run the updates for the physical hosts. But for some reason, the Broadcom firmware and drivers did not get updated properly. That failed update also seems to have removed the jumbo frame settings from the pNICs that are used for the virtual switch. After fixing both of these we have not seen the issue return.
The preferred networks do not absolutely have to be present on the VBR server itself. Defined, yes. Present, no. But it speeds up backup job initialization a lot when they are present on the VBR server, and Veeam also indicates to do so in their documentation.
Why jumbo frames? Ah well, the networks we use for the preferred networks are end-to-end jumbo frame enabled. So we maintain this all the way to the VBR server. We might get away with not setting jumbo frames on the VBR server but we want to be consistent.
It pays to make sure you have all settings correctly configured and are running the latest and greatest known good firmware. But that should have been the case here. And it all worked so well for quite a while before the backup jobs stalled. The issue can lie in the details and sometimes things are not what you assume they are. Always verify and verify it again.
I hope this helps someone out there if they are ever troubleshooting 100% stalled Veeam backup jobs. If you need help, reach out in the comments. There are a lot of very experienced and respected people around in my network that can help. Maybe even I can lend a hand and learn something along the way.