Make Veeam Instant Recovery use a preferred network

Introduction

In this post, I discuss an issue we ran into when leveraging the instant recovery capability of Veeam Backup & Replication (VBR). The issue became apparent when we set up the preferred networks in VBR. The backup jobs and the standard restores both leveraged the preferred network as expected, but instant recovery did not. While the mount phase leverages the preferred network, the restore phase uses the default host management network instead. To make Veeam instant recovery use a preferred network we had to do some investigation and tweaking. That is what this blog post is all about.

Overview

We have a Hyper-V cluster with shared storage (FC) that acts as our source. We back up to a Scale-Out Backup Repository that consists of several extents (standard repositories). Next to the management network, all of the target and source nodes have connectivity to one or more 10/25Gbps networks. These are leveraged for CSV, live migrations, storage replication, etc., but also for the backup traffic via the Veeam Backup & Replication preferred network settings.

We have 2 preferred networks. This is for redundancy but also because there are different networks in use in the environment.

The IPs for the preferred networks are NOT registered in DNS. Note that the Veeam Backup & Replication server also has connectivity to the preferred networks. The reason for this is described in Optimize the Veeam preferred networks backup initialization speed.

As you might have guessed from the blog post title “Make Veeam Instant Recovery use a preferred network”, this all worked pretty much as expected for the backups themselves and for standard restores. But when it came to Instant Recovery we noticed that while the mount phase leveraged the preferred network, the actual restore phase did not.

Instant VM recovery overview.

To read up on instant recovery go to Instant VM Recovery. But in this blog post, it is time to dive into the log files to figure out what is going on.

To solve the issue we dive into the VBR logs, but also into the logs on both the repository/extent and the Hyper-V server where we restore the VM to. The logs confirmed what we already noticed. For backups and normal restores, it correctly decides to use the preferred network. With instant recovery for some reason, in the restore phase, it selects the default host network which is 1Gbps.

Investigating the logs

Reading logs can seem an intimidating, tedious task. The trick is to search for relevant entries and that is something you learn by doing. Combine that with an understanding of the problem and some common sense and you can quickly find what you need to look for. Then it is key to figure out why this could be happening. Sometimes that doesn’t work out. In that case, you contact Veeam support. That’s what I did, as I knew well what the issue was and I could see this reflected in the logs. But I did not know how to handle this one.

We will look at the logs on the VBR Server, the repository where the backup files of the VM live, and the Hyper-V node where we restore the VM to investigate this issue.

The VBR log

Let’s look at the restore log of the virtual machine for which we perform instant recovery on the VBR server. We notice the following.

The actual restore phase of Instant Recovery leveraging the 1Gbps default host management network

The repository logs

These are the logs of our repository or extent where the restore reads the backup data from. There are actually 2 logs. One is the mount log and the other is the restore log.

We first dive into the Agent.IR.DidierTest08.Mount.Backup-Side.log of our test VM instant recovery. Here we can see connections from the Hyper-V server node, where we instant recover the test VM, over the preferred network. Note that it is the Hyper-V server node that acts as the client!

Agent.IR.DidierTest08.Mount.Backup-Side.log

Let’s now parse the Agent.IR.DidierTest08.Restore.Backup-Side.log of our test VM instant recovery. No matter how hard we look we cannot find any connection attempt, let alone a connection, to a preferred network (10.10.110.0/24). We do see the restore work over the default management network (10.18.0.0/16). Also note that here it is the repository node that connects to the Hyper-V node (10.18.230.5), so the repository acts as the client now.

Agent.IR.DidierTest08.Restore.Backup-Side.log
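
If you want to speed up this kind of log reading, a small script can tally which network the addresses in a log belong to. The following is only a minimal sketch in Python. It assumes nothing about the actual layout of the Veeam log lines and simply extracts anything that looks like an IPv4 address. The two subnets are the ones from this post, the function names are my own, and the sample lines are made up for illustration.

```python
import ipaddress
import re
from collections import Counter

# Subnets taken from this post; adjust them to your own environment.
PREFERRED = ipaddress.ip_network("10.10.110.0/24")
MANAGEMENT = ipaddress.ip_network("10.18.0.0/16")

# Loose IPv4 matcher; the real Veeam log line layout is not assumed here.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def classify(address: str) -> str:
    """Label an IPv4 address as preferred, management, other, or invalid."""
    try:
        ip = ipaddress.ip_address(address)
    except ValueError:
        return "invalid"
    if ip in PREFERRED:
        return "preferred"
    if ip in MANAGEMENT:
        return "management"
    return "other"


def count_networks(lines):
    """Count how often each network class shows up across the given lines."""
    counts = Counter()
    for line in lines:
        for match in IPV4.findall(line):
            counts[classify(match)] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample lines, not copied from a real Veeam log.
    sample = [
        "connecting to 10.18.230.5:2500",
        "accepted connection from 10.10.110.211",
    ]
    print(dict(count_networks(sample)))
```

To run it against a real file, pass an open file handle to `count_networks`. A restore log that only reports "management" hits is a quick hint that the preferred network is not in use. Note that version strings with four dotted numbers will be counted as "other", so treat the tally as a pointer and not as proof.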

The restore target log

This is the Hyper-V server to where we restore the virtual machine. There are multiple logs but we are most interested in the mount log and the restore log.

We first dive into the Agent.IR.DidierTest08.Mount.HyperV-Side.log of our test VM instant recovery. The mount log shows what we already know. It also shows that it is the Hyper-V server that initiates the connection. This does leverage the preferred network (10.10.110.0/24).

Agent.IR.DidierTest08.Mount.HyperV-Side.log

It also shows the mount phase does leverage the preferred network (10.10.110.0/24).

Agent.IR.DidierTest08.Mount.HyperV-Side.log

But when we look at the restore phase log Agent.IR.DidierTest08.Restore.HyperV-Side.log we again see that the default host management network is used instead of the preferred network.

Agent.IR.DidierTest08.Restore.HyperV-Side.log

Again, we see that the Hyper-V server node during the restore phase acts as the server while the repository is the client (10.18.217.5).

Summary of our findings

Based on our observations on the servers (networks used) and investigating the logs we conclude the following. During an instant recovery, the VM is mounted on the Hyper-V host (where the checkpoint is taken). During the mount phase, the Hyper-V host acts as the client, while the repository acts as the server. This leverages the preferred network. During the restore phase, however, the repository acts as the client and connects to the Hyper-V host, which acts as the server. This does not leverage the preferred network.

This indicates that the solution might lie in reversing the client/server direction for the restore phase of Instant Recovery. But how? Well, there is a setting in Veeam where we can do just this.

Make Veeam Instant Recovery use a preferred network

I have to thank the Veeam support engineer that worked on this with me. He investigated the logs that I sent him as well but with more insight than I have. Those were clean logs just showing reproductions of the issue in combination with a Camtasia Video of it all. That way I showed him what was happening and what I saw while he also had the matching logs to what he was looking at.

Sure enough, he came back with a fix or workaround if you like. To make sure instant recovery leverages the preferred network we needed to do the following. On each extent, in properties, under credentials go to network settings and check “Run server on this side” under “Preferred TCP connection role”.

On each extent, in properties, under credentials go to network settings and check “Run server on this side” under “Preferred TCP connection role”.

The “normal” use cases are, for example, when the repository FQDN resolves to several IP addresses and the Hyper-V FQDN resolves to only one. This was not the case in our setup, as the preferred networks are not registered in DNS. But here, leveraging the capability to set “Run server on this side” solves our issue as well.
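
To check how many addresses a given FQDN actually resolves to in your environment, you can ask the resolver directly. This is a small Python sketch using the standard `socket.getaddrinfo` call. The helper names are my own, and the host name in the usage note is just a placeholder.

```python
import socket


def unique_addresses(addrinfo):
    """Return the distinct IP addresses from socket.getaddrinfo() results, in order."""
    seen = []
    for _family, _type, _proto, _canonname, sockaddr in addrinfo:
        address = sockaddr[0]  # sockaddr is (ip, port) or (ip, port, flow, scope)
        if address not in seen:
            seen.append(address)
    return seen


def resolve(hostname):
    """Resolve a host name and return every distinct address the resolver returns."""
    return unique_addresses(socket.getaddrinfo(hostname, None))
```

Calling `resolve("repository.example.local")` (a hypothetical name) on the VBR server and on the Hyper-V nodes shows whether a host is known under one or several addresses from that machine's point of view.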

Parse the logs with “Run server on this side” enabled on the repository/extents

When we start a clean test and rerun an Instant Recovery of our test VM we now see that the restore phase does leverage the preferred network. The “Run server on this side” setting is also reflected in the restore phase logs on both the repository and the backup server.

Agent.IR.DidierTest08.Restore.Backup-Side.log. The Hyper-V server, acting as the client (10.10.110.211), connects to the server, which is the repository.
Agent.IR.DidierTest08.Restore.HyperV-Side.log. The server is now indeed the repository (10.10.110.14) and the client is the Hyper-V server node.

In the VBR log itself, we notice the “Run server on this side” has indeed been enabled.

Host ‘REPOSITORYSERVER’ should be server, reversing connection

IR.DidierTest08.Mount.log

In the Agent.IR.DidierTest08.Restore.Backup-Side.log on the repository server, we also see this setting reflected.

Agent.IR.DidierTest08.Restore.Backup-Side.log

Based on the documentation about “Run server on this side” in https://helpcenter.veeam.com/docs/backup/hyperv/hv_server_credentials.html?ver=95u4 you would assume this is only needed in scenarios where NAT is in play. But this doesn’t cover all use cases. Enabling this checkbox on a server means it does not initiate the connection but waits for the incoming connection from its partner. In our case that also causes the preferred network to be picked up. Apparently, all that is needed is to make sure the Hyper-V hosts we restore to act as the client and initiate the connection to the server, which is our repository or the extents in a SOBR.
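
The difference between the two TCP roles can be shown with plain sockets. This Python sketch is generic TCP and says nothing about how Veeam implements it internally. It runs a listener and a client on the loopback address, with function names of my own choosing. The point it illustrates is that the server side only waits, while the connecting side is the one that picks the destination address to dial.

```python
import socket
import threading


def run_server(listener, result):
    """Wait for one incoming connection and record who connected."""
    conn, peer = listener.accept()
    with conn:
        result["peer"] = peer[0]
        conn.sendall(b"hello from server")


def demo(server_ip="127.0.0.1"):
    """Start a listener on server_ip, connect to it, and return both views."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind((server_ip, 0))  # port 0 lets the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]

    result = {}
    worker = threading.Thread(target=run_server, args=(listener, result))
    worker.start()

    # The client decides which destination address to dial.
    with socket.create_connection((server_ip, port), timeout=5) as client:
        result["client_local"] = client.getsockname()[0]
        result["reply"] = client.recv(1024)

    worker.join(timeout=5)
    listener.close()
    return result


if __name__ == "__main__":
    print(demo())
```

Swapping which machine runs the listener changes which side gets to choose the target address. That is the same reversal of roles that “Run server on this side” forces between the repository and the Hyper-V host.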

Conclusion

We achieved a successful result. Our instant recoveries now also leverage the preferred network. In this use case, this is really important as multiple concurrent instant recoveries are part of the recovery plan. That’s why we have performant storage solutions for our backup and source in combination with high bandwidth on a capable network. In the end, it all worked out well with a minor tweak to make Veeam Instant Recovery use a preferred network. This was however unexpected. I hope that Veeam dives into this issue and sees if the logic can be improved in future updates to make this tweak unnecessary. If I ever hear any feedback on this I will let you know.

Protecting your Veeam Backup and Replication Server is critical

Introduction

In this blog, we will demonstrate one of the things that can go wrong when someone gets a hold of your Veeam Backup & Replication server administrative credentials. They can do more than “just” delete all your backups, replicas, etc. When they can log on to the Veeam Backup & Replication Server itself they can also grab all the credentials from the Veeam configuration database. Those credentials normally have privileges that you do not want to fall into the wrong hands. These are quite literally the keys to the kingdom. Hence, protecting your Veeam Backup & Replication Server is critical.

Protecting your Veeam Backup & Replication Server is critical

Security is not about one feature, technology or action. It takes a more holistic approach. It requires physical security to start with. You also need to adhere to the principles of least privilege rigorously. All this while locking down access, reducing the attack surface, leveraging segmentation, etc.

A key element lies in prevention. You must avoid the harvesting of those credentials. For this reason, you absolutely must practice privileged credential hygiene. Today you also want to leverage multi-factor authentication in order to protect access even better. All this, and more, prevents unauthorized access in the first place. Even when one measure fails. Read Veeam Backup & Replication 9.5 Update 3 — Infrastructure Hardening for more details on this.

Add MFA to protect your credentials from being abused when compromised

Veeam Backup & Replication itself requires credentials to do its work of protecting data and workloads. Access to servers, proxies, repositories, interaction with virtual machines, etc. cannot happen without such credentials. Veeam encrypts the passwords of these users via strong encryption. They use the Microsoft CryptoAPI (FIPS certified) with the machine-specific encryption key for this.

As a side note, you might have seen the big fuss around the critical vulnerability in January 2020 regarding CryptoAPI. This is a reminder of why you need to keep your systems patched.

CryptoAPI

The machine-specific key ensures that decrypting those passwords fails on any host other than the one where they were encrypted. This means that even if someone steals the configuration database, or in some way, shape or form gets a hold of the encrypted passwords in the database, they cannot decrypt them elsewhere. This is an industry-standard approach and quite safe. What you need to know is that when someone gains access to your machine with local administrative rights, all bets are off.
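
To make the idea of a machine-specific key more tangible, here is a toy Python illustration. It is emphatically not CryptoAPI and not what Veeam does under the hood. It is a home-made construction built on `hashlib` and `hmac`, meant only to show the principle, and it must not be used to protect real secrets. The `machine_secret` value and the function names are assumptions of mine that stand in for whatever the operating system ties to a single host.

```python
import hashlib
import hmac
import os


def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    """Expand key and nonce into a pseudo-random byte stream (toy construction)."""
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(4, "big")).digest()
        counter += 1
    return out[:length]


def protect(plaintext: bytes, machine_secret: bytes) -> bytes:
    """Encrypt and tag data with a key derived from a machine-specific secret."""
    key = hashlib.sha256(b"demo-key" + machine_secret).digest()
    nonce = os.urandom(16)
    stream = _keystream(key, nonce, len(plaintext))
    body = bytes(a ^ b for a, b in zip(plaintext, stream))
    tag = hmac.new(key, nonce + body, hashlib.sha256).digest()
    return nonce + tag + body


def unprotect(blob: bytes, machine_secret: bytes) -> bytes:
    """Decrypt data; raises ValueError when the machine secret does not match."""
    key = hashlib.sha256(b"demo-key" + machine_secret).digest()
    nonce, tag, body = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(key, nonce + body, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("key not valid on this machine")
    stream = _keystream(key, nonce, len(body))
    return bytes(a ^ b for a, b in zip(body, stream))
```

A blob created with `protect(b"secret", b"host-A")` decrypts fine with `unprotect(blob, b"host-A")`, while `unprotect(blob, b"host-B")` raises an error. The flip side is just as clear: any code that runs on host A with access to that host's secret can decrypt everything.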

What can happen?

The moment an attacker logs on to the Veeam Backup & Replication server with administrative rights, it is game over.

They will be able to grant themselves access to SQL Server and query it for the credentials. With that information, all they need to do is load and use a Veeam DLL to decrypt them. When this runs on the server where you encrypted them, this will succeed. If anyone gets hold of the encrypted passwords and tries to decrypt them on another host, this will fail, as that host has the wrong machine-specific key.

Let me emphasize once more that this is not an insecure implementation by Veeam. When you store encrypted passwords for a service, that service must be able to decrypt them. Otherwise, it can never use them. You cannot get the passwords via the GUI or the Veeam PowerShell commands. But via code, this is quite possible.

Sample code to prove that protecting your Veeam Backup & Replication Server is critical

I assembled a little PowerShell script that grabs the data from the Veeam configuration database. I filter out the entries whose password is an empty string. We loop through the ones that remain and decrypt the passwords. In the end, I decided not to post the script as it might help people with bad intentions. I know it won’t stop bad actors cold in their tracks and maybe I will update this post later. But for now, I did not include it.

Sorry, right now you can only see the output below from an example VBR Server

In the screenshot below you can see the results. This is a demo lab with demo credentials, so no worries about showing this to you. Remember that you can only decrypt the password on the Veeam Backup & Replication server where you encrypted them.

There they are, the users with the encrypted and decrypted passwords

To prove a point we will grab the encrypted passwords and try to decrypt them on another VBR server we have around. This fails with the error: Exception calling “GetLocalString” with “1” argument(s): “Key not valid for use in specified state.”

No matter what encrypted password you try to decrypt on another host, it will fail as you don’t have the correct machine-specific key.

As you can see even if you get a hold of the encrypted strings they cannot be decrypted on another machine. You must do this on the machine that encrypted them.

Conclusion

While this might be a shock to some when they first learn of it, it is not a gaping security hole. It just shows you that security is more than encryption. It takes multiple measures on multiple levels to protect assets. I repeat, protecting your Veeam Backup & Replication Server is critical. For many people, this is indeed an eye-opener. The lesson is that you must protect your assets adequately. Do not bank on one feature to hold off any and all threats by itself. That is asking for the impossible.

I do hope that all Veeam software itself will also support MFA in the future. That would also help protect access via the Veeam Backup & Replication console.

I am a Veeam Vanguard 2020

In these somewhat disconcerting times (Corona & Covid-19, for those reading this years from now) good news still happens. Early this evening I got an e-mail from Nikola Pejková. It stated that my nomination for Veeam Vanguard 2020 was approved.

The Veeam Vanguard logo and program

This, my dear readers, puts a smile on my face and makes me happy. It is a great program to be in. We share experiences, knowledge and learn with and from each other. This translates into better insights, designs, and implementations of Veeam solutions. Not only for ourselves, but also for our employers, customers and the community at large. With them, we share all this knowledge. This happens via blogging, user groups, speaking at conferences, webinars, webcasts, showcasts, podcasts, etc.

What does this mean?

We get access to key Veeam personnel who share their extensive insights with us. Their names read like a list of the top 25 people in the backup world today. Danny Allan, CTO and SVP at Veeam (Executive sponsor of the Vanguard program) is one of them. Then we have Anthony Spiteri, Michael Cade, David Hill, Karinne Bessette, Melissa Palmer, Rick Vanover, Technologist staff at Veeam. They, together with Kirsten Stoner, Dmitry Kniazev, Andrew Zhelezko, Nikola Pejková, Technical Analyst staff at Veeam get the honor of herding us cats. Last but not least at all are Anton Gostev, Senior Vice President, Product Management and Mike Resseler, Director, Product Management.

Anton Gostev sharing his insights with us at the Veeam Vanguard Summit 2019

On top of that, we are invited to join the Veeam Vanguard Summit, where we spend some intensive days in briefings and discussions with that team. It is quite an experience to be there. First and foremost, I am both humbled and proud to get this opportunity again this year. The chance to be part of all that comes with the fact that I am a Veeam Vanguard 2020. Second, the opportunity to once more pick the brains of the best and provide feedback on how we see, experience and use Veeam solutions is priceless.

Finally, thank you for the opportunity, thanks for the trust and I am looking forward to working with Veeam and my fellow Vanguards for another year. I hope to see you all in good health this year!

The Coronavirus and Covid-19

I guess the Coronavirus and Covid-19 do not need an introduction anymore. The virus and the illness it brings are literally the enemies at the gates right now.

The Coronavirus (Photo by CDC on Unsplash)

We need to muster all resources to defeat this enemy and return to our normal lives. But normal should mean better lives for us all. A life where our systems work for all of us and not just to enrich the happy few. What that leads to is very clear, and it saddens me to see that happen. Now many hope this will happen after we have dealt with all this, but I am not that optimistic. I see some of our so-called top politicians slinging mud like the perfect narcissistic egos they are instead of focusing on the mission. Sad people, but if they act like this now, it holds little promise for the post-Corona crisis era.

The good news is that scientists, medical professionals, and the first line responders, as well as distribution, utility and sanitary services, are holding the line. Not only do they show more statesmanship than some politicians, but they also come into work and get at it. Many of them in low paying so-called essential or much needed but unimportant and undervalued jobs. But right now the focus is on fighting the Coronavirus and Covid-19. At least it is with all people who have common sense.

So what needs to be done?

Listen to the experts. Do your part. Act locally, think globally. I see people helping each other, self-isolating, practicing social distancing, taking care of the children and elderly while telecommuting from home. A friend of mine is working on protein folding and trying to predict future mutations in order to see how they affect antibodies already found. Doctors and nurses taking care of my elderly mother keep coming to the house when needed, despite the fact that they had to remove the medical markings from their cars to prevent theft of their medical masks and gloves. But the good so far is way more prevalent than the bad.

What do I do?

Personally, I heed all warnings and follow the advice of the quiet professionals in our medical and scientific community. I do not know any better just because I’m an IT architect who accidentally once in his life graduated as a Master in Biology (Zoology, Biochemistry, and Physiology) and did a stint as a research assistant. You might be smart and good, but this is not our area of expertise. And given the issue at hand, you should listen to those that master this field right now. That’s the WHO, the medical and the scientific community.

I am the caretaker of my elderly mother. It’s evident that I keep doing that. That means I self-isolate strictly and work from home 100% to minimize risks to her and my loved ones.

I have increased monitoring and alertness for any kind of IT or security problem. My colleagues are doing that as well and handling the calls from many people less knowledgeable in IT matters that are struggling a bit to work from home. We are used to telecommuting and our operations have not missed a beat. We can deploy extra resources and having everything built redundantly gives a bit more peace of mind nowadays. The least we can do is not to make matters worse. We are at work spread out over the day so we have each other’s back and can help out. If all is going well we just keep working on other projects or tasks.

What did we lose?

Material stuff and experience. Conferences and travels got canceled. Money got lost. Spirits got a punch in the nose (for me missing the MVP Summit and quite possibly my vacation). However, that also means businesses and jobs in danger or already lost. But in comparison to the sick and deceased, nothing that we can’t cope with or fix. If you have your health and can work, be happy about that and carry on.

A small helping hand

I also started helping my protein folding friends out with Folding@home. We stand together alone in our houses and places of work. But while in social isolation we collaborate on many fronts. Do your part where you can, but please leave the cloud resources alone. They are in short supply as we speak. For us, this is a time where our CAPEX model proves to be cheaper and more resilient than our OPEX model (yes we have both).

My workstation works on folding proteins while I relax
Putting some GPU resources to work fighting Corona

Last but not least I put a bear in front of our window for the kids who take a walk to spot and get happy about.

An action to help kids have fun and smile during the Coronavirus measures

Do your part. No matter how small it might seem. Your kindness, your extra effort, that one extra thing you do might make the difference for someone or for many. Remember, no matter how bad it looks or how depressed you feel, you can always quit later. Just not now, keep going. Together we can beat the Coronavirus and Covid-19.