CBT DRIVER WITH Veeam Agent for Microsoft Windows 2.1

Change Block Tracking comes to physical & IAAS Veeam Backups

With the big improvements and new capabilities delivered in Veeam Backup & Replication 9.5 Update 3 there are some interesting capabilities and features related specifically to the Veeam Agent for Windows 2.1 Server Edition. We now get the ability to manage the Veeam Agent centrally from within the VBR 9.5 Update 3 console or PowerShell. This includes deploying the new Change Block Tracking (CBT) driver for Windows Server (not Linux).

This CBT driver is optional and works like you have come to expect from Veeam VBR when backing up Hyper-V virtual machines pre-Windows Server 2016. Windows Server 2016 now has its own CBT capabilities that Veeam VBR 9.5 leverages. The big thing here is that you now get CBT capabilities for physical or virtual in-guest workloads (that includes IAAS, people!) with Veeam Agent for Windows 2.1 Server Edition.

Deploying the Veeam Agent for Microsoft Windows 2.1 CBT Driver

The Veeam Agent for Windows 2.1 ships with an optional, signed change block tracking filter driver for Windows servers. That agent is included in your VBR 9.5 Update 3 download or you can choose to download an update that does not have the CBT driver included. That’s up to you. I just upgraded my lab and production environment with the agent included as I might have a use for them. If not now, then later and at least my environment is ready for that.

clip_image001

When you have installed VAW 2.1 you can navigate to C:\Program Files\Veeam\Endpoint Backup\CBTDriver and find the driver files there for the supported Windows Server OS versions under their respective folders.

clip_image003

As you can see in the screenshot above we have CBT drivers for any version of Windows Server back to Windows Server 2008 R2. If you are running anything older we really need to talk about your environment in 2018. I mean it.

Note that right clicking the .inf file for your version of Windows Server and selecting Install is the most manual way of installing the CBT driver. You’ll need to reboot the host.

clip_image005

Normally you’ll either integrate the deployment and updating of the CBT driver into the VBR 9.5 Update 3 console or you’ll deploy and update the CBT driver manually.

Install / uninstall the CBT driver via Veeam Backup & Replication Console

You can add servers individually or as part of a protection group (Active Directory based). Whatever option you choose, you’ll have the option of managing them via the agent manually or via the VBR server. Once you have done that you can deploy and update the optional CBT driver for supported Windows Server versions via the individual servers or the protection groups.

Individual Server

clip_image007

Once the agent is installed you can optionally install the CBT driver. When that’s done you can also uninstall the CBT driver and the agent from the VBR console.

clip_image009

Protection Group

You can add servers to protect via VAW 2.1 individually, via Active Directory (domain, organizational unit, container, computer, cluster or a group) or via a CSV file with server names or IP addresses. That’s another subject actually but you get the gist of what a protection group is.

clip_image011

Checking the CBT driver version

You can always check the CBT driver version via the details of a server added to the physical or cloud infrastructure.

clip_image012

Install / uninstall the CBT driver via the standalone Veeam Agent for Windows

My workstation at home isn’t managed by a Veeam Backup & Replication v9.5 Update 3 server. It’s a standalone system. But it does run Windows Server 2016. Now, even though such a standalone system can send its backups to a Veeam repository, I don’t do that at home. The target is a local disk in a disk bay that I can easily swap out every week. I just rotate through a couple of repurposed larger HDDs for this purpose and this also allows me to take a backup copy off site. The Veeam Agent for Windows configuration for my home office workstation is done locally, including the installation of the CBT driver. Doing so is easy. Under Settings in VAW 2.1 we now have a third entry that’s there to install the CBT driver if we want to.

clip_image014

When you click Install it will be done before you can even blink. It will prompt you to restart the computer to finish installing the driver. Do so. If not, the next backup will complain about failing over to MFT analysis based incremental backups as you can’t use the installed CBT driver yet.

clip_image016

When using the new VAW 2.1 CBT driver for Windows, changes get tracked in VCT files. These can be seen under C:\ProgramData\Veeam\EndpointData\CtStore\VctStore.

clip_image018
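If you would rather check for those change tracking files from a script than from Explorer, a minimal sketch could look like the one below. The folder path is the one mentioned above; the function name `list_tracking_files` is my own, and the script simply lists whatever files it finds there without assuming anything about their names or extension. Reading under ProgramData may require an elevated prompt.

```python
from pathlib import Path

# Path taken from the text above; adjust it if your installation differs.
VCT_STORE = Path(r"C:\ProgramData\Veeam\EndpointData\CtStore\VctStore")


def list_tracking_files(store: Path = VCT_STORE) -> list[tuple[str, int]]:
    """Return (file name, size in bytes) for every file in the store, sorted by name.

    Returns an empty list when the folder does not exist, for example when
    the CBT driver is not installed or has not tracked anything yet.
    """
    if not store.is_dir():
        return []
    return sorted(
        (entry.name, entry.stat().st_size)
        for entry in store.iterdir()
        if entry.is_file()
    )


if __name__ == "__main__":
    for name, size in list_tracking_files():
        print(f"{name}\t{size} bytes")
```

An empty result after a reboot and a backup run would be a hint that the driver is not tracking changes on that machine.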

Ready to Go

I’ll compare the results of backing up my main workhorse with and without the CBT driver installed. Veeam indicates the use case is for servers with a lot of data churn and that’s where you should use them. The idea is that you don’t need to deal with updating the drivers when the benefits are not there. That’s fair enough I’d say but I’m going to experiment a little with them anyway to see what difference I can notice without resorting to a microscope.

If we conclude that having the CBT driver installed is not worthwhile for our workstation we can easily uninstall it again via the control panel, under Settings, where we now see the option to uninstall it. Easy enough.

clip_image020

However, as it can track changes in NTFS as well as ReFS and FAT partitions it might be wise to use it for those servers that have one or more of such volumes, even when for NTFS volumes the speed difference isn’t that significant. Normally, the bigger the data churn delta, the bigger the benefits of the CBT driver will be.

Frustrations about host level backups of Hyper-V guest clusters with Windows Server 2016

Introduction

With Windows Server 2016 came the hope and promise of improved backups for Hyper-V environments. And indeed Microsoft delivered on that and has given us faster, more scalable and more reliable backups. With VHD sets also came the promise of host based backups for guest clusters.

The problem is that this promise or, as it is perhaps better to be mild and careful, that expectation has not been met. Decent, robust host based backups of guest clusters in Windows Server 2016 are still not a reality. For me this means it blocked a few scenarios and we’re working on alternatives. This is a missed opportunity I think for MSFT to excel at virtualization.

The problem

Doing host based backup of guest clusters with VHD Set disks is supported in Windows Server 2016 under certain conditions.

At RTM it became clear that CSV inside the guest cluster was not supported.

You need a healthy cluster with all disks online.

These requirements are reflected in “Errors discovered during backup of VHDS in guest clusters”:

Error code: ‘32768’. Failed to create checkpoint on collection ‘Hyper-V Collection’

Reason: We failed to query the cluster service inside the Guest VM. Check that cluster feature is installed and running.

Error code: ‘32770’. Active-active access is not supported for the shared VHDX in VM group

Reason: The VHD Set disk is used as a Cluster Shared Volume. This cannot be checkpointed

Error code: ‘32775’. More than one VM claimed to be the owner of shared VHDX in VM group ‘Hyper-V Collection’

Reason: Actually we test if the VHDS is used by exactly one owner. So having 0 owners also creates this error. The reason was that the shared drive was offline in the guest cluster.
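When you are digging through backup job logs it can help to have these three codes at hand. The sketch below is nothing more than the list above turned into a lookup table; the names `CHECKPOINT_ERRORS` and `explain` are my own and not part of any Hyper-V or backup product API.

```python
# Error codes and reasons copied from the list above.
CHECKPOINT_ERRORS = {
    32768: (
        "Failed to query the cluster service inside the guest VM. "
        "Check that the cluster feature is installed and running."
    ),
    32770: (
        "The VHD Set disk is used as a Cluster Shared Volume. "
        "This cannot be checkpointed."
    ),
    32775: (
        "The VHD Set must be used by exactly one owner. Zero owners, "
        "for example when the shared drive is offline in the guest "
        "cluster, also creates this error."
    ),
}


def explain(error_code: int) -> str:
    """Return the documented reason for a checkpoint error code."""
    return CHECKPOINT_ERRORS.get(
        error_code, "Unknown error code, not covered by the list above."
    )


if __name__ == "__main__":
    print(explain(32770))
```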

Unfortunately, these are not the only problems people are facing. Quite often the backup software doesn’t support backing up VHD Sets, or when it does, the backups fail. Some of those failures, like being unable to checkpoint the VHD Set, have been addressed via Windows Updates. But there are other issues.

Let’s look at the two most common ones.

Issue 1

You can make one backup and all subsequent backups fail. This is due to the avhdx files being in use and locked. This means that as long as the cluster is up and running the recovery checkpoint chain keeps growing. This can be “cleaned” or merged but only by taking down the cluster.

At the first backup life seems good.

image

The recovery checkpoint as a collection is indeed working.

clip_image003

All attempts at another backup fail.

clip_image005

Shutting down all cluster VMs and starting them up again does merge the recovery checkpoints.

Issue 2

You can make backups successfully, but the recovery checkpoints never get merged.

clip_image007

This sounds “better” but it isn’t. There is no way to merge the checkpoint. Manually merging the checkpoints of a VHD Set is bad voodoo.

Both situations get you into problems and I have found no solution so far. At the time of writing I’m back at the “never ending” recovery checkpoint chain situation. But that can change back to the 1st issue I guess. Sigh.

I have found no solution so far

For now I have been unable to solve these problems. There is no fix or even a workaround. The only way to get out of this stalemate is to shut down every node of the guest clusters and then restart them all. Just a restart of the guest nodes of the cluster doesn’t do the trick of releasing the checkpoint files and merging them. While this allows you to take one backup successfully again, the problem returns immediately. For your reference, that was my issue with the October 2017 CU (KB)

The other scenario we run into is that the backups do work but the recovery checkpoints never ever merge. Not even when you shut down all the guest cluster VM nodes and start them again. With frequent backups that turns into a disaster of a never ending chain of recovery checkpoints. This is actually the situation I was in again after the November 2017 updates on both guests & hosts (KB4049065: Update for Windows Server 2016 for x64-based Systems and KB4048953: 2017-11 Cumulative Update for Windows Server 2016 for x64-based Systems).

To me this situation is blocking the use of guest clustering with VHD Sets where a backup is required. For many reasons we do not wish to go the route of iSCSI or vFC to the guest. That doesn’t cut it for us.

Conclusion

Host level backups of guest clusters in Windows Server 2016 are still a no go. This despite the good hopes we had that VHD Sets, which we were eagerly awaiting, would address this limitation. For many of us this is a show stopper for the successful virtualization of guest clusters. Every month we try again and we’re not getting anywhere. Hence the frustration and the disappointment.

More than 1 year after Windows Server 2016 RTM we still cannot do consistent host level backups of a Hyper-V guest cluster, not even of those without CSV that only use standard clustered disks. Trust me on the fact that many of us have given this feedback to Microsoft. They know and I suggest you keep voicing your concerns to them in order to keep it on their radar screen and higher on the priority list. You can do this by opening support calls and by asking for it on user voice. Please Microsoft, we need these workloads to be first class citizens. I’m clearly not the only unhappy camper out there, as is noticeable in various support forums: “Cannot create checkpoint when shared vhdset (.vhds) is used by VM – ‘not part of a checkpoint collection’ error” and “Backing up a Windows Failover Cluster with Shared vhdx?”

Veeam Vanguard nominations are now open for 2018!

Just a quick blog post on the Veeam Vanguard program. The nominations for 2018 are open! That means that if you know people who would make a Veeam Vanguard you can nominate them. You can even nominate yourself, that’s perfectly fine. It’s not frowned upon, but it also doesn’t change anything in terms of evaluation for the program.

veeamvanguardnewlogo

Rick blogged on this yesterday on the Veeam blog in “Veeam Vanguard nominations are now open for 2018!” and gave some more insight into what the program is, what it tries to achieve and what it does. He also discusses the selection. The key take-away is that you cannot study for this and that it is not some kind of certification or such. Some of the current Vanguards were quoted on how they look at the program and one thing is constant in that: the people in these programs are contributors to the global tech community and it’s about sharing and helping others get the best out of their environment and their investment in Veeam. It also helps Veeam as they get a very communicative group of people to give them feedback on their offerings, both products and services. It’s just one more tool that helps them get things right or fix things when they got it wrong. Likewise, understanding Veeam and their products better helps us make better decisions on their design, implementation and operation.

You can have a look at the current lineup of Veeam Vanguards over here.

clip_image004

You’ll find a short video on the program on that page as well. So go meet the Vanguards and find their blog, their communities and follow @VeeamVanguard and the hash tag #VeeamVanguard to see what’s going on.

clip_image006

So, people, this is the moment if you want to nominate someone or yourself to join the Veeam Vanguards in 2018. You have time until December 29th 2017 to do so. I have always felt honored to be selected and have fond memories of the events I was able to go to, and to this day I’m happy to be active in the Veeam Vanguard ecosystem. It’s a fine group of professionals in a program of a great company.

Testing Azure File Sync

Introduction

Since Azure File Sync is now out in public preview I can chat a bit about why I am interested in it. My interest is big enough for me to have tested it in private previews even when managers weren’t really interested in doing so yet. It was too early for them and it certainly didn’t help that, in many scenarios, it involves “hardware” in some way or another. In a cloud first world that’s bad by default, even if that’s only a hypervisor host and storage.

But part of my value is providing solutions that serve the clear and present needs of the business today and tomorrow. All this without burdening them with a mortgage on their future capabilities to address the needs they’ll have then. In that aspect of my role I can and have to disagree with managers every now and then. Call it a perk or a burden, but I do tell them what they need to hear, not what they want to hear. So yes, I was and I am testing Azure File Sync. You have to remember that what’s too early in technology today is tomorrow’s old news. To paraphrase Ferris Bueller, technology moves pretty fast. If you don’t stop and look around once in a while, you could miss it.

If you need to get up to speed on Azure File Sync fast, take a look at https://azure.microsoft.com/nl-nl/resources/videos/azure-friday-hybrid-storage-with-azure-file-sync-langhout/.

clip_image002

I’m not going to discuss all of the current or announced capabilities of Azure File Sync here. There is plenty of info out there and the Program Managers are very responsive and passionate when it comes to answering your detailed questions. There’s plenty more info in the Microsoft Ignite 2017 sessions: https://myignite.microsoft.com/sessions/54963 and https://myignite.microsoft.com/sessions/54931.

What drove me to Testing Azure File Sync

I was in general unhappy with the StorSimple offering and other 3rd party offerings. There, I’ve said it. It was too much of a hit or miss point solution, and some of them are really odd ones out in an otherwise streamlined IT environment. I’ll keep it at that. While there are still other players in the market, we have another concern. That is that on the cloud side of things we needed integration and capabilities that a 3rd party just wouldn’t have the leverage to achieve with Microsoft. The reality is that unless you are talking about cloud in combination with huge amounts of money you will not get their attention. It’s business, not personal. Let’s look at the major reasons for me to investigate Azure File Sync.

The need to access data from different locations in different ways

First of all, there is the need to access data from different locations in different ways. That is both on-premises and from Azure. In the cloud we access file shares via REST APIs and that works fine for well controlled and secured application environments and services written in and for the modern workplace.

When we start involving human beings that falls apart. The security model with roles and responsibilities doesn’t map well into an access token you paste into a mapped drive of an Azure file share. Then there is the humongous number of applications that cannot speak REST. They most often speak NTFS and SMB, which is ruled by ACLs. They are children of a client/server world and as such, when designed and written well, they shine on premises in a well-connected low latency world. Yes, that very long tail of classical applications …

clip_image004

Windows Applications and their long, long tail by TeamRGE.com as frequently found in RDS and Citrix VDI articles.

In cloud based RDS solutions and IAAS we also use SMB and NTFS/ACLs to consume data, so don’t assume everything in the cloud is pure REST and claims based security.

I won’t even go into the need to have ACLs on Azure File shares, the options and need for extending AD to Azure over a VPN, ADFS, ADFS Proxies, Azure AD Pass-through, Azure AD Sync, … the whole nine yards to cover all kinds of scenarios. Sure, I hear you, AD will go away, it doesn’t matter anymore. On a long enough time scale such statements are always true. But looking at the success of Office 365, I’m not howling with the crowd that e-mail is totally obsolete yet either. Long tails …

Anyway, when you need to consume the same data from different locations in different ways it often leads to partial or complete copies of data that might change in different places and need to be protected in different places. That’s a lot of overhead to deal with and consistency can become a serious issue. But with this we have touched on our second driver.

Backups, data protection & recovery

Yup, good old backups. You know, the thing too many of the mediocre IT managers out there think there is nothing to anymore, as they solved that problem over a decade ago. But time doesn’t stand still. Plain and simple. I sometimes deal with largish file server clusters leveraging SMB3 (and ODX when available). The LUNs for those file shares are generally anything between 8 TB and 20 TB, totaling 100 TB and upwards per cluster. There are dozens of LUNs on any cluster with tens of thousands of folders and hundreds of millions of files. A DFS namespace helps us keep a unified namespace for the consumers. Since Windows Server 2012, that’s ages ago, we have the technology to make this a breeze: SMB 3 with continuous availability, leveraging SMB3/ODX wherever we can and where it makes sense. Thank you, Microsoft!

One of the selling points for Azure File Sync is dealing with storage capacity issues. That’s the least of my worries actually. We have never struggled with a lack of storage on file servers. For crying out loud, we solved that problem well over a decade ago. Yes, even on premises. So that’s not our main motivation here.

What is still not covered well enough for us is backups. That number of files, not just the data volume, is a challenge. Not just the backup target storage for it and the infrastructure. Basically, that can be built performant enough and reasonably cost effective. The big challenge with those numbers is the time it takes to deal with backups.

We have had to resort to application consistent storage snapshots and replicas of those on campus and between cities to help deal with the fact that “traditional backup” doesn’t really scale well to those volumes. That’s great, but when data is valuable the 3-2-1 rule is set in stone. So, you need more. A lot more actually, perhaps, as today you might want to supplement it with extra rules to include air gapping and encryption. Got ransomware anyone?

Traditional backups with an agent on the physical host or in the guest are not cutting it. When you virtualize, you can leverage host based backups of entire VMs or split it up into host based backups of the OS, the data virtual disks etc. It’s a lot more effective to back up a number of 5/15/25 TB sized VHDX files than it is to do so with in-guest or in-server backups. Even more so when you combine this with modern change block tracking, synthetic full backups and health check/repair capabilities in backup products.

The challenge was that with true virtualized clusters, meaning shared VHDX, the backup story wasn’t a 1st class citizen. Which made people either not do it or use iSCSI or vFC to the guest and deal with the backups as if they were physical machines, losing any benefits associated with host based backups of VMs. Things have gotten better with Windows Server 2016 so we had two paths to investigate to deal with these challenges.

There are 2 approaches to data protection we are investigating

There are 2 approaches to data protection we are investigating and solutions might be a combination of them.

Virtualize it all and then back up the VM. In other words, leverage the power of host level backups with virtual machines. Now things have become a whole lot better for guest cluster backups with Windows Server 2016 but we are not quite there yet in order to call them 1st class citizens. The other concern is that the growth in data will still beat any progress we can make in speed and reliability with backups. Another concern is that people don’t want to deal with storage and are looking for storage as a service. The easiest and smartest way to achieve this is to look at public cloud offerings that deliver what’s needed and not take on the burden of managing 3rd parties to deliver this.

The first problem might be covered by Veeam Backup & Replication v10 that will allow for backing up file shares. As long as shared VHDX or VHD Sets are not full blown equal citizens in Hyper-V when it comes to backups we’ll look at another solution.

That’s fine, but what about pure volume and that 3-2-1 (multiple copies, different technologies, different locations) rule or better? The cost and overhead of a second site can be a challenge for some. Can Azure File Sync help there? It sure can. But the big bonus is that it is also a big step forward in helping with the need to access data from different locations in different ways.
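To make that 3-2-1 rule a bit more tangible, here is a minimal sketch of it as a check. I’m assuming the usual reading of the rule: at least 3 copies of the data, on at least 2 different technologies, with at least 1 copy in another location than the primary one. The names `DataCopy` and `satisfies_3_2_1` are purely illustrative, as are the example copies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    name: str
    technology: str  # e.g. "file server", "array snapshot", "cloud file share"
    location: str    # e.g. "Site A", "Site B", "Azure West Europe"


def satisfies_3_2_1(copies: list[DataCopy], primary_location: str) -> bool:
    """Check the 3-2-1 rule: 3+ copies, 2+ technologies, 1+ copy off site."""
    technologies = {c.technology for c in copies}
    off_site = [c for c in copies if c.location != primary_location]
    return len(copies) >= 3 and len(technologies) >= 2 and len(off_site) >= 1


if __name__ == "__main__":
    copies = [
        DataCopy("production data", "file server", "Site A"),
        DataCopy("replicated snapshot", "array snapshot", "Site B"),
        DataCopy("synced share", "cloud file share", "Azure West Europe"),
    ]
    print(satisfies_3_2_1(copies, primary_location="Site A"))
```

Note that a check like this says nothing about air gapping or about how independent the technologies really are, which is exactly the concern with a single-vendor solution.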

The problem that we might still have is that we are dependent on the same type of solution, even if it’s geo-replicated. Application consistent snapshots, storage array or Windows VSS based, that are geo replicated are a very important part of our protection against data loss. That has become even more obvious with the ever-growing threat of ransomware. In this threat model you try to separate access to data and backups both logically and physically as well as technology and operations wise. Full “air gapping” is the Valhalla here. Lacking that, you need at least serious delaying factors to prevent an attacker from gaining access to source data, snapshots on Windows or on storage arrays, backup targets and replicas of all the above.

We could actually use Azure File Sync as a primary or secondary backup target and combine the best of both worlds. Interesting huh!

The fact that we can have snapshots of Azure File Shares is important to us for data recovery. It’s also important to have those replicated in case the Azure Datacenter West Europe goes south, so to speak. The next question is whether our needs and wants are covered today in the preview.

Are all our needs and wants covered today in the preview?

The question we need to answer is whether it covers the needs we have well enough or better. Today Azure File Sync isn’t generally available and when it is it will be v1. So, things can and will improve.

Let’s forget about LUN size limitations (5TB) for the moment, or some of our other future capability and functionality requests on the wish list. I don’t need 64TB or bigger per se but I’d like to see at least 20 to 30TB LUNs supported. I’m not sure if Azure File Sync snapshots will be limited by VSS to 64TB or whether they break through that barrier like we can with certain hardware VSS providers.

Today we limit the size of LUNs for repair as well as data protection and recovery speeds or to minimize impact of a downed LUN. When the data is in the cloud and “only” a cache on premises these concerns might subside more and more. At Ignite 2017 Microsoft mentioned +/- 100TB sizes. So, there is an indication of that 64TB barrier being broken somehow.

The one thing we are not getting yet with Azure File Sync in preview is multiple copies on two different technologies in two different locations (remember that 3-2-1 rule). You are and remain dependent on the Microsoft solution and maybe adherence to that rule is built into their technological design, maybe it isn’t. I don’t know right now. We have LRS for now, but not GRS. Do note that things change rather fast in the cloud and what isn’t here today might be there tomorrow. GRS or ZRS is on the roadmap to be available when Azure File Sync goes into production.

clip_image006

Also, Microsoft realized that having snapshots in the same storage account was a risk. So, the backups will go to the recovery services vault in the future. It will also provide for long term data retention. This is important for archiving and authentic data that cannot be reproduced and as such has to be secured against altering.

Maybe you don’t trust MSFT or the cloud for 100%. If that’s the case and it’s a hard policy requirement of the business to have a protected data copy that is not dependent on the cloud provider we could propose a solution at a certain cost. You can use a file share cluster or, cheaper, a single file server in the datacenter (one location only) where everything is pinned locally (no cloud tiering) and which is used as a backup source to a backup target on different storage. The challenge there again is that you still have to deal with that massive volume of files yourself. Whether that’s worth the price/benefit versus the risks is not for me to decide. Yes, no, maybe so. I deal in technological solutions and I buy and build technological results for the needs of the organization. I do this in the context and in the landscape in which that business operates. I do not buy or build services, let alone journeys. That’s the part I do and I do it very well. In my opinion Azure File Sync will play a major part in all this.

Other things on the wish list are:

  • A global file lock mechanism and orchestration with an error handling model (Try, Catch, Finally) so automation in different sites can deal with locked files in the application logic, helping with the challenges of distributed access (read/write/modify) to data. I’m not asking to reinvent SharePoint here, don’t get me wrong, but this area remains a challenge. Both for the users of data as well as for those trying to solve those challenges in code. There are just so many applications working in different ways that perfection in this arena will not be of this world and any solution will require some tweaking and flexibility to cater to different needs. But I think that an effort here could make a real value proposition.
  • Bringing ACLs to Azure File Sync might enable new scenarios as well. It’s been on the wish list for a while and while MSFT is looking into it we don’t have it yet.
  • Support for data deduplication. Kind of speaks for itself and both customer & Microsoft would benefit.
  • I’d love to see support for ReFSv3 as well in time because I’m still hoping Microsoft will bring the benefits of ReFSv3 to many more use cases than today. I think they got my feedback on that one loud and clear already.
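To illustrate what I mean by a Try, Catch, Finally model for locked files in application logic, here is a minimal sketch. It is not an Azure File Sync feature or API. It uses a hypothetical sidecar “.lock” file on a single file system as a stand-in for the lock, and the names `update_with_lock` and `FileLockedError` are my own. A lock file like this does not work reliably across synced sites, which is exactly why a global mechanism is on the wish list.

```python
import os
import time


class FileLockedError(Exception):
    """Raised when the lock could not be acquired within the allowed attempts."""


def update_with_lock(path, update, attempts=3, delay=0.1):
    """Run update(path) while holding a sidecar lock file, retrying when locked."""
    lock_path = path + ".lock"  # hypothetical sidecar lock file
    for _ in range(attempts):
        try:
            # O_EXCL makes creation fail if the lock file already exists.
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
        except FileExistsError:
            # Catch: somebody else holds the lock, wait and try again.
            time.sleep(delay)
            continue
        try:
            update(path)
            return
        finally:
            # Finally: always release the lock, even when update() fails.
            os.close(fd)
            os.remove(lock_path)
    raise FileLockedError(f"{path} is still locked after {attempts} attempts")
```

The caller decides what to do with a FileLockedError: queue the work, alert someone or skip the file. That decision is the application specific part that no storage service can make for you.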

So much for some of the musings that led us to testing Azure File Sync. We’ll share more of our journey and become a bit more technical as we progress.