Windows Data Deduplication and Cluster Operating System Rolling Upgrades

Introduction

Have you considered when Windows data deduplication and cluster operating system rolling upgrades from Windows Server 2012 R2 to Windows Server 2016 Clusters are discussed we often hear people talk about Hyper-V or Scale Out File Server clusters, sometimes SQL but not very often for a General-Purpose File Share server with continuous availability. Which is the kind I’ve done quite a number of actually.

clip_image002

Being active in an industry that produces and consumes file data in large quantities and sizes we have been early implementers of Windows cluster for General-Purpose File Shares with continuous availability. This provides us the benefits of SMB 3, ODX for both the clients and IT Operations for workloads that are not suited for Scale Out File Server deployments.

As such we have dealt with a number of Windows 2012 R2 GPFS with GA cluster that we wanted to move to Windows Serer 2016. Partially to keep the environment up to date and partially because we want to leverage the new Windows deduplication capabilities that this OS version offers. The SOFS and Hyper clusters that I upgraded didn’t have data deduplication enabled.

The process to perform this upgrade is straight forward and has been documented well by others as well as by me in regards to issues we saw in the field . We even dove behind the scenes a bit in Cluster Operating System Rolling Upgrade Leaves Traces. I have also presented on this topic in public at conferences around Europe (Ireland, Germany and Belgium) as part of our community contributions. No surprises there.

Test your assumptions

This is a scenario you can perform without any downtime for your clients when all things go well. And normally it should. I have upgraded a couple of Scale Out File Server (SOFS) and General-Purpose File Server (GPFS) cluster with Continuous availability now and those went very well. Just make sure your cluster is perfectly healthy at the start.

Naturally there are some check you need to make that are outside of Microsoft scope:

I’m pretty sure you have good backups for your file data and you should check this works with Windows Server 2016 and how it reacts during the upgrade while the server is in mixed mode. Perhaps you will or won’t be able to run backups or restore data. Check and know this.

Verify your storage solution supports and words with Windows Server 2016. It sounds obvious but I have seen people forget such details.

Another point of attention is any Anti-Virus you might have running on the file server cluster nodes. Verify that this is fully supported on Windows Server 2016. On top of that validate that the Anti-Virus still works well with ODX so you don’t run into surprises there. Don’t assume anything.

Check if the server and it components (HBA, NICs, BIOS, …) its firmware and drivers support Windows Server 2016. Sure, the rolling upgrade allows for some testing before committing but that doesn’t mean you should go ahead blindly into the unknown.

Make sure your nodes are fully patched before and after the upgrade of a cluster node.

As the file server cluster is already leveraging SMB 3 with continuous availability al the prerequisites to make that work are already take care of. If you are upgrading a File server cluster without continuous availability and are planning to start using this, that’s another matter and you’ll need to address any issues. You can do this before or after moving to Windows Server 2016. This means you’d move to a solution before you upgrade or after you have performed the upgrade to Windows Server 2016.

You can take a look at my blogs on this subject from the Windows 2012 R2 time frame such as More Tips On Dealing With Removing Short File Names When Migrating To a SMB3 Transparent Failover File Server Cluster, Migrate an old file server to a transparent failover file server with continuous availability and SMB 3, ODX, Windows Server 2012 R2 & Windows 8.1 perform magic in file sharing for both corporate & branch offices

Data deduplication takes some extra consideration

I have blogged before on how Windows Server 2016 Data Deduplication performs and scales better than it did it Windows Server 2012 R2. This also means that it works at least partially different than it did Windows Server 2012 R2. You can see this in some of the updates that came out in regards to a data corruption bug with data deduplication which only affected Windows Server 2016.

clip_image003

Given this difference, what would happen if you fail over a LUN with deduplication enable from Windows Server 2012 to Windows Server 2016 and vice versa? That’s the question I had to consider when combining Windows data deduplication and cluster operating system rolling upgrades for the first time.

Windows Server 2016 is backward compatible and will work just fine with a LUN that from and Windows Server 2012 Server that has Windows data deduplication enabled. The reverse is not the case. Windows Server 2012 R2 is not forward compatible. When dealing with data deduplication in an Operating System Rolling Upgrade scenario I’m extra careful as I cannot guarantee any LUN movement scenario will go well. With a standalone server

Once I have failed over a LUN to Windows Server 2016 node in a mixed cluster I avoid moving it back to a Windows Server 2012 R2 node in that cluster. I only move them between Windows Server 2016 nodes when needed.

I move through the rolling upgrade as fast as I can to minimize the time frame in which a LUN with data deduplication could end up moving from a Windows Server 2016 o a Windows Server 2012 cluster node.

Should I need to reverse the Operating System Rolling Upgrade to end up with a Windows Server 2012 R2 cluster again I’ll make absolutely sure I can restore the data from LUNs with data deduplication from backup and/or a snapshot from a SAN or such. You cannot guarantee that this will work out fine. So be prepared.

For “standard” non deduplicated NTFS LUNs you can fail back if needed. When data deduplication is enabled you should try to avoid that and be prepared to restore data if needed.

Final advise is always the same

Even when you have tested your upgrade scenario and made sure your assumptions are correct you must have a way out. And as always, “One is none, two is one”.

As always during such endeavors you need to make sure that you have a roll back scenario in things do not work out. You must also have a fail back plan for when things turn really bad. For most scenarios has the ability to return to the original situation built in. But things can go wrong badly and Murphy’s Law does apply. So also have the backups and restore verified just in case.

The last thing you need after a failed upgrade is telling your customer or employer “it almost worked” but unfortunately, they’ve lost that 200TB of continuous available data. Better next time doesn’t really cut it.

Quick Fix Publish : VM won’t boot after October 2017 Updates for Windows Server 2016 and Windows 10 (KB4041691)

If you had WSUS (or SCCM) running tonight with auto approval on you might have woken up this morning to virtual machines that can boot anymore.

image

Great, another update gone wrong. Time to restore from backup as that can be the fasted way to restore services when in a pickle and if you have a good solutions for that in place. For the others you can do what I did is below. Actually a couple of us MVPs were on this issue at a number of sites as our fist task this morning. But first the root cause.

Well read this link Express update delivery ISV support and you have all you need. Basically the delta and the full cumulative update of October (KB4041691 – https://support.microsoft.com/en-us/help/4041691)  ended up in WSUS without you explicitly putting it there. That should not happen, normally the delta is not published for it to be downloaded and heaven forbid auto approved.  You could also have manually approved everything without really knowing what and why. Not a great idea at all.

image

So your VM get’s offered both of them and that is BAD!

image

Normally you get into this pickle if you some how managed to install both of these yourself or via other tools (see the link above), which you shouldn’t do.

Now if you don’t have decent restore capabilities from backups or snapshots there is another way out by removing the updates.

Boot into the problematic VM and select troubleshoot

image

Select to open the command prompt and stay away from any other auto repair options.

image

Microsoft advises to get rid of the SessionsPending reg key. To do so load the software registry hive as follows:

reg load hklm\temp c:\windows\system32\config\software

Delete the SessionsPending registry key, if it exists by running:

reg delete “HKLM\temp\Microsoft\Windows\CurrentVersion\Component Based Servicing\SessionsPending” /v Exclusive

Unload the software registry hive:

reg unload HKLM\temp

Run dism /image:c:\ /get-packages to find the updates installed that caused the issue

image

The yellow one are the ones of interest and you can see the first one never even got an install time/

We now use DISM to remove these updates.  Do first create the C:\Temp folder with MD temp if it doesn’t exist yet!

dism /image:c:\ /remove-package /packagename:myproblematicpackagetoremove /scratchdir:c:\temp

image

When done, close the command prompt, shut down the VM and then start it.

image

It will take a while but if will succeed and you’ll be greeted by a logon screen. Good luck!

Important: Do not try any other repair options or removing the updates with DISM might fail. We choose to remove all 3 updates from tonight to make sure. It might suffice to remove the delta one alone but we wanted to have an VM back as it was last night so more testing can be done before it is deployed again.

So, basically, don’t auto approve updates blindly, but test, validate & roll out in phases. Have great backup and TESTED restores. All by all we were only bitten in the lab, a couple of test/dev VMs and some of our infra VMs. Most of these are redundant and are patched stagger so our services were never badly effected. That gave us time to trouble shoot and investigate and warn our colleagues. As you can see here the issue was a delta update that made it into WSUS and was installed together with the full CU. Just manually downloading the CU and testing it would not have given you the heads up. About an issue. This is a reminder you need to test your real live situation and processes as realistically as possible. When you’re done with testing and cleaning up any fallout of this issue, make sure to patch your systems again!

Update: this also goes for Windows 10 Updates

Also see fellow MVP Mikael Nystrom blog post  https://deploymentbunny.com/2017/10/11/the-october-2017-update-inaccessible-boot-device/

Update: we now also have the official MSFT response & fix for each and every scenario right here https://support.microsoft.com/en-us/help/4049094/windows-devices-may-fail-to-boot-after-installing-october-10-version-o

Windows Server 2016 RDMA and the Hyper-V vSwitch – Part II

Introduction

In part I this article I demonstrated that some of the rules in regards to SMB Direct and the Hyper-V vSwitch as we know them for Windows Server 2012 R2 have changed with Windows Server 2016. We focused on the fact that you can expose RDMA to a vNIC exposed to the management OS created on a vSwitch. This means that while in Windows Server 2012 R2 you cannot expose RDMA capabilities via a vSwitch, even when you are using a non-teamed RDMA capable NIC, this is no longer true with Windows Server 2016.

While a demo with a vSwitch on a single NIC as we did in part I is nice it’s unlikely you’ll use this often if at all in the real world? Here we require redundancy and that means NIC teaming. To do so we normally use a vSwitch created on a native Windows NIC team. But a native NIC teaming does not expose RDMA capabilities. And as such a vSwitch created against a Windows native NIC team cannot leverage RDMA either. Which was the one of the reasons why a fully converged scenario in Windows Server 2012 R2 was too limited for many scenarios. Loss of RSS on the vNIC exposes to the management OS was another. The solution to this in Windows Server 2016 Hyper-V comes with Switch Embedded Teaming (SET). Now using SET in each and every situation might not be a good idea. It depends. But we do need to know how to configure it. So let’s dive in.

Switch Embedded Teaming (SET) exposes RDMA to the vSwitch

Switch Embedded Teaming (SET in Windows Server 2016 allows multiple identical (make, model, firmware, drivers to be supported) NICs to be used or “teamed” within the vSwitch itself. The important thing to note here this does not use windows NIC teaming or LBFO (Load Balancing and Fail Over).

SET is the future and is needed or use with the Network Controller and Software Defined Networking in Windows. SET can also be used without these technologies. While today it supports a good deal of the capabilities of native Windows NIC teaming it also lacks some of them. In general SET is meant for full or partial converged scenarios with 10GBps or better NICs, not 1Gbps networking in a (hyper)converged Hyper-V scenario.

Please see New Windows Server 2016 NIC and Switch Embedded Teaming User Guide for Download for more information as there is just too much to tell about it.

Setting it up

We start out with a 2-node cluster where each node has 2 RDMA NICs (Mellanox ConnectX-3) with RDMA enabled and DCB configured. Live migration of VMs between those nodes works over SMB Direct works. All NIC are on the same subnet 172.16.0.0/16 (thanks to Window Server 2016 Same Subnet Multichannel) and are on VLAN 110. In Failover Cluster Manager (FCM) that looks like below.

clip_image002

We’ll now use the rNICs to create a Switch Embedded Team.

clip_image004

Note that the teaming mode is switch independent, the only option supported with SET in Window Server 2016.

clip_image006

This also gives us a vNIC exposed to the management OS (default)

clip_image008

This is also visible as a vNIC in the mamagement OS called “vEthernet (RDMA-SET-vSwitch)”

clip_image010

This will be used to manage the host and to make its purpose clear we’ll rename it.

We’ll create 2 separate management OS vNICs for the RDMA traffic later. For now, we want the HOST-MGNT vNIC to have connectivity to the LAN and for that we need to tag it with VLAN 10.

image

The vNIC actually “inherited” the IP configuration of one of our physical NICs and we need to change that to either DHCP or a correct LAN IP address and settings.

clip_image014

You can use the code below to set the HOST-MGNT vNIC to DHCP

To finalize the HOST-MGNT vNIC configuration we enable priority tagging on it. If we don’t we won’t see any traffic other than SMB Direct tagged at all!

clip_image016

Before we go any further we’ll remove the VLAN tag from the rNICS as we don’t want it interfering with egress traffic being tagged by them or ingress traffic being filtered because it doesn’t match the VLAN ID on the rNICs.

From here on we’ll focus on the RDMA capable vNICs well create and use for SMB traffic.

We create 2 vNIC on the management OS for SMB Direct traffic.

clip_image018

Now these vNIC need an IP address, this can be in the same subnet because we have Windows Server 2016 SMB multichannel.

We than also need to put the vNICs in the correct VLAN. Remember that DCB / PFC priority tagging needs tagged VLAN so carry that priority. Right now, we can see that these are untagged.

clip_image020

So we tag them with VLAN ID 110

clip_image022

We enable jumbo frames on the vNICs. Remember that the physical NICs in the SET have jumbo frames enabled as well.

clip_image024

Normally all traffic that is originated from vNICs gets any QOS values set to zero. There is one exception to this and that’s SMB Direct traffic. SMB Direct traffic gets tagged with its QoS priority and that is not reset to 0 as it bypasses the vSwitch completely. But if we set other priorities on other types of traffic for DCB PFC and or ETS that passes over these vNICs we must enable priority tagging on these NICs as well or they’ll be stripped away.

clip_image026

The association of the vNIC to pNICs is random. This also changes during creation and destruction (disabling NICs, restarting the OS). We can map a vNCI to a particular pNIC. This prevents suboptimal use of the available pNICs and provides for a well know predictable path of the traffic. We do this with the below PowerShell commands.

Finally, last but not least, we should enable RDMA on our two vNICs or SMB Direct will not kick in at all.

Right now, we have it all configured correctly on one node of our 2-node cluster. The SMB network look like this now:

clip_image028

The cluster now looks like below.

clip_image030

We can live migrate VMs over SMB Direct in this mixed scenario where one node has pNICs RDMA NICs, 1 node has SET with vNICs for RMDA.

clip_image032

When looking at this in report mode we clearly see Node-A send SMB Direct traffic (tagged with priority 4, green) over its RDMA enabled SET vNICs to Node-B which still has a complete physical rNIC set up (blue).

clip_image034

As you can see in the screen shots above we now have RDMA / SMB Direct working with SET / RDMA vNICs on one node (Node-A) and pure physical RDMA NICs on the other (Node-B). This gives us bandwidth aggregation and redundancy. To complete the exercise, we configure SET on the other node as well. But it’s clear SET and RDAM will also work in a mixed environment.

We’ll discuss some details about certain aspects of the vNIC configuration in future articles. Things like the why and how of Set-VMNetworkAdapterTeamMapping and the use of -IeeePriorityTag. But for now, this is it. Go try it out! It’s the basis for anything you’ll do with SDNv2 in W2K16 and beyond.

An error occurred connecting to the cluster

An error occurred connecting to the cluster

This morning I woke up to a bunch of failed backup notifications of our trusted Veeam Backup & Replication v9.5 update 2 solution. After 3:30 AM the backups of one particular cluster started failing.

I went to have a look but I could not connect to the 3 node cluster.

image

I logged on to the cluster nodes themselves and did a quick verification of network connectivity, DNS etc. That was all fine. WMI services were running on all nodes but on node 2 and 3 they were not functional.

Cleary we have a WMI issue. And sure enough, no Hyper-V manager available on those 2 nodes but we did have it on the one properly functioning node.

We tested some PowerShell WMI queries (get-wmiobject mscluster_resourcegroup -computer NodeToTest -namespace “ROOT\MSCluster“) to the cluster and this confirmed that WMI was toast on those two nodes.

Fixing the issue

The good news was that all the VMs were all up and running  – a few that had RHS.exe issues – but were still alive pure Hyper-V wise. That explains why they didn’t have any support calls come in. So if we can fix this without causing down time this would be great. To try this we decided to restart the WMI service.

On problematic node 2 this worked. It restarted depending services as well such as Hyper-V Virtual Machine Management, User Access Logging Service, IP Helper and the Veeam Installer Service and the Veeam Hyper-V Integration Service. We got connectivity back via Hyper-V manager but the Failover Cluster manager GUI remained an issue but now only complained about node 3.

image

We wanted to avoid rebooting node 3 to avoid downtime to the VMs. So what we did there is stop the depending services that we could stop. It was vmms.exe that was stuck in shutdown we just killed the process manually with stop-Process -name “vmms” -force
That allowed the WMI service to be restarted. We then started the depending services manually and we got back the connectivity to Hyper-V Manager on node 3.

The Failover Cluster manager GUI could also connect again to the cluster. We checked the cluster for other issues. When done and found OK we live migrated the VMs node per node and did a reboot of every node one by one. This to have cleanly started nodes and to see if any trouble some event were logged during the startup. Normal operations were resumed.

Do note that there is a blog on TechNet about a similar issue but with a different error message. That was caused by missing cluswmi.mof file due to an ill advised use of run mofcomp.exe *.mof. This was not the case here. A reboot of the misbehaving nodes would have done the trick as well (as blogged here Trouble Connecting to Cluster Nodes? Check WMI! ) but we avoided as much downtime as possible here by going the route we did.