KB3063283 Updates the Hyper-V Integration Components for Windows Server 2012 R2 to 6.3.9600.17831

While investigating a backup issue with some VMs I noticed an entry in the Veeam Backup & Replication logs stating that the Hyper-V Integration Components were out of date.

image

This was actually the case for all the guests on that particular cluster. A quick look at the IC version on the host showed it to be 6.3.9600.17831.

image

Comparing that to the version in the guests quickly made clear that those were at 6.3.9600.16384, so lower.

image
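If you want to check this quickly for all the guests on a host, Get-VM exposes the version a guest reports in its IntegrationServicesVersion property. A minimal sketch (the host name is a placeholder) looks like this:

# List the integration components version each guest reports to the host (host name is a placeholder)
Get-VM -ComputerName "MyHyperVHost" | Sort-Object IntegrationServicesVersion | Format-Table Name, State, IntegrationServicesVersion -AutoSize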

A web search for the Hyper-V Integration Components version led us to KB3063283 “Update to improve the backup of Hyper-V Integration components in Hyper-V Server 2012 R2”, which had been installed on their Hyper-V hosts. They keep a tight ship, but due to regulations they normally run 3 to 4 months behind on patches and updates, so in their case they had only recently installed that update. KB3063283 updates the Hyper-V Integration Components for Windows Server 2012 R2 to 6.3.9600.17831.

So a little word of warning: while you are keeping your Hyper-V environment up to date (you should), don’t forget to update the integration components of your virtual machines. A good backup product like Veeam Backup & Replication will log this during backups. It might not make the backups fail per se, but they have been updated for a good reason. This update was even specifically for backup related issues, so it’s wise to upgrade the virtual machines to this version a.s.a.p.
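On Windows Server 2012 R2 upgrading the integration components still means mounting the Integration Services Setup Disk and running the setup inside the guest. A small, hedged sketch of doing the mounting part from the host (host and VM names are placeholders, and it assumes the VM has a DVD drive) could be:

# Mount the Integration Services Setup Disk from the host (host and VM names are placeholders)
Set-VMDvdDrive -ComputerName "MyHyperVHost" -VMName "VMWithOldIC" -Path "C:\Windows\System32\vmguest.iso"
# Then run the integration services setup from that DVD inside the guest and reboot when asked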

Musings On Switch Embedded Teaming, SMB Direct and QoS in Windows Server 2016 Hyper-V

When you have been reading up on what’s new in Windows Server 2016 Hyper-V networking, you have probably read about Switch Embedded Teaming (SET). Basically this takes the concept of teaming and has it done by the vSwitch, which means you don’t have to team at the host level. The big benefit this opens up is that RDMA can be leveraged on vNICs. With host based teaming the RDMA capabilities of your NICs are no longer exposed, i.e. you can’t leverage RDMA. Now this has become possible and that’s pretty big.

clip_image001
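To give you an idea of what that looks like, here’s a minimal sketch of creating a SET vSwitch and exposing RDMA capable vNICs to the host. The NIC and switch names are mine and the syntax is as shown in the preview documentation, so treat it as an illustration rather than gospel.

# Create a vSwitch with Switch Embedded Teaming over two physical RDMA NICs (names are placeholders)
New-VMSwitch -Name "SETSwitch" -NetAdapterName "NIC1", "NIC2" -EnableEmbeddedTeaming $true
# Add host vNICs for SMB traffic and enable RDMA on them
Add-VMNetworkAdapter -ManagementOS -Name "SMB1" -SwitchName "SETSwitch"
Add-VMNetworkAdapter -ManagementOS -Name "SMB2" -SwitchName "SETSwitch"
Enable-NetAdapterRdma -Name "vEthernet (SMB1)", "vEthernet (SMB2)"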

With the rise of 10, 25, 40, 50 and 100 Gbps NICs and switches the lure to go fully converged becomes ever stronger. Given the fact that we no longer lose RDMA capabilities on the vNICs exposed to the host, that call only sounds louder to many. But wait, there’s even more to lure us to a fully converged solution: we now no longer lose RSS on those vNICs either! All good news.

I have written an entire whitepaper on convergence and its benefits, drawbacks, risks & rewards. I will not repeat all that here. One point I do need to make is that lossless traffic and QoS are paramount to the success of fully converged networking. After all, we don’t want lossy storage traffic and we need to assure adequate bandwidth for all our types of traffic. For now, in Technical Preview 3, we have support for Software Defined Networking (SDN) QoS.

What does that mean in regard to what we already use today? There is no support for native QoS and vSwitch QoS in Windows Server 2016 TPv3. There is, however, a mention of DCB (PFC/ETS), which is hardware QoS, in the TechNet docs on Remote Direct Memory Access (RDMA) and Switch Embedded Teaming (SET). Cool!

But wait a minute. When we look at all the kinds of traffic in a converged Hyper-V environment we see CSV (storage traffic), live migration (all variations) and backups over SMB3, all potentially leveraging SMB Direct. Due to the features and capabilities in SMB3 I like that, don’t get me wrong. But it also worries me a bit when it comes to handling QoS on the hardware side of things.

In DCB, Priority Flow Control (PFC) is the lossless part and Enhanced Transmission Selection (ETS) is the minimum bandwidth QoS part. But how do we leverage ETS when all types of traffic use SMB Direct? On the host it all gets tagged with the same priority. ETS works by tagging different priorities to different workloads and assuring each a minimal bandwidth out of a total of 100%, without reserving it for a workload that doesn’t need it. Here’s a blog post on ETS with a demo video: DCB ETS Demo with SMB Direct over RoCE (RDMA).
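To make that concrete, this is roughly what the DCB configuration on a host looks like today. The priority (3) and the bandwidth percentage (50%) are just example values I tend to use, but it illustrates the point: every workload that rides on SMB Direct ends up in that one traffic class.

# Requires the Data Center Bridging feature (Install-WindowsFeature Data-Center-Bridging)
# Tag all SMB Direct traffic (port 445) with priority 3 (values are examples)
New-NetQosPolicy -Name "SMBDirect" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 3
# Make priority 3 lossless (PFC) and give it a minimum bandwidth share (ETS)
Enable-NetQosFlowControl -Priority 3
New-NetQosTrafficClass -Name "SMBDirect" -Priority 3 -BandwidthPercentage 50 -Algorithm ETS
# Enable DCB on the physical RDMA NICs (names are placeholders)
Enable-NetAdapterQos -Name "NIC1", "NIC2"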

Does this mean an SDN QoS-only approach to deal with the various types of SMB Direct traffic, or do they have some aces up their sleeves?

This isn’t a new “concern” of mine, but with SET and the sustained push for convergence it does have the potential to become an issue. We already have the SMB bandwidth limitation feature for live migration. That’s what is used to prevent live migration from starving CSV traffic when needed. See Preventing Live Migration Over SMB Starving CSV Traffic in Windows Server 2012 R2 with Set-SmbBandwidthLimit.
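For reference, that limit is set per category, along these lines (the 2GB value is just an example and the SMB Bandwidth Limit feature needs to be installed first):

# Requires the SMB Bandwidth Limit feature: Install-WindowsFeature FS-SMBBW
# Cap live migration over SMB at an absolute value (2GB per second is just an example)
Set-SmbBandwidthLimit -Category LiveMigration -BytesPerSecond 2GB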

Now in real life I have rarely, if ever, seen a hard need for this. But it’s there to make sure you have something when needed. It hasn’t caused me issues yet, but I put performance & scale first, in a “non-economies of scale” world compared to hosters. As such, convergence is a tool I use with moderation. My testing shows that when traffic competes without ETS, they all get part of the cake, but it’s not super predictable or consistent. SMB bandwidth limitation is a bit of a “bolted on” solution => you can see the perf counters push down the bandwidth in an epic struggle to contain it, but as said, it’s a struggle, not a nice flat line.

Also, Set-SmbBandwidthLimit is not a percentage but a hard maximum bandwidth limit, so when you lose a SET member the math is off and you could be in trouble fast. Perhaps it’s these categories that could or will be used, but it doesn’t seem like the most elegant solution/approach. That, with ever more traffic leveraging SMB Direct, makes me ever more curious. Some switches offer up to 4 lossless queues now, so perhaps that’s the way to go, leveraging more priorities … Interesting stuff! My preferred and easiest QoS tool, getting even bigger pipes, is an approach that convergence and the evolution of network needs keep pushing over. Anyway, I’ll be very interested to see how this is dealt with. For now I’ll conclude my musings on Switch Embedded Teaming, SMB Direct and QoS in Windows Server 2016 Hyper-V.

Remove Lingering Backup Checkpoints from a Hyper-V Virtual Machine

Sometimes things go wrong with a backup job. There are many reasons for this, but that’s not the focus of this blog post. We’re going to show you how to remove lingering backup checkpoints from a Hyper-V virtual machine that were not properly removed after a backup on Windows Server 2012 R2 Hyper-V.

You see a checkpoint with an “odd” icon that looks like this:

image

You can right-click the checkpoint but you will not find an option to apply or delete it. This is a recovery snapshot and you’re not supposed to manipulate it manually; it’s there as part of the backup process. You can read more about that in my blog posts Some Insights Into How Windows 2012 R2 Hyper-V Backups Work and What Is AutoRecovery.avhdx all about?.

When we look at the files we see the traces of a Windows Server 2012 R2 Hyper-V backup:

image

Some people turn to manually merging these checkpoints. I have discussed this process in the blog posts 3 Ways To Deal With Lingering Hyper-V Checkpoints Formerly Known as Snapshots and Manually Merging Hyper-V Checkpoints. But you don’t have to do this! As a matter of fact, you should avoid this if at all possible. The good news is that this can be done via PowerShell.

I’m not convinced that the inability to do this in the GUI should be considered a bug, as some would suggest. The fact is you’re not supposed to have to do this, so the functionality is not there. I guess this is to protect people from deleting one by mistake when they see one during backups.

Anyway, PowerShell to the rescue!

So we run:

Get-VMSnapshot -ComputerName "MyHyperVHost" -VMName "VMWithLingeringBackupCheckpoint"

This shows us the checkpoint information (note that it does not show the “XXX-AutoRecovery.avhdx” as a checkpoint).

image

And then we simply run Remove-VMSnapshot

#Just remove the lingering backup checkpoint like you would a normal one via PoSh
Get-VMSnapshot -ComputerName "MyHyperVHost" -VMName "VMWithLingeringBackupCheckpoint" | Remove-VMSnapshot

You can see the merging process in the GUI:

image

If you inspect the remaining MyVM-AutoRecovery.avhdx file you’ll see that it points to the parent VHDX. Normally your VM is already running from that VHDX again anyway. If not, you’ll need to deal with it.

image

During a normal backup that file is deleted when the merge of the redirected AVHDX is done. Here you’ll need to do that manually and be done with it.
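A hedged sketch of that final check and cleanup (host, VM and file paths are placeholders, so verify them before deleting anything) could be:

# Confirm the VM is running from the parent VHDX again (host, VM and paths are placeholders)
Get-VMHardDiskDrive -ComputerName "MyHyperVHost" -VMName "VMWithLingeringBackupCheckpoint" | Select-Object VMName, Path
# Check what the leftover AutoRecovery file points to
Get-VHD -ComputerName "MyHyperVHost" -Path "D:\VMs\MyVM\MyVM-AutoRecovery.avhdx" | Select-Object Path, ParentPath
# Only when the VM no longer references the AutoRecovery file, delete it (run this on the host or use a UNC path)
Remove-Item "D:\VMs\MyVM\MyVM-AutoRecovery.avhdx"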

Conclusion: there is no need to dive in and start merging the lingering backup checkpoints manually. Leave that for those scenarios where that’s the only option you have left.