Shared VHDX in Windows Server 2016: VHDS and the backing storage file

Introduction to the VHD Set

I talked about the VHD Set, with its VHDS file and AVHDX backing storage file, in Windows Server 2016 in a previous blog post, A first look at shared virtual disks in Windows Server 2016. One of the questions I saw pass by a couple of times is whether this is still a “normal VHDX” or a new type of virtual disk. Well, the VHDS file is nothing but a small file containing some metadata to coordinate disk actions among the guest cluster nodes accessing the shared virtual disk. The avhdx file associated with that VHDS file is an automatically managed dynamically expanding or fixed virtual disk. How do I know this? Well, I tested it.

There is nothing preventing you from copying or moving the avhdx file of a VHD Set that is not in use. You can rename the extension from avhdx to vhdx. You can attach it to another VM or mount it on the host and get to the data. In essence this is a vhdx file. The “a” in avhdx stands for automatic. It means that the vhdx is under the control of the hypervisor and you’re not supposed to manipulate it yourself; you let the hypervisor handle it for you. But as you can see for yourself if you try the above, you can get to the data if that’s the only option left. Normally you should just leave it alone. It does, however, serve as proof that the VHD Set uses a standard virtual disk (VHDX) file.

I’ll demonstrate this with an example below.

Fun with a backing storage file in a VHD Set

Shut down all the nodes of the guest cluster so that the VHD Set files are not in use. We then rename the virtual disk’s extension from avhdx to vhdx.

You can then mount it on the host.

And after mounting the VHDX we can see the content of the virtual disk we put there when it was a CSV in that guest cluster.

We add some files while this vhdx is mounted on the host.

Rename the virtual disk back to an avhdx extension.

We boot the nodes of the guest cluster and have a look at the data on the CSV. Bingo!

I’m NOT advocating you do this as a standard operating procedure. This is a demo to show you that the backing storage files are normal VHDX files that are managed by the hypervisor and as such get the avhdx extension (automatic vhdx) to indicate that you should not manipulate them under normal circumstances. But in a pinch, it is a normal virtual disk, so you can get to it with all the options and tools at your disposal if needed.
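For those who prefer the command line to the GUI, the demo steps above could be sketched in PowerShell roughly as follows. This is a minimal sketch and not a supported procedure: the two function names and the example path are hypothetical, and it assumes you run it on the Hyper-V host with the Hyper-V module available while every guest cluster node is shut down.

```powershell
# Sketch only. Function names and the example path are hypothetical.
# Run on the Hyper-V host while every guest cluster node is shut down.

function Mount-VhdSetBackingFile {
    param([Parameter(Mandatory)][string]$AvhdxPath)

    # Rename <name>.avhdx to <name>.vhdx in the same folder.
    $vhdxPath = [System.IO.Path]::ChangeExtension($AvhdxPath, '.vhdx')
    Rename-Item -Path $AvhdxPath -NewName (Split-Path -Path $vhdxPath -Leaf)

    # Mount the disk on the host and show the volumes it holds.
    Mount-VHD -Path $vhdxPath -Passthru | Get-Disk | Get-Partition | Get-Volume | Out-Host

    # Return the new path so the caller can undo the change later.
    $vhdxPath
}

function Dismount-VhdSetBackingFile {
    param([Parameter(Mandatory)][string]$VhdxPath)

    # Dismount first, then restore the avhdx extension.
    Dismount-VHD -Path $VhdxPath
    $avhdxPath = [System.IO.Path]::ChangeExtension($VhdxPath, '.avhdx')
    Rename-Item -Path $VhdxPath -NewName (Split-Path -Path $avhdxPath -Leaf)
}

# Example usage (hypothetical path):
# $mounted = Mount-VhdSetBackingFile -AvhdxPath 'C:\ClusterStorage\Volume1\GuestCluster\SharedDisk.avhdx'
# ... copy files to or from the mounted volume ...
# Dismount-VhdSetBackingFile -VhdxPath $mounted
```

Keeping the rename and the mount together in one function, and the dismount and rename back in another, makes it harder to forget to restore the avhdx extension before booting the guest cluster nodes again.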

WEBINAR: Troubleshooting Microsoft Hyper-V – 4 Tales from the Trenches

I’ve teamed up with Altaro as a guest in one of their Hyper-V webinars to discuss troubleshooting Microsoft Hyper-V. I’ll be joining my fellow Microsoft Cloud and Datacenter Management MVP Andy Syrewicze for this in a virtual cross-Atlantic team.

We’ll give you some pointers on good practices and where to start when things go south. We’ll also add some real-world examples to the mix to spice things up. As you can imagine, these issues are often a lot more fun after the fact, once they have been solved. If you work in IT long enough, you know that one day (or night) trouble will come knocking and that the stress that comes with it can be gut-wrenching.

The good news is that it isn’t the end of the world. In a well-designed and well-managed environment you can minimize downtime, and you do not have to suffer unrecoverable losses as long as you’re well prepared. We hope this webinar will help the attendees prevent those problems in their environment. If not, it will provide you with some insight into how to prepare for and handle them.

The webinar is on February 25th, 2016 at 4pm CET / 10am EST!

The time has been chosen to make it feasible for as many people as possible across all time zones to attend. So, go on, register. It’s free, and both Andy and I are looking forward to sharing our troubleshooting tales from the trenches.

Confusing Mellanox Windows PerfMon Counters

Introduction

So you start out doing SMB Direct. Maybe you’re doing RoCE; if so, there’s a good chance you’ll be using the excellent Mellanox cards. You studied hard, read a lot and put some real effort into setting it up. The SMB Direct / DCB configuration is how you think it should be and things are working as expected.

Curious as you are, you want to find out if you can see Priority Flow Control at work. Well, the easiest way to do so is by using the Windows Performance Monitor counters that Mellanox provides.

Confusing Mellanox Windows PerfMon Counters

So you take your first look at the Mellanox Adaptor QoS Perfmon counters for the ConnectX series for SMB Direct (RDMA) traffic. When you want to see what’s happening with regard to the pause frames that have been sent and received, and what pause duration was requested from the receiving hop (or received from the sending hop), you can get confused. The naming is a bit counterintuitive.

The Rcv Pause duration is not the duration requested by the pause frames the host received, but by the pause frames the host sent. Likewise, the Sent Pause duration is not the duration requested by the pause frames the host sent, but by the pause frames the host received.

So you might end up wondering why your host sends pause frames, only to see the Rcv Pause duration go up. Now you know why.
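If you want to watch these counters from PowerShell instead of the Perfmon GUI, something along these lines might help. It is only a sketch: the `*Mellanox*` wildcard is an assumption about how the counter sets are named on your host, the helper function and the lookup table are hypothetical names of my own, and the exact counter paths depend on the installed WinOF version.

```powershell
# Sketch only. The '*Mellanox*' wildcard is an assumption about how the counter
# sets are named on your host; check what Get-Counter -ListSet returns first.

# How to read the two pause duration counters, per the explanation above.
$PauseDurationMeaning = @{
    'Rcv Pause duration'  = 'Duration requested by the pause frames this host sent'
    'Sent Pause duration' = 'Duration requested by the pause frames this host received'
}

function Get-MellanoxPauseCounterPath {
    # List every counter path in the Mellanox sets whose name mentions "Pause".
    Get-Counter -ListSet '*Mellanox*' |
        Select-Object -ExpandProperty Counter |
        Where-Object { $_ -like '*Pause*' }
}

# Example usage: sample the pause counters every 2 seconds, 5 times.
# Get-Counter -Counter (Get-MellanoxPauseCounterPath) -SampleInterval 2 -MaxSamples 5
```

Discovering the counter paths with `-ListSet` rather than hard-coding them keeps the snippet usable if the names ever do get changed in a later driver release.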

Now there were plans to fix this in WinOF 4.95. The original release note made mention of this and this made me quite happy as most people are confused enough when it comes to RDMA/RoCE/DCB configurations as it is.

Screenshot: the change as described in the original Mellanox WinOF VPI Release Notes, revision 4.95.

Unfortunately, this did not happen. It was removed in a newer version of these release notes. My guess is that it could have been a breaking change of some sort if a lot of tooling or automation expects these counter names.

I still remember how puzzled I was looking at counters that didn’t make sense to me, and the tedious labor of empirical testing it took to figure out that the wording was a bit “less than optimal”.

But look, once you know this you just need to keep it in mind. For now, we’ll have to live with some confusing Mellanox Windows PerfMon counter names. At least I hope I have saved you the confusion and time I went through when first starting with these Mellanox counters. Other than that I can only say that you should not be discouraged as they have been and are a great tool in checking RoCE DCB/PFC configs.

Maximum bandwidth in Hyper-V storage QoS policies

Introduction

In a previous blog post, Hyper-V Storage QoS in Windows Server 2016 Works on SOFS and on LUNs/CSV, I discussed Storage QoS policies in Windows Server 2016. I also demonstrated this in a lab setup at VeeamON 2015 in one of my talks at the Microsoft presentation area. It’s one of those features where a home lab will do the job. There is no need for special storage hardware. It’s all in-box functionality. Cool!

Maximum bandwidth in Hyper-V storage QoS policies

Now that was in the Technical Preview 2 and 3 era, where it all revolved around minimum and maximum IOPS. In Windows Server 2016 Technical Preview 4 we got some new features with regard to storage QoS policies. One of those is that we can now also set the maximum bandwidth on a policy using the parameter MaximumIOBandwidth. This parameter, which is set in bytes per second, determines the maximum bandwidth that any flow assigned to the policy is allowed to consume.

We use the policy ID to assign the policy to the 2 shared virtual disks of our cluster nodes. You’ll need to do this for all of the guest cluster nodes.

You can copy the PoSh demo script below:


# Create a storage QoS policy.
# MaximumIOBandwidth is in bytes per second; 100MB is a PowerShell size literal.
# MultiInstance is the Technical Preview name for this policy type; later builds call it Dedicated.
$DemoVMPolicy = New-StorageQosPolicy -Name DemoVMPolicy -PolicyType MultiInstance `
    -MinimumIops 250 -MaximumIops 500 -MaximumIOBandwidth 100MB

# Look at our storage policy
Get-StorageQosPolicy -Name DemoVMPolicy

# Grab our policy ID (kept in its own variable so the policy object is not overwritten)
$DemoVMPolicyId = (Get-StorageQosPolicy -Name DemoVMPolicy).PolicyId
$DemoVMPolicyId


# Look at our VM's policy settings before and after assigning a storage policy.
# We assign the storage policy to the 2 shared virtual disks
# that are located at locations 1 and 2 on SCSI controller 0.

Get-VM -Name GuestClusterNode1 | Get-VMHardDiskDrive |
    Format-Table Path, MinimumIOPS, MaximumIOPS, MaximumIOBandwidth, QoSPolicyID -AutoSize

# Only touch the disks on SCSI controller 0 at location 1 or higher.
Get-VM -Name GuestClusterNode1 |
    Get-VMHardDiskDrive -ControllerType SCSI -ControllerNumber 0 |
    Where-Object { $_.ControllerLocation -ge 1 } |
    Set-VMHardDiskDrive -QoSPolicyID $DemoVMPolicyId

Get-VM -Name GuestClusterNode1 | Get-VMHardDiskDrive |
    Format-Table Path, MinimumIOPS, MaximumIOPS, MaximumIOBandwidth, QoSPolicyID -AutoSize

You can use MaximumIOBandwidth by itself or you can combine it with the maximum IOPS setting. When both of these parameters are set in a storage QoS policy, they are both active. The one that is reached first by a flow assigned to this policy will be the limiting factor in the I/O of that flow.

As an example, let’s say you specify 500 IOPS and 100 MB/s of bandwidth as maxima. If your workload hits 500 IOPS but only consumes 58 MB/s, it’s the IOPS limit that is restricting the flow.
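To make the "whichever limit is reached first" rule concrete, here is a small sketch that compares what a flow is doing with the two maxima of its policy. Get-LimitingFactor is a hypothetical helper of my own, not a Hyper-V or Storage QoS cmdlet, and it treats the bandwidth figures from the example as bytes per second, which is the unit the MaximumIOBandwidth parameter expects.

```powershell
# Sketch only: Get-LimitingFactor is a hypothetical helper, not a Hyper-V cmdlet.
function Get-LimitingFactor {
    param(
        [double]$ObservedIops,         # IOPS the flow is doing right now
        [double]$ObservedBandwidth,    # bytes per second the flow consumes right now
        [double]$MaximumIops,          # policy maximum in IOPS
        [double]$MaximumIOBandwidth    # policy maximum in bytes per second
    )

    # Collect every policy maximum the flow has reached.
    $reached = @()
    if ($ObservedIops -ge $MaximumIops) { $reached += 'MaximumIops' }
    if ($ObservedBandwidth -ge $MaximumIOBandwidth) { $reached += 'MaximumIOBandwidth' }

    if ($reached.Count -eq 0) { 'None' } else { $reached -join ', ' }
}

# The example from the text: 500 IOPS reached, bandwidth still below its maximum.
Get-LimitingFactor -ObservedIops 500 -ObservedBandwidth 58MB `
    -MaximumIops 500 -MaximumIOBandwidth 100MB      # returns MaximumIops

# The opposite case: fewer, larger I/Os that hit the bandwidth maximum first.
Get-LimitingFactor -ObservedIops 300 -ObservedBandwidth 100MB `
    -MaximumIops 500 -MaximumIOBandwidth 100MB      # returns MaximumIOBandwidth
```

The observed values would have to come from your own monitoring; the helper only shows the comparison, it does not measure anything.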