Do you need hard processor affinity in Hyper-V?

Do you need hard processor affinity in Hyper-V? Good question but let’s set the context first. I tend to virtualize workloads that shock some people. Not because they are super huge solutions requiring Petabytes of storage, 48TB of RAM, 256 cores and a million IOPS. Far from. The shock often comes from people who still consider virtualization as something for the lightweight infra services like DHCP, DNS, WSUS, print servers, or web services and websites. Some of these people even tried to virtualize other services like SharePoint ,SQL, Exchange etc. but they did not take into a account that virtualization is not magic and you’ll need to provision adequate resources and design /manage your environment to do so successfully.  So part of them got bitten. They conclude that performance requires physical deployments … and they want to see a material CPU so to speak.

CPU_closeup

When they see virtual machines with 12 tot 16 vCPUs or > 100GB of memory they seem to thinks that even those workloads are bad candidates to virtualize, let alone even bigger ones. That’s not true by definition. As long as you make sure that you know why (cost/benefits/risks) and how to virtualize it can work. You must provision and allocate the required resources. You must also have the right expertise in both virtualization (servers, storage, networking) and the applications involved (SQL, Exchange, 3rd party  products, …)  along with good operational processes.

You can really virtualize a lot when done right. My “virtual first” approach is rule of thumb and exceptions do exist even when I’m calling the shots. However just like people quoting costs, latency, security, lock-in to question the suitability of Public Cloud versus on premises in “subjective” ways, they do so when it comes to virtualization as well. The discussion if often more about organizational issues, control, fear, politics, interests and money. Every hosting provider out there loves virtualization as it’s great for their TCO/ROI. But when it comes to Public Cloud they’re often less convinced. That “datacenter zero” concept isn’t that attractive to them so we see Hybrid and Public Cloud offerings that might not be that good of an idea in some cases but it fits their interests more. Have you noticed that there are no highly automated, optimized data centers anymore, only * clouds? There are valid use cases for hybrid and private clouds but just like with virtualization, maybe we should let go of the personal/business interests, the fear, and false assumptions when advising customers. It all depends.

In this regards I had several discussions now with people about the lack of hard processor affinity with Hyper-V. This makes it unfit for high performance workloads in their opinion. Sure, such cases do exist. These are however, not the majority. As I’ve  been having this discussion rather often in the past months I wrote an article on the subject that I’ve published in collaboration with StarWind Software Need Hard Processor affinity for Hyper-V? The idea is to reach more people and share insights with the community. Full disclosure: I happen to know Anton Kolomyeytsev (CEO, CTO and Chief Architect at StarWind)  professionally as a fellow MVP and I have great respect for his technical expertise, insights and experience. This made me agree to publish some content via their blog. Sharing opinions and ideas with as many people as possible only makes for better technologist everywhere.

High Availability has a price

We’ll go back to basics today. Some times the obvious, no matter how evident it is to us technologists, is challenged. Recently we got the remark that we were wasting CPU cycles by assigning to many vCPU to certain virtual machines on our Hyper-V cluster. So we had to explain that high availability has a price. On top of that we had to explain that things are not as wasteful as they seem in a virtual environment.

The case

Here’s one of the “offending” virtual machines. They assumed that we are wasting at least 50% of 12 CPUs.

image

This is one node in a dual node  load balancing (active-active) and highly available solution. This provides for zero down time during scheduled maintenance and very little downtime during system failures.

And here’s the second node (yes the 1st node has been down for scheduled maintenance more recently that node 2).

image

In a 2 node HA solution you need to make sure that one node can handle the entire workload. This is the absolute border line of an N+1 solution.  This means you can lose 1 node. N determines the number of nodes needed to guarantee an agreed upon service level and the number defines how many nodes failures can be tolerated before affecting the service.

In the above example there’s a need to have the CPU resources on each node to run the entire workload on one node without having an effect on the service. Therefore, when both nodes are up this might seem like a waste to the uninitiated. It is however a required to achieve the high availability goal. A constant CPU usage over 75 % will lead to a reduction in service quality in this case and even compromise the usability of the that service.

I did not even dive into the dangers of designing purely based on averages during this “explanation”. That was one step to much for the level of the discussion.

It’s also important to note that Hyper-V CPU scheduling is highly intelligent and is far less susceptible to the waste of CPU cycles via over provisioning of vCPU than some other solutions are or used to be. Knowing the capabilities and inner working of the technology used is also important in all this. More nodes generally also make “over provisioning” less of an issue. When you have 10 nodes and you lose 1, you only have lost 10% of the capabilities, not 33% like in a 3 node cluster.

Ideally you have 3 node so that even during an issue with one node you still maintain redundancy. However if you want acceptable services during a 2 node failure you’ll need to go to N+2, meaning that you need 2 nodes to provide the services and handle losing 2 nodes gracefully. In that case you’ll need 4 node and so on.  The larger the node count  the wiser it is to go to a N+2 model and ideally you’ll provide separate failure domains over which the nodes are distributed. An example of this is having a redundant geo-load balanced web farm of 32 virtual machine nodes spread over 2 locations and running on separate hardware failover clusters in each location. As you can see the higher the stakes and demands the faster the cost and potential complexity rises. You can offload some of the complexity by leveraging a public cloud like Azure, but the costs will still be there. There is no such thing as a free lunch, some are quite easy and affordable for what you get.

Conclusion

High Availability has a price. I did mention that already, right? To be able to keep your services running at a level that is both workable and acceptable to your customers and stake holders you will need to over provision to a degree. There is no magic here. When your solutions are being scrutinized by people with no real background, experience and context in high availability you might need to explain this.

Get-VMHostSupportedVersion Windows Server 2016 Hyper-V

Windows Server 2016 Technical Preview 4 has this great PowerShell command to check what virtual machine versions the Hyper-V host supports: Get-VMHostSupportedVersion

When you run this the output on Windows Server 2016 TPv4 is as below.

image

As you can see it supports 5.0 (which translates to the version on Windows Server 2012 R2). That’s logical, it’s the version your migrated VMs will have when you move them to Windows Server 2016 as the VM version is no longer automatically updated. You can read more about this here: https://msdn.microsoft.com/en-us/virtualization/hyperv_on_windows/user_guide/migrating_vms

You must also realize that updating from 5.0 to higher means moving to the newer Virtual machine configuration file format, which is way better for performance and is a binary format and not longer XML.

6.2 is the version a VM had in the Technical Preview 3 era. Which is cool as when you move your VMS between TPv3 and TPV4 they still worked and were supported. Now in reality you want to be either at 5.0 so you can still move VMs back to a W2K12R2 host or at 7.0 to have all new features and capabilities that come with W2K16 at your disposal.

7.0 is also the current default and are the version any newly created VMs or any VM you update via Update-VMVersion will be at.

Now don’t panic, if you need to stay at version 5.0 even for newly created VMs for backward compatibility in mixed Hyper-V Windows Server 2012 R2 / Windows Server 2016 environment. You can specify this with the –version parameter of New-VM

New-VM -Name “IChooseMyVersion-5.0” -Version 5.0

To know what version you can pass to that parameter you only have to look at the output of Get-VMHostSupportedVersion: 5.0, 6.2, 7.0 at the time of writing (on TPv4).

You can see the result of some fun with this in the below screenshots:

image

image

I fully expect Microsoft to keep this up for TPv5 or perhaps even with RTM to allow a smooth transition for VMs running on preview editions to RTM. We’ll see. At RTM time I can see them removing the ability to create a new VM that’s still at 6.2 or so as It’s a “intermediate” work in progress version or so on the way to the final default version number at RTM (see  Windows Server 2016 TPv4 Hyper-V brings virtual machine configuration version 7). We’ll see!

Shared VHDX In Windows 2016: VHDS and the backing storage file

Introduction into the VHD Set

I have talked about the VHD Set with a VHDS file and a AVHDX backing storage file in Windows Server 2016 in a previous blog post A first look at shared virtual disks in Windows Server 2016. One of the questions I saw pass by a couple of times is whether this is still a “normal VHDX” or a new type of virtual disk. Well the VHDS files is northing but a small file containing some metadata to coordinate disk actions amongst the guest cluster nodes accessing the shared virtual disk. The avhdx file associated with that VHDS file is an automatically managed dynamically expanding or fixed virtual disk. How do I know this? Well I tested it.

There is nothing that preventing you from copying or moving the avhdx file of a VHD Set that not in use. You can rename the extension from avhdx to vhdx. You can attach it to another VM or mount it in the host and get to the data. In essence this is a vhdx file. The “a” in avhdx stands for automatic. The meaning of this is that an vhdx is under control of the hypervisor and you’re not supposed to be manipulating it but let the hypervisor handle this for you. But as you can see for yourself if you try the above you can get to the data if that’s the only option left. Normally you should just leave it alone. It does however serve as proof that the VHD Set uses an standard virtuak disk (VHDX) file.

I’ll demonstrate this with an example below.

Fun with a backing storage file in a VHD Set

Shut down all the nodes of the guest cluster so that the VHD Set files are not in use. We then rename the virtual disk’s extension avhdx to vhdx.

image

You can then mount it on the host.

image

And after mounting the VHDX we can see the content of the virtual disk we put there when it was a CSV in that guest cluster.

image

We add some files while this vhdx is mounted on the host

image

Rename the virtual disk back to a avhdx extension.

image

We boot the nodes of the guest cluster and have a look at the data on the CSV. Bingo!

image

I’m NOT advocating you do this as a standard operation procedure. This is a demo to show you that the backing storage files are normal VHDX files that are managed by the hypervisor and as such get the avhdx extension (automatic vhdx) to indicate that you should not manipulate it under normal circumstances. But in a pinch, it a normal virtual disk so you can get to it with all options and tools at your disposal if needed.