Simplified SMB Multichannel and Multi-NIC Cluster Networks

One of the seemingly small feature enhancements in Windows Server 2016 failover clustering is simplified SMB multichannel and multi-NIC cluster networks. In Windows Server 2016, failover clustering recognizes and uses multiple NICs on the same subnet for cluster networking (cluster & client access).

image

Why was this introduced?

The growth in the capabilities of the hardware (compute, memory, storage & networking) meant that failover clustering had to leverage these capabilities more easily and for more use cases than before. SMB is now used not “only” for CSV and live migration but also for Storage Spaces Direct and Storage Replica.

  • It gives us better utilization of the network capabilities and throughput with Storage Spaces Direct, CSV, SQL, Storage Replica etc.
  • Failover clustering now works with multichannel like any other workload, without the extra requirement of needing multiple subnets. This is more important than it seemed to me at first. In many environments getting another VLAN and/or extra subnet is a hurdle. Well, that hurdle is gone.
  • For IPv6 link-local subnets it just works; these are auto-configured as cluster-only networks.
  • The cluster validation wizard won’t nag about it anymore and knows it’s a valid failover cluster configuration.
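
To check how the cluster has grouped the NICs, the FailoverClusters and SMB cmdlets give a quick view. This is a minimal sketch, assuming an elevated PowerShell session on a Windows Server 2016 cluster node; what it prints depends entirely on your environment.

```powershell
# List the cluster networks with their subnet, role and state.
Get-ClusterNetwork | Format-Table Name, Address, AddressMask, Role, State -AutoSize

# Show which NIC of which node ended up in which cluster network.
# Several NICs of one node in the same network means they share a subnet.
Get-ClusterNetworkInterface | Sort-Object Network, Node |
    Format-Table Network, Node, Name, Address -AutoSize

# Show the SMB multichannel connections this node has established as a client.
Get-SmbMultichannelConnection
```

If Get-SmbMultichannelConnection returns nothing, there may simply be no active SMB traffic from that node at that moment.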

See it in action!

You can find a quick demo of simplified SMB multichannel and multi-NIC cluster networks on my Vimeo channel here

image

In this video I demo 2 features. One is new: virtual machine compute resiliency. The other is an improved feature: simplified SMB multichannel and multi-NIC cluster networks. The multichannel demo is the first part of the video. Yes, it’s with RDMA RoCEv2, you know I just have to do SMB Direct when I can!

You can read more about simplified SMB multichannel and multi-NIC cluster networks on TechNet here. Happy reading!

Get-VMHostSupportedVersion

I wrote about this little gem of a PowerShell cmdlet, Get-VMHostSupportedVersion, before here (there’s a bit more info on the impact of a VM configuration version in that blog post). Now at TPv5 I took a new peek and what do we find?

image

We now have virtual machine configuration version 7.1 at TPv5. We also got 2 new version IDs: 254.0 for Prerelease and 255.0 for Experimental. Clearly Microsoft has plans here. I’ll update this blog with a link to the documentation when I find it.
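
For reference, a minimal sketch of how to query this yourself, assuming an elevated PowerShell session on a Windows Server 2016 Hyper-V host. The versions listed depend on the build you run.

```powershell
# List every VM configuration version this Hyper-V host supports.
Get-VMHostSupportedVersion

# Show only the version a newly created VM gets by default on this host.
Get-VMHostSupportedVersion -Default
```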

All bets are open as to where we’ll land at RTM for the virtual machine configuration version. I’m guessing that we’re feature complete at Technical Preview 5 but version numbers can get funky. Will all TP versions be supported at RTM? Normally upgrades from beta / preview versions are not supported, but on the other hand some people in early adopter programs are working with them already, so I’m guessing they will be. We’ll see, but that’s where I put my money.

NUMA Spanning and Virtual NUMA in Hyper-V

NUMA Spanning and Virtual NUMA in Hyper-V, or actually anything NUMA related in Hyper-V virtualization, is a subject that too many people don’t know enough about. Those who do know it could often be helped by some more in-depth information and examples.

image

Some run everything on the defaults and never learn more until they read about it or find they need to dive in deeper for some needs or use cases. Many struggle with confusion or questions regarding Virtual NUMA, NUMA topology, NUMA Spanning and their relation to static and dynamic memory.

image

As I don’t have the time to answer all the questions I get on this subject, I have written an article about it. I’ve published it as a community effort on the StarWind Software blog and you can find it here: A closer look at NUMA Spanning and virtual NUMA settings

I think it complements the information on this subject on TechNet well and it also touches on Windows Server 2016 aspects of this story. I hope you enjoy it!

Issues to watch out for when configuring Discrete Device Assignment

When you’re discovering how to get discrete device assignment to work you have some potential bumps that might trip you up. So what are the issues to watch out for when configuring Discrete Device Assignment? We’ll share some here but note this is from testing with Windows Server 2016 Technical Preview 4. Changes can and probably will happen before RTM.

Make sure your VMs are running on the latest configuration version. That means 7.x at the time of writing. Many of the new features require this, as discussed in Windows Server 2016 TPv4 Hyper-V brings virtual machine configuration version 7.

Check the configuration version of the VM

When you try to add a GPU to a VM via Discrete Device Assignment you’ll get an error when the VM has version 5.0 instead of 7.x. This can easily happen when you move VMs from older versions to a shiny new Windows Server 2016 environment, as in the example below:

image

Naturally all of this is logged in the Hyper-V-VMMS Admin logs as well:

‘W2K12R2’ cannot add device ‘Virtual Pci Express Port’ until the virtual machine is upgraded. (Virtual machine ID 592A920F-B0E9-480C-9052-A397B377BCC9)
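
A minimal sketch of checking and upgrading the configuration version from PowerShell, assuming the VM name W2K12R2 from the error above and an elevated session on the Hyper-V host. Keep in mind that the upgrade is one-way, so the VM can no longer run on an older Hyper-V host afterwards.

```powershell
# Assumption: the VM from the error message above.
$vmName = 'W2K12R2'

# Check the configuration version of every VM on this host.
Get-VM | Sort-Object Name | Format-Table Name, Version, State -AutoSize

# The VM has to be off before its configuration version can be upgraded.
Stop-VM -Name $vmName

# Upgrade to the current configuration version (asks for confirmation).
Update-VMVersion -Name $vmName
```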

Mind your Dynamic Memory settings

Another thing you need to watch out for is that when you use dynamic memory the startup memory and the minimum memory values have to match. So the minimum memory cannot be lower than the startup memory. Do note that this is TPv4 and things might change.

image

Cannot add the device to ‘W2K12R2’ as that virtual machine has Dynamic Memory configured with different startup memory and minimum memory values. When adding a device, the virtual machine must be configured with equal startup memory and minimum memory values.(Virtual machine ID 592A920F-B0E9-480C-9052-A397B377BCC9)

If you try to change this on a VM with discrete device assignment enabled you’ll also find that this isn’t allowed.

image

Cannot perform the operation for ‘W2K12R2’ as the specified memory settings are not compatible for device assignment. The startup memory size and minimum memory size must be equal when Dynamic Memory is enabled and devices are also assigned.(Virtual machine ID 592A920F-B0E9-480C-9052-A397B377BCC9)
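
A minimal sketch of lining up the memory settings before assigning a device, assuming the VM name W2K12R2 from the errors above, an elevated session on the Hyper-V host and a VM that is shut down. Raising the minimum to the startup value is just one way to make them equal.

```powershell
# Assumption: the VM from the error messages above, currently shut down.
$vmName = 'W2K12R2'

$memory = Get-VMMemory -VMName $vmName

# With Dynamic Memory enabled, minimum and startup memory must be equal for DDA.
if ($memory.DynamicMemoryEnabled -and $memory.Minimum -ne $memory.Startup) {
    Set-VMMemory -VMName $vmName -MinimumBytes $memory.Startup
}

# Verify the result (values are in bytes).
Get-VMMemory -VMName $vmName |
    Format-List DynamicMemoryEnabled, Startup, Minimum, Maximum
```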

Set the automatic stop action to “Turn off the virtual machine”

I already mentioned this in the blog but you need to make sure that the automatic stop action for the virtual machine is set to “Turn off the virtual machine” and not to the default of “Save the virtual machine state”. You cannot use DDA unless you do so.

image

Cannot add the device to ‘W2K12R2’ as that virtual machine is configured to go to saved state on host shutdown. (Virtual machine ID 592A920F-B0E9-480C-9052-A397B377BCC9)

Again, changing this on a VM that has DDA assigned will not work.

image
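
A minimal sketch of setting the automatic stop action from PowerShell, assuming the VM name W2K12R2 from the error above and that no device has been assigned to the VM yet.

```powershell
# Assumption: the VM from the error message above, without a DDA device so far.
$vmName = 'W2K12R2'

# Switch from the default (Save) to turning the VM off on host shutdown.
Set-VM -Name $vmName -AutomaticStopAction TurnOff

# Verify the setting.
Get-VM -Name $vmName | Format-List Name, AutomaticStopAction
```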

Discrete means one on one

Remember that you cannot assign a device to more than one VM. The thing here is that it won’t block you when both VMs are shut down, at least not in TPv4. But the device is dedicated, so it won’t work: when you do this and try to start one of those VMs while the device is in use, the start fails.

image

An error occurred while attempting to start the selected virtual machine(s).

‘RFX-WIN10ENT’ failed to start.

Virtual Pci Express Port (Instance ID 9B15DD32-5F94-46EF-8524-501007830322): Failed to Power on with Error ‘The device is in use by an active process and cannot be disconnected.’.

When you try to assign a GPU that is already assigned to a running VM, it will block you!

The error message clearly states that the device is already in use by another VM.

Add-VMAssignableDevice -LocationPath $LocationPathOfDismountedDA -VMName RFX-WIN10ENT
Add-VMAssignableDevice : ‘RFX-WIN10ENT’ failed to add resources to ‘RFX-WIN10ENT’.
Virtual Pci Express Port (Instance ID EA7CB907-C38A-4396-97E0-A9A8F3C2D1B0): Failed to Power on with Error ‘The device
is in use by an active process and cannot be disconnected.’.
‘RFX-WIN10ENT’ failed to add resources. (Virtual machine ID 425A366E-E380-4D8C-AADE-DE16EAC0A104)
‘RFX-WIN10ENT’ Virtual Pci Express Port (Instance ID EA7CB907-C38A-4396-97E0-A9A8F3C2D1B0): Failed to Power on with
Error ‘The device is in use by an active process and cannot be disconnected.’ (0x80070964). (Virtual machine ID
425A366E-E380-4D8C-AADE-DE16EAC0A104)
Could not allocate the PCI Express device with the Plug and Play Device Instance path
‘PCIP\VEN_10DE&DEV_0FF2&SUBSYS_101210DE&REV_A1\6&17F903&0&00400010’ because it is already in use by another VM.
At line:1 char:1
+ Add-VMAssignableDevice -LocationPath $LocationPathOfDismountedDA -VMN …
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo          : NotSpecified: (:) [Add-VMAssignableDevice], VirtualizationException
+ FullyQualifiedErrorId : OperationFailed,Microsoft.HyperV.PowerShell.Commands.AddVmAssignableDevice
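
To avoid running into this, it can help to first list which devices are already assigned to which VMs. This is a minimal sketch, assuming an elevated session on the Hyper-V host; it only reports the assignments Hyper-V knows about on that host.

```powershell
# List every device assigned through DDA, per VM, on this host.
Get-VM | ForEach-Object {
    Get-VMAssignableDevice -VMName $_.Name
} | Format-Table VMName, LocationPath, InstanceID -AutoSize
```

Comparing the LocationPath values with the one you are about to pass to Add-VMAssignableDevice shows whether the device is already taken.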

Shut down your VM to make changes to DDA

Last but not least, to use DDA (assign, configure) with a VM you have to shut it down. Removing devices whilst the VM is running isn’t blocked. But the results can be quite “harsh”. This is me removing a DDA GPU from a Windows Server 2012 R2 VM whilst it’s running.

image

The fun part is that you can add it again while the VM is running and it will work, but it’s not a healthy thing to do.
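
The healthier order of operations can be sketched as follows, assuming the VM name W2K12R2 and an elevated session on the Hyper-V host. This is a sketch of the sequence as tested on a technical preview, not a complete procedure. After mounting, the device may still need to be enabled again in the host’s Device Manager.

```powershell
# Assumption: the VM that currently holds the device(s).
$vmName = 'W2K12R2'

# Shut the VM down first instead of pulling the device while it runs.
Stop-VM -Name $vmName

# Remove every assigned device from the VM and hand it back to the host.
foreach ($device in Get-VMAssignableDevice -VMName $vmName) {
    Remove-VMAssignableDevice -LocationPath $device.LocationPath -VMName $vmName
    Mount-VMHostAssignableDevice -LocationPath $device.LocationPath
}
```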

As stated above, these notes are from testing with Windows Server 2016 Technical Preview 4 so things can still change. Happy testing!