Virtual switch QoS mode during migrations

Introduction

In Shared nothing live migration with a virtual switch change and VLAN ID configuration I published a sample script. The script works well, but there are two areas for improvement. The first one I covered in Checkpoint references a non-existent virtual switch. This post is about the second one: I also need to check the virtual switch QoS mode during migrations. A couple of the virtual machines on the source nodes had an absolute minimum and/or maximum bandwidth set. On the target nodes, all the virtual switches are created with PowerShell, which defaults to weight mode for QoS. That is the more sensible option, albeit not always the easiest or most practical one for people to use.

Virtual switch QoS mode during migrations

First a quick recap of what we are doing. The challenge was to shared nothing live migrate virtual machines to a host with different virtual switch names and VLAN IDs. We did so by adding dummy virtual switches to the target host. This made shared nothing live migration possible. On arrival of the virtual machine on the target host, we immediately connect the virtual network adapters to the final virtual switch and set the correct VLAN IDs. That works very well. You drop one or at most two pings; this is as good as it gets.

This goes wrong under the following conditions:

  • The source virtual switch has QoS mode absolute.
  • A virtual network adapter connected to the source virtual switch has MinimumBandwidthAbsolute and/or MaximumBandwidth set.
  • The target virtual switch has QoS mode weighted.

This will cause connectivity loss, as you cannot set absolute values on a virtual network adapter attached to a weighted virtual switch. Connecting the virtual network adapter to the new virtual switch just fails and you lose connectivity. Remember that the virtual machine is connected to a dummy virtual switch just to make the live migration work, so we need to swap it over immediately. The VLAN ID does get set correctly, actually. Let’s deal with this.

Steps to fix this issue

First of all, we adapt the script to check the QoS mode on the source virtual switches. If it is set to absolute, we know we need to check for any MinimumBandwidthAbsolute and MaximumBandwidth settings on the virtual network adapters connected to those virtual switches. These changes are highlighted in the demo code below.

Secondly, we adapt the script to check every virtual network adapter for its bandwidth management settings. If we find configured MinimumBandwidthAbsolute and MaximumBandwidth values we set these to 0, which disables the bandwidth settings. This makes sure that connecting the virtual network adapters to the new virtual switch with QoS mode weighted will succeed. These changes are highlighted in the demo code below.
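If you want to see up front which virtual machines will be affected, you can audit the bandwidth settings on the source host before you migrate anything. Below is a minimal sketch of such a check; the host name is the sample value used in this post, so adapt it to your environment.

#Minimal sketch: list every vNIC on the source host that still has absolute bandwidth settings
$SourceNode = 'NODE-A'
Get-VM -ComputerName $SourceNode | Get-VMNetworkAdapter |
    Where-Object { $_.BandwidthSetting -and
        ($_.BandwidthSetting.MinimumBandwidthAbsolute -gt 0 -or $_.BandwidthSetting.MaximumBandwidth -gt 0) } |
    Select-Object VMName, Name, SwitchName,
        @{N = 'MinAbsolute'; E = { $_.BandwidthSetting.MinimumBandwidthAbsolute } },
        @{N = 'Max'; E = { $_.BandwidthSetting.MaximumBandwidth } }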

Finally, the complete script

#The source Hyper-V host
$SourceNode = 'NODE-A'
#The LUN where you want to storage migrate your VMs away from
$SourceRootPath = "C:\ClusterStorage\Volume1*"

#The target Hyper-V host
$TargetNode = 'ZULU'
#The storage path where you want to storage migrate your VMs to
$TargetRootPath = "C:\ClusterStorage\Volume1"

$OldVirtualSwitch01 = 'vSwitch-VLAN500'
$OldVirtualSwitch02 = 'vSwitch-VLAN600'
$NewVirtualSwitch = 'ConvergedVirtualSwitch'
$VlanId01 = 500
$VlanId02 = 600
  
if ((Get-VMSwitch -Name $OldVirtualSwitch01 -ComputerName $SourceNode).BandwidthReservationMode -eq 'Absolute') { 
    $OldVirtualSwitch01QoSMode = 'Absolute'
}
if ((Get-VMSwitch -Name $OldVirtualSwitch02 -ComputerName $SourceNode).BandwidthReservationMode -eq 'Absolute') { 
    $OldVirtualSwitch02QoSMode = 'Absolute'
}
    
#Grab all the VMs we find that have virtual disks on the source CSV - WARNING: for W2K12 you'll need to loop through all cluster nodes.
$AllVMsOnRootPath = Get-VM -ComputerName $SourceNode | where-object { $_.HardDrives.Path -like $SourceRootPath }

#We loop through all VMs we find on our SourceRootPath
ForEach ($VM in $AllVMsOnRootPath) {
    #We generate the final VM destination path
    $TargetVMPath = $TargetRootPath + "\" + ($VM.Name).ToUpper()
    #Grab the VM name
    $VMName = $VM.Name
    $VM.VMid
    $VMName

    #If the VM is still clustered, get it removed from the cluster as shared nothing live migration will otherwise fail.
    if ($VM.isclustered -eq $True) {
        write-Host -ForegroundColor Magenta $VM.Name "is clustered and is being removed from cluster"
        Remove-ClusterGroup -VMId $VM.VMid -Force -RemoveResources
        Do { Start-Sleep -seconds 1 } While ($VM.isclustered -eq $True)
        write-Host -ForegroundColor Yellow $VM.Name "has been removed from cluster"
    }
    #If the VM has checkpoints, notify the user of the script as this will cause issues after switching to the new virtual
    #switch on the target node. Live migration will fail between cluster nodes if a checkpoint references 1 or more
    #non-existent virtual switches. These must be removed prior to or after completing the shared nothing migration.
    #The script does this after the migration automatically, not before, as I want the VM to be untouched if the shared nothing
    #migration fails.

    $checkpoints = get-vmcheckpoint -VMName $VM.Name

    if ($Null -ne $checkpoints) {
        write-host -foregroundcolor yellow "This VM has checkpoints"
        write-host -foregroundcolor yellow "This VM will be migrated to the new host"
        write-host -foregroundcolor yellow "Only after a succesful migration will ALL the checkpoints be removed"
    }
    
    #Do the actual storage migration of the VM, $DestinationVMPath creates the default subfolder structure
    #for the virtual machine config, snapshots, smartpaging & virtual hard disk files.
    Move-VM -Name $VMName -ComputerName $VM.ComputerName -IncludeStorage -DestinationStoragePath $TargetVMPath -DestinationHost $TargetNode
    
    $MovedVM = Get-VM -ComputerName $TargetNode -Name $VMName

    $vNICOnOldvSwitch01 = Get-VMNetworkAdapter -ComputerName $TargetNode -VMName $MovedVM.VMName | where-object SwitchName -eq $OldVirtualSwitch01
    if ($Null -ne $vNICOnOldvSwitch01) {
        foreach ($VMNetworkAdapter in $vNICOnOldvSwitch01) {   
            if ($OldVirtualSwitch01QoSMode -eq 'Absolute') { 
                if (0 -ne $VMNetworkAdapter.BandwidthSetting.MaximumBandwidth) {
                    write-host -foregroundcolor cyan "Network adapter $($VMNetworkAdapter.Name) of VM $VMName MaximumBandwidth will be reset to 0."
                    Set-VMNetworkAdapter -Name $VMNetworkAdapter.Name -VMName $MovedVM.Name -ComputerName $TargetNode -MaximumBandwidth 0
                }
                if (0 -ne $VMNetworkAdapter.BandwidthSetting.MinimumBandwidthAbsolute) {
                    write-host -foregroundcolor cyan "Network adapter $($VMNetworkAdapter.Name) of VM $VMName MinimumBandwidthAbsolute will be reset to 0."
                    Set-VMNetworkAdapter -Name $VMNetworkAdapter.Name -VMName $MovedVM.Name -ComputerName $TargetNode -MinimumBandwidthAbsolute 0
                }
            }
                
            write-host 'Moving to correct vSwitch'
            Connect-VMNetworkAdapter -VMNetworkAdapter $VMNetworkAdapter -SwitchName $NewVirtualSwitch
            write-host "Setting VLAN $VlanId01"
            Set-VMNetworkAdapterVlan -VMNetworkAdapter $VMNetworkAdapter -Access -VlanId $VlanId01
        }
    }

    $vNICsOnOldvSwitch02 = Get-VMNetworkAdapter -ComputerName $TargetNode -VMName $MovedVM.VMName | where-object SwitchName -eq $OldVirtualSwitch02
    if ($NULL -ne $vNICsOnOldvSwitch02) {
        foreach ($VMNetworkAdapter in $vNICsOnOldvSwitch02) {
            if ($OldVirtualSwitch02QoSMode -eq 'Absolute') { 
                if (0 -ne $VMNetworkAdapter.BandwidthSetting.MaximumBandwidth) {
                    write-host -foregroundcolor cyan "Network adapter $($VMNetworkAdapter.Name) of VM $VMName MaximumBandwidth will be reset to 0."
                    Set-VMNetworkAdapter -Name $VMNetworkAdapter.Name -VMName $MovedVM.Name -ComputerName $TargetNode -MaximumBandwidth 0
                }
                if (0 -ne $VMNetworkAdapter.BandwidthSetting.MinimumBandwidthAbsolute) {
                    write-host -foregroundcolor cyan "Network adapter $($VMNetworkAdapter.Name) of VM $VMName MinimumBandwidthAbsolute will be reset to 0."
                    Set-VMNetworkAdapter -Name $VMNetworkAdapter.Name -VMName $MovedVM.Name -ComputerName $TargetNode -MinimumBandwidthAbsolute 0
                }
            }
            write-host 'Moving to correct vSwitch'
            Connect-VMNetworkAdapter -VMNetworkAdapter $VMNetworkAdapter -SwitchName $NewVirtualSwitch
            write-host "Setting VLAN $VlanId02"
            Set-VMNetworkAdapterVlan -VMNetworkAdapter $VMNetworkAdapter -Access -VlanId $VlanId02
        }
    }

    #If the VM has checkpoints, this is when we remove them.
    $checkpoints = get-vmcheckpoint -ComputerName $TargetNode -VMName $MovedVM.VMName

    if ($Null -ne $checkpoints) {
        write-host -foregroundcolor yellow "This VM has checkpoints and they will ALL be removed"
        $CheckPoints | Remove-VMCheckpoint 
    }
}

Below is the output for a VM where we had to change the switch name, enable a VLAN ID, deal with absolute QoS settings and remove checkpoints. All this without causing downtime, and without changing the original virtual machine in case the shared nothing migration fails.

Some observations

The fact that we are using PowerShell is great. You can only set weighted bandwidth limits via PowerShell; the GUI only accepts absolute values and throws an error when you try to use it on a virtual switch configured as weighted.

This means you can embed setting the weights in your script if you so desire. If you do, read up on how best to handle this. Trying to juggle the weight settings to add up to 100 in a dynamic environment is a bit of a challenge. So use the default flow pool and keep the number of virtual network adapters with unique settings to a minimum.
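As an illustration, here is a minimal sketch of what that could look like, assuming the converged switch and target host names from the script above; the VM name is purely hypothetical. The default flow weight covers every vNIC you do not configure explicitly, and only the few adapters that really need a guarantee get their own weight.

#Minimal sketch (assumed names): give the default flow pool most of the weight on the target switch
Set-VMSwitch -Name 'ConvergedVirtualSwitch' -ComputerName 'ZULU' -DefaultFlowMinimumBandwidthWeight 50

#Only the vNICs that really need a guarantee get an explicit weight
Set-VMNetworkAdapter -VMName 'ImportantVM' -ComputerName 'ZULU' -MinimumBandwidthWeight 20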

Conclusion

To avoid downtime we removed all the configured minimum and maximum bandwidth settings on any virtual network adapter. By doing so we ensured that the swap to the new virtual switch right after the successful shared nothing live migration would succeed. If you want, you can set weights on the virtual network adapters afterward. But as the bandwidth on these new hosts is a redundant 25 Gbps, the need was no longer there. As a result, we just left them without. This can always be configured later if it turns out to be needed.

Warning: this is a demo script. It lacks error handling and logging. It can also contain mistakes. But hey, you get it for free to adapt and use. Test and adapt this in a lab. You are responsible for what you do in your environments. Running scripts downloaded from the internet without any validation makes you a certified nut case. That is not my wrongdoing.

I hope this helps some of you. Thanks for reading.

DELL released Replay Manager 8.0

On September 4th, 2019, DELL released Replay Manager 8.0 for Microsoft Servers. This brings us official Windows Server 2019 support. You can download it here: https://www.dell.com/support/home/us/en/04/product-support/product/storage-sc7020/drivers The Dell Replay Manager Version 8.0 Administrator’s Guide and release notes are here: https://www.dell.com/support/home/us/en/04/product-support/product/storage-sc7020/docs

Replay Manager 8.0.0.13 was released early September 2019

I have Replay Manager 8.0 up and running in the lab and in production. The upgrade went fast and easy, and everything kept working as expected. The good news is that the Replay Manager 8 service is compatible with the Replay Manager 7.8 Manager and vice versa. This meant there was no rush to upgrade everything asap. We could do smoke testing at a relaxed pace before we upgraded all hosts.

Replay Manager 8.0 adds official support for Windows Server 2019 and Exchange Server 2019. I have tested Windows Server 2019 with Replay Manager 7.8 as well for many months. I was taking snapshots every 30 minutes for months with very few issues, actually. But now we have official support. Replay Manager 8.0 also introduces support for SCOS 7.4.

No improvements with Hyper-V backups

Now, we don’t have SCOS 7.4 running yet. It will take another few weeks for it to reach general availability. But for now, with both Windows Server 2016 and 2019 hosts, we noticed the following disappointing behaviour with Hyper-V workloads. Replay Manager 8 still acts as a Windows Server 2012 R2 requestor (backup software) and hence isn’t as fast and effective as it could be. I do not expect SCOS version 7.4 to make a difference in this. If you leverage the hardware VSS provider with backup software that does support the Windows Server 2016/2019 backup mechanisms for Hyper-V, this is not an issue. For that I mostly leverage Veeam Backup & Replication.

It is a missed opportunity, unless I am missing something here, that after so many years the Replay Manager requestor still does not support the native Windows Server 2016/2019 Hyper-V backup capabilities. And once again, I didn’t even mention the fact that the individual Hyper-V VM backups need modernization in Replay Manager to deal with VM mobility. See https://blog.workinghardinit.work/2017/06/02/testing-compellent-replay-manager-7-8/. I also won’t mention Live Volumes as I did then. As I now leverage Replay Manager as a secondary backup method, not a primary one, I can live with this. But it could be so much better. I really need a chat with the PM for Replay Manager. Maybe at Dell Technologies World 2020, if I can find a sponsor for the long haul flight.

Latency kills

Introduction

I was investigating a very problematic Windows Server 2016 Hyper-V cluster. That cluster was performing horribly. “Everything” was hanging, stalling or crashing, RHS.exe errors were flying around, and WER dumps got created by the dozen. Things were extremely slow, up to the point functionality was just failing. The “fun” thing was that the cluster validation wizard, while slow, gave that cluster a big thumbs up and a supported status, as if all was well.

Prying around

Time to pry around a bit and see if we could find something wrong. We saw live migrations stall, fail, last forever in pending, or get stuck at a certain percentage, sometimes finally succeeding with ridiculous blackout times. We could not open virtual machine properties, or only very slowly. The FCI GUI was highly unresponsive, but so were the Hyper-V Manager GUI and even PowerShell. Those were hanging at merely loading the virtual machines or enumerating them with Get-VM. Everything was slow to the point it timed out or crashed. Restarting the services (Cluster, Hyper-V) didn’t do anything and restarting VMMS was super slow or just got stuck. It was a depressing sight for which people tended to blame Hyper-V / Microsoft.

As the title gives away it was latency. Not just ordinary high latency. Real bad latency. That kind of latency kills. Extreme latency produces symptoms that are similar to bugs or corrupt components of roles and features. We have a tendency to look at those first in the event logs and then we look at the network and its usual suspects (VMQ, SET, DCB). But nothing pointed to an issue that I could find.

So, storage maybe? Well, we did find one Hyper-V host in the cluster with one HBA port producing too many errors, so we disabled that FC port for testing. No joy; after a clean reboot of all nodes the Hyper-V cluster remained problematic. So, on to the storage array itself.

Well, holy smoke! On the two CSV volumes in that cluster we saw latencies that were so bad I could not even believe a single VM would boot. It actually made my appreciation for Hyper-V and clustering grow, as they managed to do at least a couple of things. With such latencies I would expect the services to just crash and call it a day.

[Image: The horrific latency on one of the CSV LUNs.]
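If you want to confirm this kind of pain from the hosts themselves rather than from the array, the standard disk latency counters tell the same story. Below is a minimal sketch, assuming hypothetical node names; anything consistently above a few tens of milliseconds on a CSV deserves attention.

#Minimal sketch (hypothetical node names): flag disk latency above 50 ms on the cluster nodes
$Nodes = 'NODE-A', 'NODE-B'
$Counters = '\PhysicalDisk(*)\Avg. Disk sec/Read', '\PhysicalDisk(*)\Avg. Disk sec/Write'
Get-Counter -ComputerName $Nodes -Counter $Counters -SampleInterval 5 -MaxSamples 12 |
    ForEach-Object {
        $_.CounterSamples | Where-Object { $_.CookedValue -gt 0.05 } |
            Select-Object TimeStamp, Path, @{N = 'LatencyMs'; E = { [math]::Round($_.CookedValue * 1000, 1) } }
    }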

Looking at the logs, we saw that the latencies occurred on the FC HBAs of the controllers. Each one was above 50ms, peaking at 150-250ms, with one huge peak at almost 500ms. We saw this on all four HBAs.

[Image: The latency on one of the 4 FC HBAs on one of the controllers. Not a good day. All HBAs had high latencies like this.]

The issues were not at the host level (host HBAs) and not even at the IOPS/bandwidth level of the storage itself. The latency for some reason was spiking. Further investigation led to the conclusion that the issue was related to synchronous replication going totally wrong. Moving the replication mode to asynchronous fixed that. We’re now investigating why this happened and how to prevent it from happening again. But that’s another story.

[Image: Latency on one of the 4 FC HBAs on one of the controllers after we fixed the issue.]

Do not assume anything

So, there you go. Everything depends on everything in some direct or indirect way. It’s all connected, and that, my friends, is why I’m a proponent of “service resilience engineering” where the responsible team owns the entire stack. That is how you can act fast.

Windows Server 2016 RDMA and the Hyper-V vSwitch – Part I

Introduction

With Windows Server 2012 R2, using both RDMA and the Hyper-V vSwitch on the same host required separate physical network adapters (pNICs). There are 2 reasons for this.

  • First, a vSwitch is generally created on top of a native Windows NIC team. Such a NIC team does not expose RDMA capabilities.
  • Second, in Windows Server 2012 R2 you cannot expose RDMA capabilities via a vSwitch, even when you are using a non-teamed, RDMA-capable NIC.

As a result, the need for RDMA required more NICs on the Hyper-V hosts, and/or a fully converged design had some serious drawbacks. As servers have become quite capable and our VMs serve ever more intensive workloads, this was not dramatic. Leveraging 2*10Gbps for a vSwitch and 2*10Gbps for redundant RDMA / SMB Direct traffic has long been one of my favorite designs. It leaves room for other traffic, such as backups, and it allows for high VM density. But with 40Gbps NICs that is overkill and a tad expensive in many scenarios, even when connecting to a SOFS share for Hyper-V storage, so 4*40Gbps on a Hyper-V host is not something I ever saw in real life.

Windows Server 2016 can expose RDMA capabilities via a vSwitch even without SET

What many people seem to have missed is that reason 2 is gone in Windows Server 2016 Hyper-V. Reason 1 still holds true, but that has been solved by Switch Embedded Teaming (SET). This means that you actually do not need SET to leverage RDMA with a vSwitch in Windows Server 2016 Hyper-V. You can do this as follows:

#Create a vSwitch
New-VMSwitch -Name RDMACapable-vSwitch -NetAdapterName "NODE-A-S4P1-SW12P05-SMB1"

#Now add a host vNIC for the SMB Direct Traffic
Add-VMNetworkAdapter -SwitchName RDMACapable-vSwitch -Name SMB1 -ManagementOS

#Enable RDMA on it
Enable-NetAdapterRDMA "vEthernet (SMB1)"

#Grab that vNIC on the management OS and set the VLAN - PFC requires tagged VLANs
$NicSMB1 = Get-VMNetworkAdapter -Name SMB1 -ManagementOS
Set-VMNetworkAdapterVLAN -VMNetworkAdapter $NicSMB1 -Access -VlanID 110


Below is what this looks like. We have one vNIC on the management OS leveraging RDMA/SMB Direct, consuming all 10Gbps of the NIC we connected to the vSwitch. This is a nice lab demo, but you can see this perhaps isn’t the best idea in real life.

[Image: RDMA/SMB Direct traffic over the single vNIC on the management OS.]

Other things to note

Do realize this still requires the pNIC to be RDMA capable. This is not some sort of soft RoCE or other software RDMA magic as of today. The pNIC also has to have RDMA enabled, or the virtual NIC won’t be able to leverage RDMA and will fall back to SMB (Multichannel only) instead of SMB Direct. Likewise, RDMA has to be enabled on the vNIC as well. So don’t forget: RDMA must be enabled on both the pNIC and the vNIC for this to work.
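A quick way to verify both ends is to check the RDMA state of the pNIC and the vNIC and then confirm that SMB actually sees an RDMA-capable interface. A minimal sketch, assuming the adapter names from the example above:

#Check that RDMA is enabled on both the pNIC and the host vNIC
Get-NetAdapterRdma -Name 'NODE-A-S4P1-SW12P05-SMB1', 'vEthernet (SMB1)' | Format-Table Name, Enabled

#Confirm SMB sees the interface as RDMA capable (this is what SMB Direct will use)
Get-SmbClientNetworkInterface | Where-Object RdmaCapable | Format-Table FriendlyName, RdmaCapable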

DCB’s PFC/ETS requires a tagged VLAN to carry the priority, so don’t forget to tag the vNIC. There is actually no need to tag the pNIC as long as the switch port has the tagged VLAN set – most likely as a trunk or in general mode. If you don’t tag consistently across the entire network stack you’ll have network issues anyway, and RDMA performance will be bad if it works at all.

Finally, don’t forget that this example is not using VMM / Network Controller and as such uses Set-VMNetworkAdapterVlan and not Set-VMNetworkAdapterIsolation.

In real life, we need better and more than a single NIC vSwitch

The caveat here is that, while you have a converged setup, you have no redundancy for the vSwitch (there is no team). This also means that you’re limited to the throughput of a single NIC for that vSwitch. Depending on the needs of the solution that might be perfectly fine. If it’s not – and in most real-world scenarios you’ll need redundancy – you have to use SET in a converged scenario. That’s what we’ll take a look at in part 2. Then there is the question of QoS, as you don’t want SMB Direct traffic to consume too much bandwidth at will. That’s still another issue to discuss and address.
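To give a flavour of where part 2 is going, a SET-based converged vSwitch is created along these lines. This is only a minimal sketch with assumed switch and pNIC names, not the full converged design we will discuss in part 2:

#Minimal sketch (assumed names): a SET vSwitch teams the members and can expose RDMA on host vNICs
New-VMSwitch -Name 'ConvergedSETSwitch' -NetAdapterName 'NODE-A-S4P1', 'NODE-A-S4P2' -EnableEmbeddedTeaming $true -AllowManagementOS $false

#Add host vNICs for SMB Direct and enable RDMA on them, just like in the single NIC example
Add-VMNetworkAdapter -SwitchName 'ConvergedSETSwitch' -Name 'SMB1' -ManagementOS
Add-VMNetworkAdapter -SwitchName 'ConvergedSETSwitch' -Name 'SMB2' -ManagementOS
Enable-NetAdapterRDMA -Name 'vEthernet (SMB1)', 'vEthernet (SMB2)'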