Configuring an interface bond in a Ubuntu Hyper-V guest

Introduction

In this post, we take a look at configuring an interface bond in an Ubuntu Hyper-V guest. But first, a quick word about NIC teaming and Hyper-V. In real life, teaming is most often done on physical hardware. But in the lab, or for some edge production cases, you might want to use it in virtual machines. The use case here is virtual machines used for testing and knowledge transfer. We are teaching how to create Veeam Backup & Replication hardened repositories with XFS and immutability. In that lab, we are emulating a NIC team on hardware servers.

When you need redundant, highly available networking for your Hyper-V guests, you normally create a NIC team on the host. You then use that NIC team to make your vSwitch. You can use a traditional LBFO team (deprecated) or a SET switch. The latter is the current technology and the way forward. But in this lab scenario, I am using LBFO, the native Windows NIC teaming.

99.9% of all use cases will use teaming on the Hyper-V host

Host teaming provides bandwidth aggregation, redundancy, and failover. In 99.99% of cases, you do not mess around with NIC teaming in the guest. Below we see a figure showing guest teaming. You need to use two physical NICs for genuine redundancy, each with its own virtual switch and uplinked to a separate physical switch. Beware that only switch independent teaming is supported in the guest OS, so configure the switches and switch ports accordingly.

Hyper-V in guest NIC teaming

In-guest teaming is rarely used for production workloads, bar some exceptions with SR-IOV, but that is another discussion. However, you might have a valid reason to use NIC teaming for lab work, testing, documenting configurations, teaching, etc. Luckily, that is easy to do. Hyper-V has a setting for your vNICs that enables them to be functional members of a NIC team in a Windows guest OS, as long as that OS supports native teaming. That is the case for Windows Server 2012 and later.

NIC teaming inside a Hyper-V Guest

For each vNIC member of the NIC team in the guest, you must check “Enable this network adapter to be part of a team in the guest operating system”. There is nothing more to it. The big caveat here is that each member must reside on a different external vSwitch for failover to work correctly. Otherwise, when failing over, you will see a “The virtual switch lacks external connectivity” error on the remaining member, along with packet loss.

Enable NIC teaming in the Hyper-V settings on the vNICs that are going to be team members
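The same checkbox can also be set from PowerShell on the Hyper-V host. What follows is a minimal sketch, not a finished script. The VM name and the vNIC names are hypothetical placeholders. Set-VMNetworkAdapter and its -AllowTeaming parameter come from the Hyper-V PowerShell module.

```powershell
# Hypothetical names: replace these with your own VM and vNIC names.
$VMName = 'WinGuest01'
$TeamMembers = 'Team-NIC1', 'Team-NIC2'

foreach ($NicName in $TeamMembers) {
    # Equivalent of checking "Enable this network adapter to be part of a team
    # in the guest operating system" in the vNIC's advanced features.
    Set-VMNetworkAdapter -VMName $VMName -Name $NicName -AllowTeaming On
}

# Show the result for every vNIC of the VM.
Get-VMNetworkAdapter -VMName $VMName | Select-Object Name, SwitchName, AllowTeaming
```

Run this in an elevated PowerShell session on the host that owns the virtual machine.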

There is nothing more you need to make it work perfectly in a Windows guest VM. As you can see in the image below, both my LAN NIC and the NIC team get an address from the DHCP server.

Functional team in the virtual machine. Do test failover to make sure you got it right.

That’s great. But sometimes, I need to have a NIC team inside a Linux guest virtual machine. For example, recently, on Ubuntu 20.04, I went through my typical motions to set up in-guest NIC teaming, or bonding in Linux speak. But, much to my surprise, the bond in my Ubuntu 20.04 guest did not get an IP address from my DHCP server. So, what could be the cause?

Configuring an interface bond in a Ubuntu Hyper-V guest

In Ubuntu, we use netplan to configure our networking and in the image below you can see a sample configuration.

A minimal bond configuration in Ubuntu

I have created a bond using eth0 and eth1, and we should get an IP address from DHCP. The bonding mode is balance-rr. But why am I not getting an IP address? I did check the option “Enable this network adapter to be part of a team in the guest operating system” on both member vNICs.

Well, let’s look at the NIC interfaces and the bond. There we see something interesting.

Note that the bond and its member interfaces have the same MAC address, which does not come from the Hyper-V host pool

Note that the bond has a MAC address that is the same as both member interfaces. Also, note that this MAC address does not come from the Hyper-V host MAC address pool and is not what is assigned to the vNIC by Hyper-V as you can see in the image below! That is the big secret.

With MAC addresses unknown to the hypervisor, this smells of something that requires MAC spoofing, doesn’t it? So, I enabled it, and guess what? Bingo!

So what is the difference with Windows when configuring an interface bond in a Ubuntu Hyper-V guest?

The difference with Windows is that an interface bond in an Ubuntu Hyper-V guest requires MAC address spoofing. You have to enable MAC spoofing on both member vNICs of the Ubuntu virtual machine bond. The moment you do that, you will see the bond get a DHCP address and network connectivity. But why is this needed? In Ubuntu (or Linux in general), the bond interface and its members have a generated MAC address assigned. The bond does not take one of the MAC addresses of the member vNICs. So, we need MAC spoofing enabled on both member vNICs in the Hyper-V settings for this to work! In a Windows guest, the LBFO team gets one of the MAC addresses of its member vNICs assigned. As such, this does not require MAC spoofing.
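MAC address spoofing can be enabled in the vNIC's advanced features in the GUI or from PowerShell on the host. The sketch below assumes a VM called 'UbuntuLab01' whose bond members are the vNICs 'Bond-NIC1' and 'Bond-NIC2'. All three names are made up for illustration. The -MacAddressSpoofing parameter of Set-VMNetworkAdapter is part of the Hyper-V module.

```powershell
# Hypothetical names: replace these with your own VM and vNIC names.
$VMName = 'UbuntuLab01'
$BondMembers = 'Bond-NIC1', 'Bond-NIC2'

foreach ($NicName in $BondMembers) {
    # Allow the guest to send and receive frames with a MAC address that
    # differs from the one Hyper-V assigned to this vNIC.
    Set-VMNetworkAdapter -VMName $VMName -Name $NicName -MacAddressSpoofing On
}

# Compare the Hyper-V assigned MAC addresses with what the guest reports for the bond.
Get-VMNetworkAdapter -VMName $VMName | Select-Object Name, MacAddress, MacAddressSpoofing
```

The MacAddress column shows what Hyper-V handed out, which makes it easy to confirm that the MAC address on the bond inside the guest is a different one.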

With Ubuntu (Linux), you don’t even have to check “Enable this network adapter to be part of a team in the guest operating system” on the member vNICs. Note that a guest Linux bond does not need every member interface on a separate vSwitch for failover to work, not even if you check “Enable this network adapter to be part of a team in the guest operating system.” However, putting all members on the same vSwitch is still ill-advised when you want real redundancy and failover.

Set the Hyper-V volume-specific settings in Veeam with PowerShell

When adding and configuring Hyper-V servers to Veeam you can set the Hyper-V volume-specific settings in Veeam with PowerShell or in the GUI.

  1. Select what VSS provider to use (Windows native VSS or a Hardware VSS provider)
  2. Configure the maximum number of concurrent snapshots to allow for the volume

I will show how to set the Hyper-V volume-specific settings in Veeam with PowerShell. But first, let’s remind ourselves of what they are used for.

The first one is easy. You will use the Windows native VSS unless you have a hardware VSS provider installed and configured. These come from your storage array vendor. Hardware VSS providers are only available for volumes that are provided by that storage array. If you don’t set them manually, Veeam scans your host and picks the best option. It does so based on the type of volume and whether a hardware VSS provider is available.

The second option’s meaning depends on the version of Windows and also on whether you leverage a hardware VSS provider or not. You see, the value of the maximum number of concurrent snapshots doesn’t always result in the behavior you might expect.

Let’s look at the documentation

I invite you to read the Veeam documentation on this subject. Below you will find an excerpt with my annotations.

Follow the link for each option to learn more in the online Veeam documentation.

  1. For Microsoft Hyper-V 2012 R2 and earlier, the default is set to simultaneously store 4 snapshots of one volume. To change this number, specify the Max snapshots value. It is not recommended that you increase the number of snapshots for slow storage. Many snapshots existing at the same time may cause VM processing failures.
  2. For Microsoft Hyper-V Server 2016 and later, you can simultaneously store 4 VM checkpoints on one volume. To change this number, specify the Max snapshots value. Note that this limitation works only for (recovery) checkpoints created during Veeam Backup & Replication data protection tasks. When you still use a host VSS provider in your backup process (with a SAN hardware VSS provider, combined with off-host Hyper-V proxies), this acts like before. It will not limit the number of concurrent VM backup jobs. That only happens when the Hyper-V recovery checkpoints are the only thing in play. This means that for an S2D or Azure Stack HCI solution, for example, you will need to increase this value if you want to have more than 4 VMs backed up simultaneously on that volume, no matter how many concurrent tasks you set on your Hyper-V hosts and repositories. By the way, remember that a task does not equal a VM but a disk per VM / backup file per VM. In a simple example with nothing else in play, this means that 16 tasks can be 4 VMs if those VMs all happen to have 4 disks, etc.
The default setting for maximum concurrent snapshots is 4
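To make the interplay between tasks and snapshots concrete, here is a back-of-the-envelope sketch. It is a deliberately simplified model and not how the Veeam scheduler actually works: it assumes one task per virtual disk, assumes a VM only starts when all of its disks fit in the remaining tasks, and ignores every other limit. The function name Get-ConcurrentVMCount is made up for this example.

```powershell
# Simplified model (an assumption, not Veeam's real scheduler):
# a VM starts only if the per-volume snapshot limit has not been reached
# and all of its disks (1 task per disk) still fit in the task limit.
function Get-ConcurrentVMCount {
    param(
        [int[]]$DisksPerVM,          # number of virtual disks of each VM, in processing order
        [int]$MaxConcurrentTasks,    # task limit set in Veeam
        [int]$MaxSnapshots           # max concurrent snapshots set on the volume
    )
    $TasksUsed = 0
    $VMCount = 0
    foreach ($Disks in $DisksPerVM) {
        if ($VMCount -ge $MaxSnapshots) { break }
        if (($TasksUsed + $Disks) -gt $MaxConcurrentTasks) { break }
        $TasksUsed += $Disks
        $VMCount++
    }
    $VMCount
}

# Five VMs with 4 disks each, 16 tasks, 8 snapshots: the tasks are the bottleneck.
Get-ConcurrentVMCount -DisksPerVM @(4, 4, 4, 4, 4) -MaxConcurrentTasks 16 -MaxSnapshots 8

# Ten single-disk VMs, 16 tasks, default of 4 snapshots: the snapshot limit is the bottleneck.
Get-ConcurrentVMCount -DisksPerVM @(1, 1, 1, 1, 1, 1, 1, 1, 1, 1) -MaxConcurrentTasks 16 -MaxSnapshots 4
```

In this model both calls return 4, for different reasons: the first runs out of tasks, the second hits the snapshot limit on the volume.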

Now that we have that out of the way: I find it tedious to do all this in the GUI, especially in larger environments and during testing in the lab or prior to taking a solution into production. There can be many hosts and even more volumes to configure. This is why I set the Hyper-V volume-specific settings (and other configurations) in Veeam with PowerShell.

How to set the Hyper-V volume-specific settings in Veeam with PowerShell

So here I will share how to do this in PowerShell. It is not very difficult. The snippet below is the crux of what you need to integrate into your own scripts. In it, I grab all the volumes on all the nodes of a cluster and set the MaxSnapshots value to 8. Run a Hyper-V backup job against those CSVs with 10 single-disk VMs. You’ll see we can now have up to 8 VMs being backed up concurrently instead of 4.

I am also showing how to set the VSS provider. Warning: PowerShell will let you set a wrong provider. The GUI protects against that, so pay attention here.

#Grab the cluster whose nodes' volumes we want to configure
$Cluster = Get-VBRServer -Name 'W2K19-LAB.datawisetech.corp' -Type HvCluster

#Grab the correct Hyper-V hosts based on the ParentId (the cluster they belong to)
$ClusterNodes = Get-VBRServer -Type HvServer | Where-Object ParentId -eq $Cluster.Id

foreach ($ClusterNode in $ClusterNodes) {
    $ServerVolumes = Get-VBRHvServerVolume -Server $ClusterNode.Name
    $Provider = Get-VBRHvVssProvider -Server $ClusterNode.Name -Name "Microsoft CSV Shadow Copy Provider"
    foreach ($Volume in $ServerVolumes) {
        #Only touch the Cluster Shared Volumes
        if ($Volume.Type -eq "CSV") {
            Set-VBRHvServerVolume -Volume $Volume -MaxSnapshots 8 -VSSProvider $Provider
        }
    }
}
Only the CSV volumes have had their maximum concurrent snapshots increased to 8.

Conclusion

I have shown you how to set the Hyper-V volume-specific settings in Veeam with PowerShell (VSS provider and max concurrent snapshots).

The max concurrent snapshots value is not the only setting determining how many VMs you can back up concurrently in one job. But it is an important one to know about when leveraging recovery checkpoints. You also need to mind max concurrent tasks.

Every virtual disk being backed up counts as a task. So a virtual machine with 3 disks will consume 3 tasks out of the max concurrent tasks you have set on the backup proxy. Don’t go overboard. Count cores when determining how to set these values. Also, remember that taking it easy to speed things up is a rule in backups. There is no speed gained by trying to do more than your cores can handle or, when you have plenty of cores, by depleting the IOPS on your storage.

I will show you how to configure those with PowerShell in future blog posts.

TechNine SMB Technology User Group

Introduction

The TechNine SMB Technology User Group has a meetup on December the 4th 2019. It will be the last event of 2019.

Ingram Micro is hosting the event at Hermeslaan 1b / 3rd floor, B-1831 Diegem. Thanks for this.

There are 3 speakers who will share some of their insights with you.

Hyper-V backups. The good, the bad, the ugly

This is the session I am giving. I presented this before and I have found that it still delivers a lot of value and insights to people. Every time it has helped some attendees out. We’ll discuss how Hyper-V backups have improved and why. I will also share some insights into what can trip you up with Windows Server 2016/2019.

7 Habits every Azure Admin Must Have

Wim Matthyssen, a well-known Azure specialist at Cegeka and an experienced speaker, delivers this one. It will help you be a better Azure admin. As he is a Microsoft Certified Trainer, his teaching talents are well developed. We can only benefit from this.

The mystery session

Diego Lens is an experienced trainer, speaker, and Citrix expert. He works as a Cloud Technology Strategist and he will bring us a mystery session. Actually, it is such a mystery I honestly have no idea what it is all about. You’ll have to attend to find out! A talk about Azure Migrate: this is “the” tool for migrating workloads to Azure. Is this the forklift for lift and shift or is there more to know? Come and find out!

TechNine SMB Technology User Group event on December 4th 2019

Calendar:
18h00: Welcome & Food
18h30: Hyper-V backups. The good, the bad, the ugly – Didier Van Hoye
19h15: 7 habits every Azure admin must have – Wim Matthyssen
20h00: Break
20h15: Mystery session 😊 – Diego Lens
21h00: Networking and Questions
21h30: End
When: Wednesday 4/12 at 18:00
Where: Ingram Micro, Hermeslaan 1b / 3rd floor, B-1831 Diegem

Register

If you are interested in attending, just navigate to the TechNine SMB Usergroup website, read up on the other sessions, and register. It is as easy as that. If you don’t even have the time for that, the button below takes you directly to the Eventbrite site for registration.

I hope to see you at the TechNine SMB Technology User Group on December 4th 2019.

Virtual switch QoS mode during migrations

Introduction

In “Shared nothing live migration with a virtual switch change and VLAN ID configuration” I published a sample script. The script works well, but there are two areas for improvement. The first one is covered in “Checkpoint references a non-existent virtual switch”. This post is about the second one. Here I show that I also need to check the virtual switch QoS mode during migrations. A couple of the virtual machines on the source nodes had an absolute minimum and/or maximum bandwidth set. On the target nodes, all the virtual switches are created by PowerShell. This defaults to weight mode for QoS, which is the more sensible option, albeit not always the easiest or most practical one for people to use.

Virtual switch QoS mode during migrations

First, a quick recap of what we are doing. The challenge was to shared nothing live migrate virtual machines to a host with different virtual switch names and VLAN IDs. We did so by adding dummy virtual switches to the target host. This made shared nothing live migration possible. On arrival of the virtual machine on the target host, we immediately connect the virtual network adapters to the final virtual switch and set the correct VLAN IDs. That works very well. You drop 1 or at most 2 pings, which is as good as it gets.

This goes wrong under the following conditions:

  • The source virtual switch has QoS mode absolute.
  • Virtual network adapter connected to the source virtual switch has MinimumBandwidthAbsolute and/or MaximumBandwidth set.
  • The target virtual switch has QoS mode weighted.

This will cause connectivity loss as you cannot set absolute values on a virtual network adapter attached to a weighted virtual switch. So connecting the virtual network adapter to the new virtual switch just fails and you lose connectivity. Remember that the virtual machine is connected to a dummy virtual switch just to make the live migration work and we need to swap it over immediately. The VLAN ID does actually get set correctly. Let’s deal with this.

Steps to fix this issue

First of all, we adapt the script to check the QoS mode on the source virtual switches. If it is set to absolute we know we need to check for any settings of MinimumBandwidthAbsolute and MaximumBandwidth on the virtual adapters connected to those virtual switches. These changes are highlighted in the demo code below.

Secondly, we adapt the script to check every virtual network adapter for its bandwidth management settings. If we find configured MinimumBandwidthAbsolute and MaximumBandwidth values we set these to 0 and as such disable the bandwidth settings. This makes sure that connecting the virtual network adapters to the new virtual switch with QoS mode weighted will succeed. These changes are highlighted in the demo code below.
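Before migrating, it can help to see which virtual network adapters would trip over this. The helper below is a small sketch of the check described above, separate from the demo script. The function name Test-AbsoluteBandwidthSet is invented for this example. It only reads the BandwidthSetting.MaximumBandwidth and BandwidthSetting.MinimumBandwidthAbsolute properties, the same ones the script inspects.

```powershell
# Returns $true when a virtual network adapter carries absolute bandwidth
# settings that need to be reset before it connects to a weighted virtual switch.
# Hypothetical helper: the name is made up for this example.
function Test-AbsoluteBandwidthSet {
    param([Parameter(Mandatory)] $VMNetworkAdapter)

    $Setting = $VMNetworkAdapter.BandwidthSetting
    # No bandwidth management configured at all.
    if ($null -eq $Setting) { return $false }
    return ($Setting.MaximumBandwidth -gt 0) -or ($Setting.MinimumBandwidthAbsolute -gt 0)
}

# Usage on a Hyper-V host ('NODE-A' is the source node from the demo script):
# Get-VMSwitch -ComputerName 'NODE-A' | Select-Object Name, BandwidthReservationMode
# Get-VMNetworkAdapter -ComputerName 'NODE-A' -VMName * |
#     Where-Object { Test-AbsoluteBandwidthSet $_ } |
#     Select-Object VMName, Name, SwitchName
```

The commented usage lines list the QoS mode of every virtual switch on the source node and then every vNIC that has an absolute minimum or maximum bandwidth configured.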

Finally, the complete script

#The source Hyper-V host
$SourceNode = 'NODE-A'
#The LUN where you want to storage migrate your VMs away from
$SourceRootPath = "C:\ClusterStorage\Volume1*"

#The target Hyper-V host
$TargetNode = 'ZULU'
#The storage path where you want to storage migrate your VMs to
$TargetRootPath = "C:\ClusterStorage\Volume1"

$OldVirtualSwitch01 = 'vSwitch-VLAN500'
$OldVirtualSwitch02 = 'vSwitch-VLAN600'
$NewVirtualSwitch = 'ConvergedVirtualSwitch'
$VlanId01 = 500
$VlanId02 = 600
  
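#Check the QoS mode of both source virtual switches (they live on the source node)
if ((Get-VMSwitch -ComputerName $SourceNode -Name $OldVirtualSwitch01).BandwidthReservationMode -eq 'Absolute') {
    $OldVirtualSwitch01QoSMode = 'Absolute'
}
if ((Get-VMSwitch -ComputerName $SourceNode -Name $OldVirtualSwitch02).BandwidthReservationMode -eq 'Absolute') {
    $OldVirtualSwitch02QoSMode = 'Absolute'
}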
    
#Grab all the VM we find that have virtual disks on the source CSV - WARNING for W2K12 you'll need to loop through all cluster nodes.
$AllVMsOnRootPath = Get-VM -ComputerName $SourceNode | where-object { $_.HardDrives.Path -like $SourceRootPath }

#We loop through all VMs we find on our SourceRootPath
ForEach ($VM in $AllVMsOnRootPath) {
    #We generate the final VM destination path
    $TargetVMPath = $TargetRootPath + "\" + ($VM.Name).ToUpper()
    #Grab the VM name
    $VMName = $VM.Name
    $VM.VMid
    $VMName

    #If the VM is still clustered, remove it from the cluster as shared nothing live migration will otherwise fail.
    if ($VM.isclustered -eq $True) {
        write-Host -ForegroundColor Magenta $VM.Name "is clustered and is being removed from cluster"
        Remove-ClusterGroup -VMId $VM.VMid -Force -RemoveResources
        Do { Start-Sleep -seconds 1 } While ($VM.isclustered -eq $True)
        write-Host -ForegroundColor Yellow $VM.Name "has been removed from cluster"
    }
    #If the VM has checkpoints, notify the user of the script as this will cause issues after switching to the new virtual
    #switch on the target node. Live migration will fail between cluster nodes if the checkpoints reference 1 or more
    #non-existent virtual switches. These must be removed prior to or after completing the shared nothing migration.
    #The script does this after the migration automatically, not before, as I want the VM to be untouched if the shared nothing
    #migration fails.

    $checkpoints = Get-VMCheckpoint -ComputerName $SourceNode -VMName $VM.Name

    if ($Null -ne $checkpoints) {
        write-host -foregroundcolor yellow "This VM has checkpoints"
        write-host -foregroundcolor yellow "This VM will be migrated to the new host"
        write-host -foregroundcolor yellow "Only after a successful migration will ALL the checkpoints be removed"
    }
    
    #Do the actual storage migration of the VM, $TargetVMPath creates the default subfolder structure
    #for the virtual machine config, snapshots, smart paging & virtual hard disk files.
    Move-VM -Name $VMName -ComputerName $VM.ComputerName -IncludeStorage -DestinationStoragePath $TargetVMPath -DestinationHost $TargetNode
    
    $MovedVM = Get-VM -ComputerName $TargetNode -Name $VMName

    $vNICOnOldvSwitch01 = Get-VMNetworkAdapter -ComputerName $TargetNode -VMName $MovedVM.VMName | where-object SwitchName -eq $OldVirtualSwitch01
    if ($Null -ne $vNICOnOldvSwitch01) {
        foreach ($VMNetworkAdapter in $vNICOnOldvSwitch01) {
            if ($OldVirtualSwitch01QoSMode -eq 'Absolute') {
                #Absolute bandwidth values cannot be used on a weighted virtual switch, so reset them to 0
                if ($VMNetworkAdapter.BandwidthSetting.MaximumBandwidth -gt 0) {
                    write-host -foregroundcolor cyan "Network adapter $($VMNetworkAdapter.Name) of VM $VMName MaximumBandwidth will be reset to 0."
                    Set-VMNetworkAdapter -Name $VMNetworkAdapter.Name -VMName $MovedVM.Name -ComputerName $TargetNode -MaximumBandwidth 0
                }
                if ($VMNetworkAdapter.BandwidthSetting.MinimumBandwidthAbsolute -gt 0) {
                    write-host -foregroundcolor cyan "Network adapter $($VMNetworkAdapter.Name) of VM $VMName MinimumBandwidthAbsolute will be reset to 0."
                    Set-VMNetworkAdapter -Name $VMNetworkAdapter.Name -VMName $MovedVM.Name -ComputerName $TargetNode -MinimumBandwidthAbsolute 0
                }
            }

            write-host 'Moving to correct vSwitch'
            Connect-VMNetworkAdapter -VMNetworkAdapter $VMNetworkAdapter -SwitchName $NewVirtualSwitch
            write-host "Setting VLAN $VlanId01"
            Set-VMNetworkAdapterVlan -VMNetworkAdapter $VMNetworkAdapter -Access -VLANid $VlanId01
        }
    }

    $vNICsOnOldvSwitch02 = Get-VMNetworkAdapter -ComputerName $TargetNode -VMName $MovedVM.VMName | where-object SwitchName -eq $OldVirtualSwitch02
    if ($NULL -ne $vNICsOnOldvSwitch02) {
        foreach ($VMNetworkAdapter in $vNICsOnOldvSwitch02) {
            if ($OldVirtualSwitch02QoSMode -eq 'Absolute') {
                #Absolute bandwidth values cannot be used on a weighted virtual switch, so reset them to 0
                if ($VMNetworkAdapter.BandwidthSetting.MaximumBandwidth -gt 0) {
                    write-host -foregroundcolor cyan "Network adapter $($VMNetworkAdapter.Name) of VM $VMName MaximumBandwidth will be reset to 0."
                    Set-VMNetworkAdapter -Name $VMNetworkAdapter.Name -VMName $MovedVM.Name -ComputerName $TargetNode -MaximumBandwidth 0
                }
                if ($VMNetworkAdapter.BandwidthSetting.MinimumBandwidthAbsolute -gt 0) {
                    write-host -foregroundcolor cyan "Network adapter $($VMNetworkAdapter.Name) of VM $VMName MinimumBandwidthAbsolute will be reset to 0."
                    Set-VMNetworkAdapter -Name $VMNetworkAdapter.Name -VMName $MovedVM.Name -ComputerName $TargetNode -MinimumBandwidthAbsolute 0
                }
            }
            write-host 'Moving to correct vSwitch'
            Connect-VMNetworkAdapter -VMNetworkAdapter $VMNetworkAdapter -SwitchName $NewVirtualSwitch
            write-host "Setting VLAN $VlanId02"
            Set-VMNetworkAdapterVlan -VMNetworkAdapter $VMNetworkAdapter -Access -VLANid $VlanId02
        }
    }

    #If the VM has checkpoints, this is when we remove them.
    $checkpoints = get-vmcheckpoint -ComputerName $TargetNode -VMName $MovedVM.VMName

    if ($Null -ne $checkpoints) {
        write-host -foregroundcolor yellow "This VM has checkpoints and they will ALL be removed"
        $CheckPoints | Remove-VMCheckpoint 
    }
}

Below is the output of a VM where we had to change the switch name, enable a VLAN ID, deal with absolute QoS settings and remove checkpoints. All this without causing downtime. Nor did we change the original virtual machine in case the shared nothing migration fails.

Some observations

The fact that we are using PowerShell is great. You can only set weighted bandwidth limits via PowerShell. The GUI is only for absolute values and it will throw an error trying to use it when the virtual switch is configured as weighted.

This means you can embed setting the weights in your script if you so desire. If you do, read up on how to handle this best. Trying to juggle the weight settings so they total 100 in a dynamic environment is a bit of a challenge. So use the default flow pool and keep the number of virtual network adapters with unique settings to a minimum.
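As an illustration of what embedding the weights could look like, here is a minimal sketch. The VM name and both weight values are arbitrary examples and not a recommendation. The target node and switch name are the ones used in the demo script. -DefaultFlowMinimumBandwidthWeight on Set-VMSwitch and -MinimumBandwidthWeight on Set-VMNetworkAdapter are existing Hyper-V module parameters.

```powershell
$TargetNode = 'ZULU'
$NewVirtualSwitch = 'ConvergedVirtualSwitch'
# Hypothetical VM name, used for illustration only.
$VMName = 'VM01'

# Reserve a share for the default flow, which covers every virtual network
# adapter that has no weight of its own. The value 50 is an arbitrary example.
Set-VMSwitch -ComputerName $TargetNode -Name $NewVirtualSwitch -DefaultFlowMinimumBandwidthWeight 50

# Give the adapters of one VM that really need it their own weight (1 to 100; 0 disables it).
Set-VMNetworkAdapter -ComputerName $TargetNode -VMName $VMName -MinimumBandwidthWeight 10

# Review what is configured on the VM's adapters.
Get-VMNetworkAdapter -ComputerName $TargetNode -VMName $VMName |
    Select-Object VMName, Name, @{ Name = 'MinimumBandwidthWeight'; Expression = { $_.BandwidthSetting.MinimumBandwidthWeight } }
```

Keeping most adapters in the default flow means there are only a handful of weights to keep in balance.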

Conclusion

To avoid downtime, we removed all the minimum and maximum bandwidth settings on any virtual network adapter. By doing so, we ensured that the swap to the new virtual switch right after the successful shared nothing live migration will succeed. If you want, you can set weights on the virtual network adapters afterward. But as the bandwidth on these new hosts is now a redundant 25 Gbps, the need was no longer there. As a result, we just left them without. This can always be configured later if it turns out to be needed.

Warning: this is a demo script. It lacks error handling and logging. It can also contain mistakes. But hey, you get it for free to adapt and use. Test and adapt this in a lab. You are responsible for what you do in your environments. Running scripts downloaded from the internet without any validation makes you a certified nut case. That is not my wrongdoing.

I hope this helps some of you. Thanks for reading.