Correcting the permissions on the folder with VHDS files & checkpoints for host level Hyper-V guest cluster backups

Introduction

It’s no secret that guest clustering with VHD Sets works very well, but we’ve had some struggles with host level backups. Right now I leverage Veeam Agent for Windows (VAW) to do in-guest backups. The most recent versions of VAW support Windows Failover Clustering. I’d love to leverage host level backups, but I struggled to make them reliable for quite a while. As it turned out recently, there are some virtual machine permission issues involved that we need to fix. Both Microsoft and Veeam have published guidance on this in KB articles. We automated correcting the permissions on the folder with VHDS files & checkpoints for host level Hyper-V guest cluster backups.

The KB articles

In early August, Microsoft published a KB article with all the tips for when things fail: Errors when backing up VMs that belong to a guest cluster in Windows. Veeam also recapped the conditions and settings needed to leverage guest clustering and perform host level backups. The Veeam article is Backing up Hyper-V guest cluster based on VHD set. Read these articles carefully and make sure everything you need to do has been done.

For some reason, another prerequisite is not mentioned in these articles. It is, however, discussed in ConfigStoreRootPath cluster parameter is not defined and at https://docs.microsoft.com/en-us/powershell/module/hyper-v/set-vmhostcluster?view=win10-ps. You will need to set this to get the proper Hyper-V collections needed for recovery checkpoints on VHD Sets. It is a little-known setting with very little documentation.
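As a rough sketch of what checking and setting this could look like, the lines below use the Get-VMHostCluster and Set-VMHostCluster cmdlets from the Hyper-V module. The cluster name is borrowed from the script later in this post. The storage path is purely a placeholder of mine, so substitute a folder on shared storage that all host nodes can reach.

```powershell
# Sketch only: cluster name and path are assumptions, adjust them to your environment.
$HostCluster = "LAB-CLUSTER"

# Hypothetical folder on a CSV that every host node can access.
$ConfigStorePath = "C:\ClusterStorage\Volume1\ConfigStoreRoot"

# Show the current settings of the host cluster.
Get-VMHostCluster -ClusterName $HostCluster | Format-List *

# Point the host cluster at the shared folder.
Set-VMHostCluster -ClusterName $HostCluster -SharedStoragePath $ConfigStorePath
```

Run this once per host cluster from an elevated PowerShell session, and check the output of the Get-VMHostCluster call again afterwards.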

But the big news here is fixing a permissions related issue!

The latest addition to the list of attention points is a permission issue. By default, the permissions on the shared files of the guest cluster VMs are not correct. This leads to the following hard-to-pinpoint error.

Error Event 19100 Hyper-V-VMMS 19100 ‘BackupVM’ background disk merge failed to complete: General access denied error (0x80070005).

To fix this issue, the folder that holds the VHDS files and their snapshot files must be modified to give the VMMS process additional permissions. To do this, follow the steps below.

Determine the GUIDs of all VMs that use the folder. To do this, start PowerShell as administrator, and then run the following command:

Get-VM | Format-List Name, Id
Output example:
Name : BackupVM
Id : d3599536-222a-4d6e-bb10-a6019c3f2b9b

Name : BackupVM2
Id : a0af7903-94b4-4a2c-b3b3-16050d5f80f2

For each VM GUID, assign the VMMS process full control by running the following command:
icacls <Folder with VHDS> /grant "NT VIRTUAL MACHINE\<VM GUID>":(OI)F

Example:
icacls "c:\ClusterStorage\Volume1\SharedClusterDisk" /grant "NT VIRTUAL MACHINE\a0af7903-94b4-4a2c-b3b3-16050d5f80f2":(OI)F
icacls "c:\ClusterStorage\Volume1\SharedClusterDisk" /grant "NT VIRTUAL MACHINE\d3599536-222a-4d6e-bb10-a6019c3f2b9b":(OI)F

My little PowerShell script

The above is tedious manual labor with a lot of copy-pasting. It is time-consuming at best, and with larger guest clusters the probability of mistakes increases. To fix this, we wrote a PowerShell script to handle it for us.

#Didier Van Hoye
#Twitter: @WorkingHardInIT 
#Blog: https://blog.Workinghardinit.work
#Correct shared VHD Set disk permissions for all nodes in a guest cluster

$GuestCluster = "DemoGuestCluster"
$HostCluster = "LAB-CLUSTER"

$PathToGuestClusterSharedDisks = "C:\ClusterStorage\NTFS-03\GuestClustersSharedDisks"


$GuestClusterNodes = Get-ClusterNode -Cluster $GuestCluster

ForEach ($GuestClusterNode in $GuestClusterNodes)
{
    #Passing the cluster name to -ComputerName only works in Windows Server 2016 and up.
    #As this is about VHDS you need to be running 2016, so no worries here.
    $GuestClusterNodeGuid = (Get-VM -Name $GuestClusterNode.Name -ComputerName $HostCluster).Id

    Write-Host $GuestClusterNodeGuid "belongs to" $GuestClusterNode.Name

    #Build the icacls arguments: "<folder>" /grant "NT VIRTUAL MACHINE\<VM GUID>":(OI)F
    $IcaclsExecute = """$PathToGuestClusterSharedDisks""" + " /grant " + """NT VIRTUAL MACHINE\" + $GuestClusterNodeGuid + """:(OI)F"
    Write-Host "Executing " $IcaclsExecute
    CMD.EXE /C "icacls $IcaclsExecute"
}

Below is an example of the output of this script. It provides some feedback on what is happening.

[Screenshots: example output of the script]
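To double-check that the grants actually landed, one option is to list the access entries on the folder that belong to virtual machine accounts. This is a minimal sketch of my own and not part of the KB guidance. It reuses the folder path from the script above and assumes the accounts resolve to names of the form NT VIRTUAL MACHINE\<VM GUID>.

```powershell
# Sketch only: same folder path as in the script above.
$PathToGuestClusterSharedDisks = "C:\ClusterStorage\NTFS-03\GuestClustersSharedDisks"

# List every access entry on the folder that belongs to a virtual machine account.
(Get-Acl -Path $PathToGuestClusterSharedDisks).Access |
    Where-Object { $_.IdentityReference.Value -like "NT VIRTUAL MACHINE\*" } |
    Format-Table IdentityReference, FileSystemRights, InheritanceFlags -AutoSize
```

You should see one entry per guest cluster node, with the GUIDs matching what the script reported.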

PowerShell for the win. This saves you some searching and typing, and potentially some mistakes along the way. Have fun. More testing is underway to make sure things are now predictable and stable. We’ll share our findings with you.

Logging Cluster Aware Updating Hotfix Plug-in Installations To A File Share

As an early adopter of Windows Server 2012, it’s not about being the first, it’s about using the great new features. When you leverage the Cluster Aware Updating (CAU) Plug-in to deploy hardware vendor updates, like those from Dell, which are called DUPs (Dell Update Packages), you have the option to log the process via the /L parameter.

This looks as follows in the config XML file for CAU (I’ll address this XML file in more detail later).

<Folder name="Optiplex980DUPS" alwaysReboot="false">
    <Template path="$update$" parameters="/S /L=\\zulu\CAULogging\CAULog.log"/>
</Folder>

As you can see, I use a file share, as I don’t want to log locally because that would mean I’d have to collect the logs on all nodes of a cluster. Now, if you log to a file share, you need to do two things, which we’ll discuss below.

1. Set up a share where you can write the log or logs to

Please note that you cannot and should not use the CAU file share for this. First of all, only a few accounts are allowed to have write permissions to the CAU file share. This is documented in How CAU Plug-ins Work:

Only certain security principals are permitted (but are not required) to have Write or Modify permission. The allowed principals are the local Administrators group, SYSTEM, CREATOR OWNER, and TrustedInstaller. Other accounts or groups are not permitted to have Write or Modify permission on the hotfix root folder.

This makes sense. SMB Signing and Encryption are used to protect the files against tampering in transit and to make sure you talk to the one and only real CAU file share. To protect the actual content of that share, you need to make sure no one but some trusted accounts and a select group of trusted administrators can add installers to the share. If not, you might be installing malicious content on your cluster nodes without ever realizing it. Perhaps some auditing on that folder structure might be a good idea?


This means that you need a separate file share, so you can grant Modify or at least Write permissions on the folder to the necessary accounts. Which brings us to the second thing you need to do.

2. Set up Write or Modify permissions on the log share

You’ll need to set up Write or Modify permissions on the log share for all cluster node computer accounts. To make this more practical with larger clusters, you can add the computer accounts to an AD group, which makes for easier administration.

[Screenshot: the two nodes here have permissions to write to the location]

[Screenshot: the first node to create the log file is the owner]
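The permissions described above can also be set from PowerShell on the file server. What follows is only a sketch. The folder, share name, and AD group are hypothetical placeholders, and it assumes the log share already exists and that the group contains the computer accounts of the cluster nodes.

```powershell
# Sketch only: folder, share, and group names are hypothetical placeholders.
$LogFolder = "D:\CAULogging"
$LogShare  = "CAULogging"
$NodeGroup = "CONTOSO\CAU-ClusterNodes"   # AD group holding the node computer accounts

# NTFS level: grant Modify, inherited by files (OI) and subfolders (CI).
icacls $LogFolder /grant "${NodeGroup}:(OI)(CI)M"

# Share level: grant Change so the share does not cap the NTFS permission.
Grant-SmbShareAccess -Name $LogShare -AccountName $NodeGroup -AccessRight Change -Force
```

Granting to a group rather than to individual computer accounts means a new cluster node only needs to be added to the group.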

Some extra tips

The log can grow quite large if used a lot. Keep an eye on it to avoid space issues and so it doesn’t get too big to handle and be useful. And for clarity’s sake, you might use a different log per cluster or even per folder type. You can customize this to your needs.
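One way to keep an eye on the log size is a small check you can schedule or run by hand. This is a minimal sketch. The function name Test-LogTooLarge and the 50 MB default are my own assumptions, so pick a threshold that suits your environment.

```powershell
# Sketch only: function name and default threshold are assumptions for illustration.
function Test-LogTooLarge {
    param(
        [Parameter(Mandatory)] [string] $Path,
        [long] $MaxBytes = 50MB
    )
    # Length is the file size in bytes; returns $true when the log exceeds the threshold.
    return (Get-Item -LiteralPath $Path).Length -gt $MaxBytes
}

# Usage with a hypothetical UNC path, replace it with your own log location:
# if (Test-LogTooLarge -Path "\\server\share\CAULog.log") { Write-Warning "CAU log is getting large" }
```

When the check returns true, you could archive or rotate the log before the next updating run.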