Yes, Windows Server 2012 R2. Me, the most vocal proponent of keeping your environments up to date, to the point that I barely have a Windows Server 2022 under my care anymore.
So what gives? Some people spun up a brand-new Windows Server 2025 Hyper-V cluster and migrated a truckload of their virtual machines over. I love modern infrastructure, so this all sounded very good, until they reached out with a little issue. About a dozen of their virtual machines did not boot properly; they landed in the recovery console instead. My first question was: what is the OS on those virtual machines? When the answer was Windows Server 2012 R2, and maybe some Windows Server 2016, I had heard all I needed to know to help “fix” this. The real solution is not running those old, out-of-support OS versions anymore, but we can “fix” it so your apps keep running while you upgrade or migrate.
Symptoms
Their older but business-critical Windows Server 2012 R2 VMs (Generation 2, UEFI VMs, no less) did not boot on their shiny new Hyper-V cluster. The migration itself went smoothly, they said, but when they started the virtual machines, the apps did not come up. So they checked the consoles of those virtual machines and saw STOP 0x0000007B: INACCESSIBLE_BOOT_DEVICE errors and recovery consoles. Rebooting did not help at all. This was a solid, reproducible crash loop, exactly at the point where the bootloader should hand off to the kernel and the OS should find its disk. If you’ve been in the game for a while, you know that this usually spells one thing: a fundamental storage or bus driver issue. But why now?
The ACPI Identity Crisis
Windows Server 2012 (R2) and Windows Server 2016 are not supported on Windows Server 2025 Hyper-V. Upgrade or migrate before you move them.
This wasn’t just some random corruption. We were looking at a fundamental compatibility issue. To understand why, you need to understand how Hyper-V and the Guest OS communicate during the boot process.
Server 2012 R2 came out in 2013. Hyper-V in Windows Server 2025 is the latest and greatest at the time of writing. In the twelve years between those releases, the “hardware signatures” (hardware IDs, or HWIDs) that Hyper-V presents to a virtual machine have evolved.
In Gen 2 VMs, Windows relies heavily on the ACPI (Advanced Configuration and Power Interface) tables to find its critical components, especially the virtual machine bus (VMBus) and the storage controllers that attach to it.
When 2012 R2 boots, the kernel says, “Okay, ACPI, show me my storage bus.”
The Hyper-V 2025 host says, “Here is your storage bus, its ID is MSFT1000.”
The 2012 R2 kernel looks in its driver database and goes, “MSFT1000? I have no idea who that is. I’m looking for VMBus or nothing.”
Boom. It can’t see the bus, it can’t load the disk driver, and it can’t find its own boot volume, so it suffers an INACCESSIBLE_BOOT_DEVICE crash, as the guest has no clue what to do.
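You can see that driver database for yourself: it is just a set of registry keys. On a running guest, something like this lists the ACPI hardware IDs the OS knows about (a quick sketch; on an unpatched 2012 R2 guest you will typically see VMBus but no MSFT1000):

Get-ChildItem "HKLM:\SYSTEM\CurrentControlSet\Enum\ACPI" |
    Select-Object -ExpandProperty PSChildName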
The “fix” is in some offline registry editing
Since the VM was in a crash loop and couldn’t boot to Windows, we had to perform some offline registry surgery. Luckily for them, the virtual machines could boot into their recovery environments, so they did not have to boot from an ISO to reach a command prompt and access the offline system hive.
We used a combination of reg load to “mount” the system registry from the VM’s disk onto our repair environment, and then some strategic reg copy commands to “spoof” the IDs.
Step by step
Mounting the Hive:

reg load HKLM\TempHive c:\Windows\system32\config\SYSTEM
(This assumes c: is where the VM’s Windows volume is mounted.)
Map the VMBus to MSFT1000:

reg copy HKLM\TempHive\ControlSet001\Enum\ACPI\VMBus HKLM\TempHive\ControlSet001\Enum\ACPI\MSFT1000 /s
This is the core fix. We are telling the 2012 R2 system: “Look, if you ever see a device calling itself MSFT1000, don’t ignore it. Duplicate every single setting, driver binding, and service permission you have for ‘VMBus’ and apply it to this new ‘MSFT1000’ identity.” This essentially links the modern host’s ID to the older OS’s native VMBus driver stack.
Mapping the Generation Counter to MSFT1002:

reg copy HKLM\TempHive\ControlSet001\Enum\ACPI\Hyper_V_Gen_Counter_V1 HKLM\TempHive\ControlSet001\Enum\ACPI\MSFT1002 /s

This maps the older Hyper_V_Gen_Counter_V1 identity, used for snapshots and consistency, to its modern equivalent on the 2025 host, MSFT1002. This is crucial for making sure integration services load properly.
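Before unloading the hive, you can optionally sanity-check that both copies landed:

reg query HKLM\TempHive\ControlSet001\Enum\ACPI\MSFT1000
reg query HKLM\TempHive\ControlSet001\Enum\ACPI\MSFT1002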
After these commands, we had to run reg unload HKLM\TempHive to save our changes. We rebooted, and… Bingo. The Server 2012 R2 boot screen appeared, and the login prompt followed shortly after.
This works because Server 2012 R2 has the necessary VMBus and storage drivers; it just doesn’t know they are compatible with the hardware IDs reported by Hyper-V 2025. This registry trick just creates that necessary driver-to-hardware binding.
But remember that this is an “unsupported” hack! While this gets the VM booting, running 2012 R2 on newer hosts often means features are degraded. Microsoft dropped official support for 2012 R2 guests on modern hosts a while ago. Windows Server 2016 RTM without modern patching suffers from the same issue, by the way.
Below is a complete script you can copy and paste into CMD.exe in your recovery environment to fix a virtual machine with this issue.
@echo off
echo.
echo ============================================================
echo Hyper-V 2025 ACPI Fix for Windows Server 2012 R2 / 2016 RTM
echo - Adds MSFT1000 (VMBus) and MSFT1002 (GenCounter)
echo - Auto-detects ControlSet
echo - Creates SYSTEM hive backup
echo ============================================================
echo.
:: --- Step 1: Detect Windows drive ---
echo Detecting Windows installation drive...
set WINDRV=
:: X: is deliberately skipped: that is the recovery environment's own Windows,
:: which also has a SYSTEM hive. We keep the first match we find.
for %%d in (C D E F G H I J K L M N O P Q R S T U V W Y Z) do (
if not defined WINDRV if exist %%d:\Windows\System32\Config\SYSTEM (
set WINDRV=%%d:
)
)
if "%WINDRV%"=="" (
echo ERROR: Could not find Windows installation drive.
echo Aborting.
exit /b 1
)
echo Windows installation found on %WINDRV%
echo.
:: --- Step 2: Backup SYSTEM hive ---
echo Creating SYSTEM hive backup...
copy "%WINDRV%\Windows\System32\Config\SYSTEM" "%WINDRV%\Windows\System32\Config\SYSTEM.bak"
if errorlevel 1 (
echo ERROR: Backup failed. Aborting.
exit /b 1
)
echo Backup created: SYSTEM.bak
echo.
:: --- Step 3: Load SYSTEM hive ---
echo Loading SYSTEM hive into HKLM\TempHive...
reg load HKLM\TempHive "%WINDRV%\Windows\System32\Config\SYSTEM"
if errorlevel 1 (
echo ERROR: Failed to load SYSTEM hive. Aborting.
exit /b 1
)
echo Hive loaded.
echo.
:: --- Step 4: Detect active ControlSet ---
echo Detecting active ControlSet...
for /f "tokens=3" %%a in ('reg query HKLM\TempHive\Select /v Current') do set CS=ControlSet00%%a
if "%CS%"=="" (
echo ERROR: Could not determine active ControlSet.
reg unload HKLM\TempHive
exit /b 1
)
echo Active ControlSet: %CS%
echo.
:: --- Step 5: Apply ACPI fixes ---
echo Applying ACPI fixes...
echo - Cloning VMBus -> MSFT1000
reg copy HKLM\TempHive\%CS%\Enum\ACPI\VMBus HKLM\TempHive\%CS%\Enum\ACPI\MSFT1000 /s /f
echo - Cloning Hyper_V_Gen_Counter_V1 -> MSFT1002
reg copy HKLM\TempHive\%CS%\Enum\ACPI\Hyper_V_Gen_Counter_V1 HKLM\TempHive\%CS%\Enum\ACPI\MSFT1002 /s /f
echo ACPI fixes applied.
echo.
:: --- Step 6: Unload hive ---
echo Unloading SYSTEM hive...
reg unload HKLM\TempHive
echo Hive unloaded.
echo.
echo ============================================================
echo FIX COMPLETE
echo You may now reboot the VM.
echo ============================================================
echo.
pause
It is better to do this proactively. I have a PowerShell solution on GitHub that also includes the above .cmd script. The Invoke-TestAndFixHyperV2025ReadinessForLegacyVMs.ps1 script can handle virtual machines that are online, before you move them to Hyper-V 2025: https://github.com/WorkingHardInIT/Invoke-TestAndFixHyperV2025ReadinessForLegacyVMs
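If a VM is already shut down, you can also do the surgery from the host by mounting its OS disk. Here is a minimal sketch of that approach; the VHDX path is an assumption, and it applies the same two hive edits as the .cmd script:

# Run on a host with the Hyper-V module while the VM is off.
$VhdPath = "C:\VMs\LegacyVM\LegacyVM.vhdx"   # assumption: adjust to your VM
$Disk = Mount-VHD -Path $VhdPath -Passthru | Get-Disk
# Find the volume that holds the offline SYSTEM hive
$OsDrive = ($Disk | Get-Partition | Get-Volume |
    Where-Object { $_.DriveLetter -match '[A-Za-z]' -and (Test-Path "$($_.DriveLetter):\Windows\System32\config\SYSTEM") } |
    Select-Object -First 1).DriveLetter
reg.exe load HKLM\TempHive "$($OsDrive):\Windows\System32\config\SYSTEM" | Out-Null
reg.exe copy HKLM\TempHive\ControlSet001\Enum\ACPI\VMBus HKLM\TempHive\ControlSet001\Enum\ACPI\MSFT1000 /s /f
reg.exe copy HKLM\TempHive\ControlSet001\Enum\ACPI\Hyper_V_Gen_Counter_V1 HKLM\TempHive\ControlSet001\Enum\ACPI\MSFT1002 /s /f
reg.exe unload HKLM\TempHive
Dismount-VHD -Path $VhdPath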
But I cannot migrate or upgrade yet!
I call bullshit on most of these in 99% of cases. And if it is not bullshit, you really need to get your act together and work on fixing your apps/vendors, so you never get into such a mess in the first place.
Conclusion
Tech debt. You know, that thing every IT manager and department has been preventing or solving for the last 30 years, yet it is still very much around. Despite all that ITIL, risk, and change management, or maybe even due to all that talk and very little action.
Sometimes, saving the day isn’t about deploying the latest and greatest tech; it’s about diving into the deepest, darkest corners of the OS and tricking it into working just one more time. There are no guarantees, and this is a ticking time bomb.
I bought these people some time. Now they need to get working! I also kindly suggested they should read their backup vendors’ support statements 😉.
I was tasked to troubleshoot a cluster where Cluster-Aware Updating (CAU) failed because the nodes never succeeded in going into maintenance mode. None of the obvious or well-known issues and mistakes that might break live migrations were present. Looking at the cluster and testing live migration, not a single VM on any node would live migrate to any other node. So I took a peek at the event ID and description, and it hit me: I had seen this particular event ID before.
Log Name: System
Source: Microsoft-Windows-Hyper-V-High-Availability
Date: 9/27/2018 15:36:44
Event ID: 21502
Task Category: None
Level: Error
Keywords:
User: SYSTEM
Computer: NODE-B.datawisetech.corp
Description: Live migration of ‘Virtual Machine ADFS1’ failed. Virtual machine migration operation for ‘ADFS1’ failed at migration source ‘NODE-B’. (Virtual machine ID 4B5F2F6C-AEA3-4C7B-8342-E255D1D112D7) Failed to verify collection registry for virtual machine ‘ADFS1’: The system cannot find the file specified. (0x80070002). (Virtual Machine ID 4B5F2F6C-AEA3-4C7B-8342-E255D1D112D7).

The live migration fails due to a non-existent SharedStoragePath or ConfigStoreRootPath, which is where the collections metadata lives.
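To pull these tell-tale events quickly on a node, a small sketch (log and provider names as shown in the event above):

Get-WinEvent -FilterHashtable @{
    LogName      = 'System'
    ProviderName = 'Microsoft-Windows-Hyper-V-High-Availability'
    Id           = 21502
} -MaxEvents 10 | Format-List TimeCreated, Id, Message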
More errors are logged
There usually are more related tell-tale events. They are, however, clear in pinpointing the root cause.
On the destination host
On the destination host you’ll find event id 21066:
Log Name: Microsoft-Windows-Hyper-V-VMMS-Admin
Source: Microsoft-Windows-Hyper-V-VMMS
Date: 9/27/2018 15:36:45
Event ID: 21066
Task Category: None
Level: Error
Keywords:
User: SYSTEM
Computer: NODE-A.datawisetech.corp
Description: Failed to verify collection registry for virtual machine ‘ADFS1’: The system cannot find the file specified. (0x80070002). (Virtual Machine ID 4B5F2F6C-AEA3-4C7B-8342-E255D1D112D7).
A bunch of 1106 events per failed live migration per VM, like the one below:
Log Name: Microsoft-Windows-Hyper-V-VMMS-Operational
Source: Microsoft-Windows-Hyper-V-VMMS
Date: 9/27/2018 15:36:45
Event ID: 1106
Task Category: None
Level: Error
Keywords:
User: SYSTEM
Computer: NODE-A.datawisetech.corp
Description: vm\service\migration\vmmsvmmigrationdestinationtask.cpp(5617)\vmms.exe!00007FF77D2171A4: (caller: 00007FF77D214A5D) Exception(998) tid(1fa0) 80070002 The system cannot find the file specified.
As well as event ID 21111, on the source host:

Log Name: Microsoft-Windows-Hyper-V-High-Availability-Admin
Source: Microsoft-Windows-Hyper-V-High-Availability
Date: 9/27/2018 15:36:44
Event ID: 21111
Task Category: None
Level: Error
Keywords:
User: SYSTEM
Computer: NODE-B.datawisetech.corp
Description: Live migration of ‘Virtual Machine ADFS1’ failed.
… event ID 21066:

Log Name: Microsoft-Windows-Hyper-V-VMMS-Admin
Source: Microsoft-Windows-Hyper-V-VMMS
Date: 9/27/2018 15:36:44
Event ID: 21066
Task Category: None
Level: Error
Keywords:
User: SYSTEM
Computer: NODE-B.datawisetech.corp
Description: Failed to verify collection registry for virtual machine ‘ADFS1’: The system cannot find the file specified. (0x80070002). (Virtual Machine ID 4B5F2F6C-AEA3-4C7B-8342-E255D1D112D7).
… and event ID 21024:

Log Name: Microsoft-Windows-Hyper-V-VMMS-Admin
Source: Microsoft-Windows-Hyper-V-VMMS
Date: 9/27/2018 15:36:44
Event ID: 21024
Task Category: None
Level: Error
Keywords:
User: SYSTEM
Computer: NODE-B.datawisetech.corp
Description: Virtual machine migration operation for ‘ADFS1’ failed at migration source ‘NODE-B’. (Virtual machine ID 4B5F2F6C-AEA3-4C7B-8342-E255D1D112D7)
Live migration fails due to non-existent SharedStoragePath or ConfigStoreRootPath explained
This is what a Windows Server 2016/2019 cluster that has not been configured with a SharedStoragePath looks like:

Get-VMHostCluster -ClusterName "W2K19-LAB"

Under HKLM\Cluster\Resources\GUIDofWMIResource\Parameters there is a value called ConfigStoreRootPath, which in PowerShell is known as the SharedStoragePath property; the command above is one way to query it. In the registry it lives under both the 0.Cluster and Cluster keys, and the resource ID we are looking for is the one of the Virtual Machine Cluster WMI resource.
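If you prefer reading it straight from the registry, a small sketch to run on a cluster node (the resource name “Virtual Machine Cluster WMI” is the default):

# The cluster hive is only loaded under HKLM:\Cluster on cluster nodes
$WmiResource = Get-ClusterResource -Name "Virtual Machine Cluster WMI" -Cluster "W2K19-LAB"
$ParamKey = "HKLM:\Cluster\Resources\$($WmiResource.Id)\Parameters"
(Get-ItemProperty -Path $ParamKey -Name ConfigStoreRootPath).ConfigStoreRootPath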
If it returns a path, you must verify that it exists; if not, you’re in trouble with live migrations. You will also be in trouble with host-level guest cluster backups or Hyper-V replicas of those guests. Maybe you don’t have guest clusters, or you use in-guest backups, and this is just a remnant of trying them out.
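That check is easy to script; a minimal sketch using the same cmdlet as above (run it on a cluster node so the CSV namespace is visible):

$SharedPath = (Get-VMHostCluster -ClusterName "W2K19-LAB").SharedStoragePath
if ($SharedPath -and -not (Test-Path -Path $SharedPath)) {
    Write-Warning "SharedStoragePath '$SharedPath' points to a location that no longer exists."
}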
When I ran it on the problematic cluster, I got a path that points to a folder on a CSV that doesn’t exist.
Did they rename the CSV? Replace the storage array? Well, as it turned out, they reorganized and resized the CSVs. As they can’t shrink SAN LUNs, they created new ones. They then leveraged storage live migration to move the VMs.
The old CSVs were left in place for about six weeks before they were cleaned up. This was the first time they ran Cluster-Aware Updating after removing them, so this was the first time they hit the problem. Bingo! You probably think you’ll just change the value to an existing CSV folder path or delete it. Well, as it turns out, you cannot do that. You can try …
Set-VMHostCluster : The operation on computer 'W2K19-LAB' failed: The WS-Management service cannot process the request. The WMI service or the WMI provider returned an unknown error: HRESULT 0x80070032
At line:1 char:1
+ Set-VMHostCluster -ClusterName "W2K19-LAB" -SharedStoragePath "C:\Clu …
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+ CategoryInfo : NotSpecified: (:) [Set-VMHostCluster], VirtualizationException
+ FullyQualifiedErrorId : OperationFailed,Microsoft.HyperV.PowerShell.Commands.SetVMHostCluster
Whatever you try, deleting, overwriting, … no joy. As it turns out, you cannot change it, and this is by design. A shaky design, I would say. I understand the reasons: if it changes or is deleted while you have guest clusters with collections depending on what’s in there, you get backup and live migration issues with those guest clusters. But if you can’t change it, you also run into issues when storage changes. You’re damned if you do, damned if you don’t.
Workaround 1
What
Create a CSV with the old name and the folder(s) the current path is pointing to. That works. It could even be a very small one; as a test I used one of 1 GB. I’m not sure that’s enough over time, but if you can easily extend your CSV, that should not pose a problem. It might actually be a good idea to have this as a best practice: have a dedicated CSV for the SharedStoragePath. I’ll need to ask Microsoft.
How
You know how to create a CSV and a folder, I guess; that’s about it. I’ll leave it at that, apart from the sketch below.
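Still, for completeness, a rough sketch of the moving parts, with hypothetical names, assuming a spare clustered disk is available and that the old path lived on a CSV mount point you recreate:

# Cluster a spare disk and turn it into a CSV
$Disk = Get-ClusterAvailableDisk -Cluster "W2K19-LAB" | Select-Object -First 1
$Resource = $Disk | Add-ClusterDisk
Add-ClusterSharedVolume -Name $Resource.Name -Cluster "W2K19-LAB" | Out-Null
# Rename the CSV mount point so the old path resolves again, then recreate
# the folder structure the ConfigStoreRootPath points to (names hypothetical)
Rename-Item -Path "C:\ClusterStorage\Volume1" -NewName "OldCsvName"
New-Item -ItemType Directory -Path "C:\ClusterStorage\OldCsvName\SharedPath" | Out-Null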
Workaround 2
What
Set the path to a new one in the registry (mind you, this won’t fix any problems you might already have with existing guest clusters).
Or delete the value for the current path and leave it empty. This is only a good idea if you don’t need VHD Set guest clusters anymore. Basically, this resets it to the default value.
How
There are two ways to do this. Both cost downtime. You need to bring the cluster service down on all nodes, and then you don’t have your CSVs. That means your VMs must be shut down on all nodes of the cluster.
The Microsoft Support way
Well, that’s what they make you do (which doesn’t mean you should just do it without them instructing you to do so):
Export your HKLM\Cluster\Resources\GUIDofWMIResource\Parameters key for safekeeping, to restore if needed.
Shut down all VMs in the cluster, and even the ones residing on a CSV if not clustered.
Stop the cluster service on all nodes (the cluster is shut down if you do that), leaving the node you are working on for last.
On that node, open the Registry Editor.
Click on HKEY_LOCAL_MACHINE, then click on File and select Load Hive.
Browse to C:\Windows\Cluster and select CLUSDB.
Click OK, and then name it DB.
Expand DB, then expand Resources.
Select the GUID of the Virtual Machine Cluster WMI resource.
Click on Parameters; in ConfigStoreRootPath you will find the value.
Double-click on it, and delete it or set it to a new path on a CSV that you created already.
Start the cluster service on this node.
Then start the cluster service on all the other nodes, node by node.
My way
Not supported, at your own risk, big boy rules apply. I have tried and tested this a dozen times in the lab on multiple clusters and this also works.
In the Cluster registry key (HKLM\Cluster\Resources\GUIDofWMIResource\Parameters) of every cluster node, delete the content of the REG_SZ value ConfigStoreRootPath so it is empty, or change it to a new path on a CSV that you created already for this purpose.
If you have a cluster with a disk witness, the node that owns the disk witness also has a 0.Cluster key (HKLM\0.Cluster\Resources\GUIDofWMIResource\Parameters). Make sure you also change the value there.
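To find that node, ask the cluster; the witness sits in the core “Cluster Group”:

(Get-ClusterGroup -Name "Cluster Group" -Cluster "W2K19-LAB").OwnerNode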
When you have done this, you have to shut down all the virtual machines, and you then stop the cluster service on every node. I try to work on the node owning the disk witness and shut down the cluster service on that one as the final step. This is also the one where I start the cluster again first, so I can easily check that the value remains empty in both the Cluster and the 0.Cluster keys. Do note that with a file share / cloud share witness, knowing what node was shut down last can be important. See https://blog.workinghardinit.work/2017/12/11/cluster-shared-volumes-without-active-directory/. That’s why I always remember what node I’m working on and shut down last.
Start up the cluster service on the other nodes one by one.
This avoids having to load the registry hive, but editing the registry on every node in large clusters is tedious. Sure, this can be scripted, in combination with shutting down the VMs, stopping the cluster service on all nodes, changing the value, and then starting the cluster services as well as the VMs again. You can control the order in which you go through the nodes in a script as well. I actually did script this, using my method. You can find it at the bottom of this blog post.
Both methods work, and live migrations will work again. Any existing problematic guest cluster VMs in backup or live migration are food for another blog post, perhaps. But otherwise you’ll have things like this driving you crazy.
Some considerations
Workaround 1 is a bit of a “you’ve got to be kidding me” solution, but at least it leaves you some freedom to replace, rename, and reorganize the other CSVs as you see fit. So perhaps having a dedicated CSV just for this purpose is not that silly. Another benefit is that it does not involve messing around in the cluster database via the registry. That is something we advise against all the time, but it has now become a way to get out of a pickle.
Workaround 2 speaks for itself. There are two ways to achieve it, which I have shown. But a word of warning: the moment the path changes and you have existing VHD Set guest clusters that somehow depend on it, you’ll see backups start having issues, and possibly even live migrations. But you’re toast for all your live migrations anyway already, so … well yeah, what can I do.
So, this is by design. Maybe it is, but it isn’t very realistic that you’re stuck with a path and name that rigidly, that it causes this much grief, and that it allows people to shoot themselves in the foot. It’s not like all this is documented somewhere.
Conclusion
This needs to be fixed. While I can get you out of this pickle, it is a tedious operation with some risk in a production environment. It also requires downtime, which is bad. On top of that, it will only have a satisfying result if you don’t have any VHD Set guest clusters that rely on the old path. The mechanism behind the SharedStoragePath isn’t as robust and flexible as it should be when it comes to changes and dealing with failed host-level guest cluster backups.
I have tested this in the Windows Server 2019 insider preview. The issue is still there; no progress on that front. Maybe in some future cumulative update things will be fixed to make guest clustering with VHD Set a more robust and reliable solution. The fact that Microsoft relies on guest clustering to support some deployment scenarios with S2D makes this even more disappointing. It is also a reason I still run physical shared storage-based file clusters.
The problematic host-level backups I can work around by leveraging in-guest backups. But the path issue is unavoidable when changes are needed.
After two years of trouble with the framework around guest cluster backups / VHD Set, it’s time this “just works”. No one will use it while it remains this troublesome, and you won’t get it fixed if no one uses it. The perfect catch-22.
The Script
$ClusterName = "W2K19-LAB"
$OwnerNodeWitnessDisk = $Null
$RememberLastNodeThatWasShutdown = $Null
$LogFileName = "ConfigStoreRootPathChange"
$RegistryPathCluster = "HKLM:\Cluster\Resources\$WMIClusterResourceID\Parameters"
$RegistryPathClusterDotZero = "HKLM:\0.Cluster\Resources\$WMIClusterResourceID\Parameters"
$REGZValueName = "ConfigStoreRootPath"
$REGZValue = $Null #We need to empty the value
#$REGZValue = "C:\ClusterStorage\ReFS-01\SharedPath" #We need to set a new path.
#Region SupportingFunctionsAndWorkFlows
Workflow ShutDownVMs {
param ($AllVMs)
Foreach -parallel ($VM in $AllVMs) {
InlineScript {
try {
If ($using:VM.State -eq "Running") {
Stop-VM -Name $using:VM.Name -ComputerName $using:VM.ComputerName -force
}
}
catch {
$ErrorMessage = $_.Exception.Message
$ErrorLine = $_.InvocationInfo.Line
$ExceptionInner = $_.Exception.InnerException
#Write-2-Log is defined in the script scope and is not visible inside
#InlineScript, so report to the console from here instead.
Write-Warning "!Error occurred!: $ErrorMessage $ExceptionInner $ErrorLine"
Write-Warning "Bailing out - Script execution stopped"
}
}
}
}
#Code to shut down all VMs on all Hyper-V cluster nodes
Workflow StartVMs {
param ($AllVMs)
Foreach -parallel ($VM in $AllVMs) {
InlineScript {
try {
if ($using:VM.State -eq "Off") {
Start-VM -Name $using:VM.Name -ComputerName $using:VM.ComputerName
}
}
catch {
$ErrorMessage = $_.Exception.Message
$ErrorLine = $_.InvocationInfo.Line
$ExceptionInner = $_.Exception.InnerException
#Write-2-Log is defined in the script scope and is not visible inside
#InlineScript, so report to the console from here instead.
Write-Warning "!Error occurred!: $ErrorMessage $ExceptionInner $ErrorLine"
Write-Warning "Bailing out - Script execution stopped"
}
}
}
}
function Write-2-Log {
[CmdletBinding()]
param(
[Parameter()]
[ValidateNotNullOrEmpty()]
[string]$Message,
[Parameter()]
[ValidateNotNullOrEmpty()]
[ValidateSet('Information', 'Warning', 'Error')]
[string]$Severity = 'Information'
)
$Date = get-date -format "yyyyMMdd"
[pscustomobject]@{
Time = (Get-Date -f g)
Message = $Message
Severity = $Severity
} | Export-Csv -Path "$PSScriptRoot\$LogFileName$Date.log" -Append -NoTypeInformation
}
#endregion
Try {
Write-2-Log -Message "Connecting to cluster $ClusterName" -Severity Information
$MyCluster = Get-Cluster -Name $ClusterName
$WMIClusterResource = Get-ClusterResource "Virtual Machine Cluster WMI" -Cluster $MyCluster
Write-2-Log -Message "Grabbing Cluster Resource: Virtual Machine Cluster WMI" -Severity Information
$WMIClusterResourceID = $WMIClusterResource.Id
Write-2-Log -Message "The Cluster Resource Virtual Machine Cluster WMI ID is $WMIClusterResourceID" -Severity Information
#Now that we have the resource ID we can build the registry paths
$RegistryPathCluster = "HKLM:\Cluster\Resources\$WMIClusterResourceID\Parameters"
$RegistryPathClusterDotZero = "HKLM:\0.Cluster\Resources\$WMIClusterResourceID\Parameters"
Write-2-Log -Message "Checking for quorum config (disk, file share / cloud witness) on $ClusterName" -Severity Information
If ((Get-ClusterQuorum -Cluster $MyCluster).QuorumResource -eq "Witness") {
Write-2-Log -Message "Disk witness in use. Lookin up for owner node of witness disk as that holds the 0.Cluster registry key" -Severity Information
#Store the current owner node of the witness disk.
$OwnerNodeWitnessDisk = (Get-ClusterGroup -Name "Cluster Group" -Cluster $MyCluster).OwnerNode
Write-2-Log -Message "Owner node of witness disk is $OwnerNodeWitnessDisk" -Severity Information
}
}
Catch {
$ErrorMessage = $_.Exception.Message
$ErrorLine = $_.InvocationInfo.Line
$ExceptionInner = $_.Exception.InnerException
Write-2-Log -Message "!Error occured!:" -Severity Error
Write-2-Log -Message $ErrorMessage -Severity Error
Write-2-Log -Message $ExceptionInner -Severity Error
Write-2-Log -Message $ErrorLine -Severity Error
Write-2-Log -Message "Bailing out - Script execution stopped" -Severity Error
Break
}
try {
$ClusterNodes = $MyCluster | Get-ClusterNode
Write-2-Log -Message "We have grabbed the cluster nodes $ClusterNodes from $MyCluster" -Severity Information
Foreach ($ClusterNode in $ClusterNodes) {
#If we have a disk witness we also need to change the value in the 0.Cluster registry key on the current witness disk owner node.
If ($ClusterNode.Name -eq $OwnerNodeWitnessDisk) {
Write-2-Log -Message "Changing $REGZValueName in the 0.Cluster key on $OwnerNodeWitnessDisk, which owns the witness disk, to $REGZValue" -Severity Information
Invoke-Command -ComputerName $ClusterNode.Name -ArgumentList $RegistryPathClusterDotZero, $REGZValueName, $REGZValue {
param($RegistryPathClusterDotZero, $REGZValueName, $REGZValue)
#Test-Path must run on the remote node: the 0.Cluster key only exists there
if (Test-Path -Path $RegistryPathClusterDotZero) {
Set-ItemProperty -Path $RegistryPathClusterDotZero -Name $REGZValueName -Value $REGZValue -Force | Out-Null
}
}
}
Write-2-Log -Message "Changing $REGZValueName in the Cluster key on $($ClusterNode.Name) to $REGZValue" -Severity Information
Invoke-Command -ComputerName $ClusterNode.Name -ArgumentList $RegistryPathCluster, $REGZValueName, $REGZValue {
param($RegistryPathCluster, $REGZValueName, $REGZValue)
#Test-Path must run on the remote node where the Cluster hive is loaded
if (Test-Path -Path $RegistryPathCluster) {
Set-ItemProperty -Path $RegistryPathCluster -Name $REGZValueName -Value $REGZValue -Force | Out-Null
}
}
}
Write-2-Log -Message "Grabbing all VMs on all clusternodes to shut down" -Severity Information
$AllVMs = Get-VM -ComputerName ($ClusterNodes)
Write-2-Log -Message "We are shutting down all running VMs" -Severity Information
ShutdownVMs $AllVMs
}
catch {
$ErrorMessage = $_.Exception.Message
$ErrorLine = $_.InvocationInfo.Line
$ExceptionInner = $_.Exception.InnerException
Write-2-Log -Message "!Error occured!:" -Severity Error
Write-2-Log -Message $ErrorMessage -Severity Error
Write-2-Log -Message $ExceptionInner -Severity Error
Write-2-Log -Message $ErrorLine -Severity Error
Write-2-Log -Message "Bailing out - Script execution stopped" -Severity Error
Break
}
try {
#Code to stop the cluster service on all cluster nodes
#ending with the witness owner if there is one
Write-2-Log -Message "Shutting down cluster service on all nodes in $MyCluster that are not the owner of the witness disk" -Severity Information
Foreach ($ClusterNode in $ClusterNodes) {
#First we shut down all nodes that do NOT own the witness disk
If ($ClusterNode.Name -ne $OwnerNodeWitnessDisk) {
Write-2-Log -Message "Stop cluster service on node $ClusterNode.Name" -Severity Information
if ((Get-ClusterNode -Cluster W2K19-LAB | where-object {$_.State -eq "Up"}).count -ne 1) {
Stop-ClusterNode -Name $ClusterNode.Name -Cluster $MyCluster | Out-Null
}
Else {
Stop-Cluster -Cluster $MyCluster -Force | Out-Null
$RememberLastNodeThatWasShutdown = $ClusterNode.Name
}
}
}
#We then shut down the node that owns the witness disk.
#If we have a fileshare etc, this won't do anything.
Foreach ($ClusterNode in $ClusterNodes) {
If ($ClusterNode.Name -eq $OwnerNodeWitnessDisk) {
Write-2-Log -Message "Stopping cluster and as such last node $ClusterNode.Name" -Severity Information
Stop-Cluster -Cluster $MyCluster -Force | Out-Null
$RememberLastNodeThatWasShutdown = $OwnerNodeWitnessDisk
}
}
#Code to start the cluster service on all cluster nodes,
#starting with the original owner of the witness disk
#or the one that was shut down last
Foreach ($ClusterNode in $ClusterNodes) {
#First we start the node that was shut down last. This is either the one that owned the witness disk
#or just the last node that was shut down in case of a fileshare
If ($ClusterNode.Name -eq $RememberLastNodeThatWasShutdown) {
Write-2-Log -Message "Starting the cluster node $($ClusterNode.Name) that was the last to shut down" -Severity Information
Start-ClusterNode -Name $ClusterNode.Name -Cluster $MyCluster | Out-Null
}
}
Write-2-Log -Message "Starting the all other clusternodes in $MyCluster" -Severity Information
Foreach ($ClusterNode in $ClusterNodes) {
#We then start all the other nodes in the cluster.
If ($ClusterNode.Name -ne $RememberLastNodeThatWasShutdown) {
Write-2-Log -Message "Starting the cluster node $($ClusterNode.Name)" -Severity Information
Start-ClusterNode -Name $ClusterNode.Name -Cluster $MyCluster | Out-Null
}
}
}
catch {
$ErrorMessage = $_.Exception.Message
$ErrorLine = $_.InvocationInfo.Line
$ExceptionInner = $_.Exception.InnerException
Write-2-Log -Message "!Error occured!:" -Severity Error
Write-2-Log -Message $ErrorMessage -Severity Error
Write-2-Log -Message $ExceptionInner -Severity Error
Write-2-Log -Message $ErrorLine -Severity Error
Write-2-Log -Message "Bailing out - Script execution stopped" -Severity Error
Break
}
Start-sleep -Seconds 15
Write-2-Log -Message "Grabbing all VMs on all clusternodes to start them up" -Severity Information
$AllVMs = Get-VM -ComputerName ($ClusterNodes)
Write-2-Log -Message "We are starting all stopped VMs" -Severity Information
StartVMs $AllVMs
#Hit it again ...
$AllVMs = Get-VM -ComputerName ($ClusterNodes)
StartVMs $AllVMs
The script above is the one I promised. If you use it in a production environment without testing and it blows up in your face, you are going to get fired and it is your fault. You can use it both to introduce the issue and to fix it. The actions are logged in the directory the script is run from.
It’s no secret that guest clustering with VHD Sets works very well, but we’ve had some struggles with host-level backups. Right now I leverage Veeam Agent for Windows (VAW) to do in-guest backups; the most recent versions of VAW support Windows Failover Clustering. I’d love to leverage host-level backups, but I struggled to make those reliable for quite a while. As it turned out recently, there are some virtual machine permission issues involved that we need to fix. Both Microsoft and Veeam have published guidance on this in a KB article, and we automated correcting the permissions on the folder with the VHDS files & checkpoints for host-level Hyper-V guest cluster backup.
But the big news here is fixing a permissions-related issue! The latest addition to the list of attention points is exactly that: the permissions on the guest cluster VMs’ shared files are not correct by default. This leads to a hard-to-pinpoint error.
Error event 19100, Hyper-V-VMMS: ‘BackupVM’ background disk merge failed to complete: General access denied error (0x80070005).

To fix this issue, the folder that holds the VHDS files and their snapshot files must be modified to give the VMMS process additional permissions. To do this, follow these steps.
Determine the GUIDs of all VMs that use the folder. To do this, start PowerShell as administrator and run the following command:
get-vm | fl name, id
Output example:
Name : BackupVM
Id : d3599536-222a-4d6e-bb10-a6019c3f2b9b
Name : BackupVM2
Id : a0af7903-94b4-4a2c-b3b3-16050d5f80f
For each VM GUID, assign the VMMS process full control by running the following command:
icacls <Folder with VHDS> /grant “NT VIRTUAL MACHINE\<VM GUID>”:(OI)F
The above is tedious manual labor with a lot of copy-pasting, time-consuming at best. With larger guest clusters, the probability of mistakes increases. To fix this, we wrote a PowerShell script to handle it for us.
#Didier Van Hoye
#Twitter: @WorkingHardInIT
#Blog: https://blog.Workinghardinit.work
#Correct shared VHD Set disk permissions for all nodes in guests cluster
$GuestCluster = "DemoGuestCluster"
$HostCluster = "LAB-CLUSTER"
$PathToGuestClusterSharedDisks = "C:\ClusterStorage\NTFS-03\GuestClustersSharedDisks"
$GuestClusterNodes = Get-ClusterNode -Cluster $GuestCluster
ForEach ($GuestClusterNode in $GuestClusterNodes)
{
#Passing the cluster name to -computername only works in W2K16 and up.
#As this is about VHDS you need to be running 2016, so no worries here.
$GuestClusterNodeGuid = (Get-VM -Name $GuestClusterNode.Name -ComputerName $HostCluster).id
Write-Host $GuestClusterNodeGuid "belongs to" $GuestClusterNode.Name
$IcalsExecute = """$PathToGuestClusterSharedDisks""" + " /grant " + """NT VIRTUAL MACHINE\"+ $GuestClusterNodeGuid + """:(OI)F"
write-Host "Executing " $IcalsExecute
CMD.EXE /C "icacls $IcalsExecute"
}
Below is an example of the output of this script. It provides some feedback on what is happening.
PowerShell for the win. This saves you some searching and typing, and potentially some mistakes along the way. Have fun. More testing is underway to make sure things are now predictable and stable. We’ll share our findings with you.
When you’re using Dell Compellent (SC Series) storage, you might be leveraging the Dell SC Series MPIO registry settings script they provide to set the recommended settings. That’s a nice little script you can test, verify, and adapt to integrate into your setup scripts. You can find it in the Dell EMC SC Series Storage and Microsoft Multipath I/O best practices document.
Dell SC Series MPIO Registry Settings script
Recently I was working with a new deployment (7.2.40) to test and verify it in a lab environment. The lab cluster nodes had a lot of NICs & FC HBAs to test all kinds of possible scenarios: Microsoft Windows clusters, S2D, Hyper-V, FC, iSCSI, etc. The script detected the iSCSI service but did not update any settings; it threw errors instead.
After verifying things in the registry myself, it was clear that the entries for the Microsoft iSCSI Initiator that the script looks for are there, but the script did not pick them up.
Looking over the script, it became clear quickly what the issue was. The variable $IscsiRegPath = "HKLM:\SYSTEM\CurrentControlSet\Control\Class\{4d36e97b-e325-11ce-bfc1-08002be10318}\000*" has 3 leading zeros out of a maximum of 4 characters. This means that if the Microsoft iSCSI Initiator info is in 0009 it gets picked up, but not when it is in 0011, for example.
So I changed that to only 2 leading zeros. This assumes you won’t exceed 0099, which is a safer assumption, but you could argue it should even be only one leading zero, as 999 is an even safer assumption.
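If you’d rather not bet on digit counts at all, you can drop the wildcard and match on DriverDesc across all subkeys of the device class key; a sketch of that alternative:

# Enumerate every subkey and pick the one whose DriverDesc identifies the
# Microsoft iSCSI Initiator; the numeric suffix then no longer matters.
$ClassKey = "HKLM:\SYSTEM\CurrentControlSet\Control\Class\{4d36e97b-e325-11ce-bfc1-08002be10318}"
$IscsiParam = Get-ChildItem -Path $ClassKey -ErrorAction SilentlyContinue |
    Where-Object { (Get-ItemProperty -Path $_.PSPath -ErrorAction SilentlyContinue).DriverDesc -eq "Microsoft iSCSI Initiator" } |
    Get-ChildItem | Where-Object { $_.PSChildName -eq "Parameters" }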
I’m sharing the snippet with my adaptation here in case you want it. As always, I assume no responsibility for what you do with the script or the outcomes in your environment. Big boy rules apply.
# MPIO Registry Settings script
# This script will apply recommended Dell Storage registry settings
# on Windows Server 2008 R2 or newer
#
# THIS CODE IS MADE AVAILABLE AS IS, WITHOUT WARRANTY OF ANY KIND.
# THE ENTIRE RISK OF THE USE OR THE RESULTS FROM THE USE OF THIS CODE
# REMAINS WITH THE USER.
# Assign variables
$MpioRegPath = "HKLM:\SYSTEM\CurrentControlSet\Services\mpio\Parameters"
$IscsiRegPath = "HKLM:\SYSTEM\CurrentControlSet\Control\Class\"
# DIDIER adaptation: 2 leading zeros instead of 3, as 0010 and 0011 would not
# be found otherwise. This makes the assumption that you won't exceed 0099,
# which is a safer assumption, but you could argue this should even be only
# one leading zero, as 999 is an even safer assumption.
$IscsiRegPath += "{4d36e97b-e325-11ce-bfc1-08002be10318}\00*"
# General settings
Set-ItemProperty -Path $MpioRegPath -Name "PDORemovePeriod" -Value 120
Set-ItemProperty -Path $MpioRegPath -Name "PathRecoveryInterval" -Value 25
Set-ItemProperty -Path $MpioRegPath -Name "UseCustomPathRecoveryInterval" -Value 1
Set-ItemProperty -Path $MpioRegPath -Name "PathVerifyEnabled" -Value 1
# Apply OS-specific general settings
$OsVersion = ( Get-WmiObject -Class Win32_OperatingSystem ).Caption
If ( $OsVersion -match "Windows Server 2008 R2" )
{
New-ItemProperty -Path $MpioRegPath -Name "DiskPathCheckEnabled" -Value 1 -PropertyType DWORD -Force
New-ItemProperty -Path $MpioRegPath -Name "DiskPathCheckInterval" -Value 25 -PropertyType DWORD -Force
}
Else
{
Set-ItemProperty -Path $MpioRegPath -Name "DiskPathCheckInterval" -Value 25
}
# iSCSI settings
If ( ( Get-Service -Name "MSiSCSI" ).Status -eq "Running" )
{
# Get the registry path for the Microsoft iSCSI initiator parameters
$IscsiParam = Get-Item -Path $IscsiRegPath | Where-Object { ( Get-ItemProperty $_.PSPath ).DriverDesc -eq "Microsoft iSCSI Initiator"} | Get-ChildItem | Where-Object { $_.PSChildName -eq "Parameters" }
# Set the Microsoft iSCSI initiator parameters
Set-ItemProperty -Path $IscsiParam.PSPath -Name "MaxRequestHoldTime" -Value 90
Set-ItemProperty -Path $IscsiParam.PSPath -Name "LinkDownTime" -Value 35
Set-ItemProperty -Path $IscsiParam.PSPath -Name "EnableNOPOut" -Value 1
}
Else
{
Write-Host "iSCSI Service is not running."
Write-Host "iSCSI registry settings have NOT been configured."
}
Write-Host "MPIO registry settings have been configured successfully."
Write-Host "The system must be restarted for the changes to take effect."