Recover From Expanding VHD or VHDX Files On VMs With Checkpoints

So you’ve expanded the virtual disk (VHD/VHDX) of a virtual machine that has checkpoints (or snapshots as they used to be called) on it. Did you forget about them? Did you really leave them lingering around for that long? Bad practice and not supported (we don’t have production snapshots yet, that’s for Windows Server 2016). Anyway, your virtual machine won’t boot. Depending on the importance of that VM you might be chewed out big time or ridiculed. But what if you don’t have a restore that works? Suddenly it might have become a resume generating event.

All does not have to be lost. There might be hope if you didn’t panic and make even more bad decisions. Please, if you’re unsure what to do, call an expert, a real one, or at least someone who knows real experts. It also helps if you have spare disk space, the fast sort if possible, and a Hyper-V node where you can work without risk. We’ll walk you through the scenarios for both a VHDX and a VHD.

How did you get into this pickle?

If you go to the Edit Virtual Hard Disk Wizard via the VM settings it won’t allow you to expand the disk if the VM has checkpoints, whether the VM is online or not.

image

VHDs cannot be expanded online. If the VM had checkpoints it must have been shut down when you expanded the VHD. If you went to the Edit Disk tool in Hyper-V Manager directly to open up the disk you don’t get a warning. It’s treated as a virtual disk that’s not in use. Same deal if you do it in PowerShell:

Resize-VHD -Path "C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks\RuinFixedVHD.vhd" -SizeBytes 15GB

That just works.

VHDXs can be expanded online if they’re attached to a vSCSI controller. But if the VM has checkpoints it will not allow for expanding.

image

So yes, you deliberately shut it down to be able to do it with the Edit Disk tool in Hyper-V Manager. I know, the warning message was not specific enough, but consider this: the Edit Disk tool, when launched directly, has no idea what the disk you’re opening is used for, only whether it’s online / locked.

Anyway the result is the same for the VM whether it was a VHD or a VHDX. An error when you start it up.

[Window Title]
Hyper-V Manager

[Main Instruction]
An error occurred while attempting to start the selected virtual machine(s).

[Content]
‘DidierTest06’ failed to start.

Synthetic SCSI Controller (Instance ID 92ABA591-75A7-47B3-A078-050E757B769A): Failed to Power on with Error ‘The chain of virtual hard disks is corrupted. There is a mismatch in the virtual sizes of the parent virtual hard disk and differencing disk.’.

Virtual disk ‘C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks\RuinFixedVHD_8DFF476F-7A41-4E4D-B41F-C639478E3537.avhd’ failed to open because a problem occurred when attempting to open a virtual disk in the differencing chain, ‘C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks\RuinFixedVHD.vhd’: ‘The size of the virtual hard disk is not valid.’.

image

You might want to delete the checkpoint but the merge will only succeed for the virtual disks that have not been expanded. You actually don’t need to do this now. It’s better if you don’t, as it saves you some stress and extra work. You could remove the expanded virtual disks from the VM. It will boot but in many cases the missing data on those disks is very bad news. But at least you’ve proven the root cause of your problems.

If you inspect the AVHD/AVHDX file you’ll get an error that states

The differencing virtual disk chain is broken. Please reconnect the child to the correct parent virtual hard disk.

image

However attempting to do so will fail in this case.

Failed to set new parent for the virtual disk.

The Hyper-V Virtual Machine Management service encountered an unexpected error: The chain of virtual hard disks is corrupted. There is a mismatch in the virtual sizes of the parent virtual hard disk and differencing disk. (0xC03A0017).

image

Is there a fix?

Let’s say you don’t have a backup (shame on you). So now what? Make copies of the VHDX/AVHDX or VHD/AVHD files and safeguard those. You can work on the copies or on the original files. I’ll just use the originals as this blog post is already way too long. Note that some extra disk space and speed come in very handy now. You might even copy them off to a lab server. It takes more time but at least you’re not working on a production host then.
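As a minimal sketch of that safeguarding step, assuming the VM is off and that you have a volume with enough free space: the destination folder E:\SafeCopy\DidierTest06 is a made-up example, so substitute your own.

```powershell
# Folder holding the VM's virtual disks (path taken from the examples in this post)
$source   = 'C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks'
# Hypothetical destination, pick any volume with enough free space
$safeCopy = 'E:\SafeCopy\DidierTest06'

# Create the destination folder if it does not exist yet
New-Item -ItemType Directory -Path $safeCopy -Force | Out-Null

# Copy the parent disks and their differencing disks, leaving the originals untouched
Get-ChildItem -LiteralPath $source -File |
    Where-Object { $_.Extension -in '.vhd', '.avhd', '.vhdx', '.avhdx' } |
    Copy-Item -Destination $safeCopy
```

Check that the copies have the same size as the originals before you touch anything else.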

Working on the original virtual disk files (VHD/AVHD and / or VHDX/AVHDX)

If you know the original size of the VHDX before you expanded it you can shrink it to exactly that. If you don’t there’s PowerShell to the rescue if you want to find out the minimum size.

image
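A minimal sketch of such a query, assuming the Hyper-V PowerShell module and a VHDX parent disk. The file name RuinFixedVHDX.vhdx is a hypothetical example, not a file from my lab.

```powershell
# Hypothetical path to the expanded parent VHDX
$parent = 'C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks\RuinFixedVHDX.vhdx'

# Size is the current virtual size, MinimumSize is the smallest size the VHDX can be shrunk to
Get-VHD -Path $parent |
    Select-Object Path, VhdFormat, VhdType, Size, MinimumSize
```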

But even better, you can shrink it to its minimum size. It’s a parameter!

Resize-VHD -Path "C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks\RuinFixedVHD.vhd" -ToMinimumSize

Now you’re not home yet. If you restart the VM right now it will fail … with the following error:

‘DidierTest06’ failed to start. (Virtual machine ID 7A54E4DB-7CCB-42A6-8917-50A05354634F)

‘DidierTest06’ Synthetic SCSI Controller (Instance ID 92ABA591-75A7-47B3-A078-050E757B769A): Failed to Power on with Error ‘The chain of virtual hard disks is corrupted. There is a mismatch in the identifiers of the parent virtual hard disk and differencing disk.’ (0xC03A000E). (Virtual machine ID 7A54E4DB-7CCB-42A6-8917-50A05354634F)

image

What you need to do is reconnect the AVHDX to its parent and choose to ignore the ID mismatch. You can do this via Edit Disk in Hyper-V Manager or in PowerShell. For more information on manually merging & repairing checkpoints see my blogs on this subject here. In this post I’ll just show the screenshots as a walkthrough.

image

image

image

image

image
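For those who prefer PowerShell over the wizard, the reconnect can be sketched with Set-VHD and its -IgnoreIdMismatch switch. Both file names below are hypothetical placeholders, so use the actual parent and differencing disk of your VM.

```powershell
# Hypothetical paths: the shrunken parent VHDX and the differencing disk that points to it
$parent = 'C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks\RuinFixedVHDX.vhdx'
$child  = 'C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks\RuinFixedVHDX_checkpoint.avhdx'

# Reconnect the child to its parent and accept that the identifiers no longer match
Set-VHD -Path $child -ParentPath $parent -IgnoreIdMismatch

# Verify that the chain resolves again
Get-VHD -Path $child | Select-Object Path, ParentPath, Size
```

Only ignore the ID mismatch when you are certain this really is the parent of that differencing disk. Pointing a child at the wrong parent gives you a chain that opens but contains garbage.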

Once that’s done your VHDX is good to go.

A VHD you can’t shrink with the inbox tools. There is however a free command line tool that can do that, named VHDTool.exe. The original is hard to find on the web so here is the installer if you need it. You only need the executable, which is actually portable, so don’t install this on a production server. It has a repair switch to deal with just this occurrence!

Here’s an example of my lab …

D:\SysAdmin>VhdTool.exe /repair "C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks\RuinFixedVHD.vhd" "C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks\RuinFixedVHD_8DFF476F-7A41-4E4D-B41F-C639478E3537.avhd"

image

That’s it for the VHD …

You’re back in business! All that’s left to do is get rid of the checkpoints. So you delete them. If you wanted to apply them and get rid of the delta, you could actually have just removed the disks, re-added the VHD/VHDX and been done with it. But in most of these scenarios you want to keep the delta as you most probably didn’t even realize you still had checkpoints around. Zero data loss ;-)
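Deleting the checkpoints can also be done from PowerShell. This is a minimal sketch, assuming the VM name from my lab and that you want to remove every checkpoint of that VM.

```powershell
$vmName = 'DidierTest06'

# List the checkpoints first so you know what you are about to delete
Get-VMSnapshot -VMName $vmName |
    Select-Object Name, CreationTime, ParentSnapshotName

# Delete all checkpoints of the VM, which makes Hyper-V merge the differencing disks into the parent
Get-VMSnapshot -VMName $vmName | Remove-VMSnapshot
```

Give the merge time to finish and check that the AVHD/AVHDX files are gone before you consider the job done.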

Conclusion

Save yourself the stress, hassle and possibly the expense of hiring an expert. How? Please do not expand a VHD or VHDX of a virtual machine that has checkpoints. It will cause boot issues with the expanded virtual disk or disks! You will be in a stressful, painful pickle that you might not get out of if you make the wrong decisions and choices!

As a closing note, you must have backups and restores that you have tested. Do not rely on your smarts and creativity or that of others, let alone luck. Luck runs out. Options run out. Even for the best and luckiest of us. Veeam has saved my proverbial behind a few times already.
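One way to avoid the whole situation is to check for differencing disks before you resize. The function below is only a sketch under a few assumptions: Test-VhdHasChildDisk is a name I made up, it only looks at VMs registered on the local host, and it compares paths as plain strings, so a cluster with VMs on other nodes or mixed path notations needs more work.

```powershell
function Test-VhdHasChildDisk {
    # Hypothetical helper: returns $true when a disk chain of a local VM has $Path as a parent
    param([Parameter(Mandatory)][string]$Path)

    $target = (Resolve-Path -LiteralPath $Path).Path

    foreach ($drive in (Get-VM | Get-VMHardDiskDrive)) {
        $current = $drive.Path
        # Walk the chain from the disk the VM currently uses down to the base disk
        while ($current) {
            $vhd = Get-VHD -Path $current -ErrorAction SilentlyContinue
            if (-not $vhd) { break }          # pass-through disk or unreadable file
            if ($vhd.ParentPath -ieq $target) { return $true }
            $current = $vhd.ParentPath
        }
    }
    return $false
}

$disk = 'C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks\RuinFixedVHD.vhd'
if (Test-VhdHasChildDisk -Path $disk) {
    Write-Warning "A checkpoint depends on '$disk'. Do not resize it."
} else {
    Resize-VHD -Path $disk -SizeBytes 15GB
}
```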

Reading Windows Server 2016 Hyper-V Configuration Files

By now a lot of you will have heard that in Windows Server 2016 virtual machines have a new configuration file format. The new configuration files use the .VMCX file extension for virtual machine configuration data and the .VMRS file extension for runtime state data. These are binary, so no more XML and VSV/BIN files.

image

The new format was designed with a couple of goals in mind:

  • Increase the efficiency of reading and writing virtual machine configuration data. When you start scaling to thousands, tens of thousands and more VMs the overhead of parsing XML begins to weigh on performance.
  • Reduce the potential for data corruption in the event of a storage failure. It has logging included. So after making the virtual hard disks more resilient to corruption (thanks to VHDX, introduced in Windows Server 2012), the configuration files have followed.
  • Perhaps the reason with the least emphasis, but it also prevents us from mucking around in the XML files. Yes, we didn’t just read them. We edited them, we “experimented”, we copied them and fixed the security settings … all good when you accept the big boys rule of “it’s your responsibility and not supported”, but I’m afraid Microsoft support got a fair share of calls related to this.

Note: With PowerShell & Compare-VM we can still correct inconsistencies on a file. That’s what Compare-VM is all about => getting the info you need to correct issues so you can register or import a virtual machine.

Let’s take a peek under the hood at a VMCX file. As you can see it’s not human readable.

image

I’m fine with that as long as I have the tools I need to get the information I need. The Hyper-V Manager GUI is one of those tools. But it doesn’t always show me everything I might need. An example of this is when things have gone wrong and I need to manually merge snapshots. There is valuable information in the VHDX/AVHDX configuration in the virtual machine configuration that gives me clues and pointers as what to do.

image

So how do I deal with reading Windows Server 2016 Hyper-V configuration files now? It was one of my first questions to the Product Group actually. The answer I got from Ben Armstrong was: PowerShell and Compare-VM.  He has a blog post on this here.

The gist is that we create a temp VM object using Compare-VM and get the configuration information out that way.

$TempVM = (Compare-VM -Copy -Path $VMConfig -GenerateNewID).VM

$TempVM | select *

In my example I used a VM with some checkpoints and as you can see we can get the information on the parent checkpoint name (“HELLO”) and everything else we desire.

image

image

By querying $TempVM a bit more we get the hard drive info and see which is the current AVHDX we’re running on.

image
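A sketch of what such a follow-up query could look like, assuming $TempVM was created with Compare-VM as shown earlier. The selected property names are the ones the Hyper-V module exposes on the VM and hard disk drive objects.

```powershell
# Which checkpoint is the running state based on?
$TempVM | Select-Object VMName, ParentSnapshotName

# Which virtual disk file is each drive currently using (an AVHDX when there are checkpoints)?
$TempVM.HardDrives |
    Select-Object ControllerType, ControllerNumber, ControllerLocation, Path
```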

Hope this gives you some pointers.

I’m a Veeam Vanguard 2015

Veeam announced its Veeam Vanguard program last month while I was on vacation. I am honored to have been nominated as 1 of 31 professionals worldwide. Veeam states the following, which I consider to be a great compliment:

These individuals have been nominated as Veeam Vanguards for 2015. A Veeam Vanguard represents our brand to the highest level in many of the different technology communities in which we engage. These individuals are chosen for their acumen, engagement and style in their activities on and offline.

veeam_vanguard

Rick Vanover is spearheading this program together with the Veeam Product Strategy Team and the entire company is behind this initiative, as you can read here: What is the Veeam Vanguard Program?

Veeam now has a program like the VMware vExpert, Cisco Champion and Microsoft MVP programs. I’m honored to be nominated and I’m sure Veeam will execute this well as I have one very consistent experience with both Veeam employees and products: quality and dedication to deliver the best possible solutions for their customers. The fact that I’ve been nominated makes me feel appreciated by people whom I respect for their professionalism and skills. As I’m comfortable acting as the tip of the spear implementing technologies at the organizations I support, I kind of feel that being a Veeam Vanguard is a great fit :-)

I have shared insights, ideas and feedback with Veeam before and I’m sure we’ll get plenty of opportunities to do even more of that in the future.

DELL SonicWALL Site-to-Site VPN Options With Azure Networking

The DELL SonicWALL product range supports both policy based and route based VPN configurations. Specifically for Azure they have a configuration guide out there that will help you configure either.

Technically, networking people prefer to use route based configuration. It’s more flexible to maintain in the long run. As life is not perfect and we do not control the universe, policy based is also used a lot. SonicWALL used to be on the supported list for both Static and Dynamic routing Azure VPN connections. According to this thread it was taken off because some people had reliability issues with performance. I hope this gets fixed soon in a firmware release. Having that support is good for DELL as a lot of people consult that list when deciding what to buy and there are not too many vendors on it in the more budget friendly range as it is. The reference in that thread to DELL stating that Route-Based VPN using Tunnel Interface is not supported for third party devices is true but a bit silly, as that’s a blanket statement in the VPN industry, where there is an unwritten rule that you use route based when the devices are of the same brand and you control both end points. When that isn’t the case, you go for a policy based VPN, even if that’s less flexible.

My advice is that you should test what works for you, make your choice and accept the consequences. In the end it determines only who’s going to have to fix the problem when it goes wrong. I’m also calling on DELL to sort this out fast and well.

A lot of people get confused when starting out with VPNs. Add Azure into the equation, where we also get confused whilst climbing the learning curve, and things get mixed up. So here’s a small recap of the state of Azure VPN options:

  • There are two ways to create a Site-to-Site VPN between an Azure virtual network (and all the subnets it contains) and your on premises network (and the subnets it contains).
    1. Static Routing: This is the one that will work with just about any device that supports policy based VPNs in any reasonable way, which includes a VPN with Windows RRAS.
    2. Dynamic Routing: This one is supported by a lot fewer vendors, but that doesn’t mean it won’t work. Do your due diligence. This also works with Windows RRAS.

Note: Microsoft has now added a 3rd option to its Azure VPN Gateway offerings, the High Performance VPN gateway. For all practical purposes it’s dynamic routing, but a more scalable version. Note that this does NOT support static routing.

The confusion is partially due to Microsoft Azure, network industry and vendor terminology differing from each other. So here’s the translation table for DELL SonicWALL & Azure

Dynamic Routing in Azure Speak is a Route-Based VPN in SonicWALL terminology and is called Tunnel Interface in the “Policy Type” settings for a VPN.

image

Static Routing in Azure Speak is a Policy-Based VPN in SonicWALL terminology and is called Site-To-Site in the “Policy Type” settings for a VPN.

image

  • You can only use one. So you need to make sure you don’t mix the two across both sites, as that won’t work for sure.
  • Only a Pre-Shared Key (PSK) is currently supported for authentication. There is no support yet for certificate based authentication at the time of writing.

Also note that you can have 10 tunnels in a standard Azure site-to-site VPN, which should give you enough wiggle room for some interesting scenarios. If not, scale up to the high performance Azure site-to-site VPN or move to ExpressRoute. In the screenshot below you can see I have 3 tunnels to Azure from my home lab.

image

I hope this clears up any confusion around that subject!