I’ve teamed up with Altaro as a guest in one of their Hyper-V webinars to discuss trouble shooting Microsoft Hyper-V. I’ll be joining my fellow Microsoft Cloud and Datacenter Management Andy Syrewicze for this in a virtual cross Atlantic team
We’ll give you some pointers on good practices and where to start when things go south. We’ll also add some real world examples to the mix to spice things up. As you can imagine these issues are often a lot more fun after the facts and after having solved them. If you work in IT long enough you know that one day (or night) trouble will come knocking and that the stress related with that can be gut wrenching.
The good news is that it isn’t the end of the world. In well designed and managed environment you can minimize downtime and you do not have to suffer through unrecoverable losses as long as you’re well prepared. We hope this webinar will help the attendees prevent those problems in their environment. If not, it will provide you with some insight on how to prepare for and handle them.
The time has been chosen to make it feasible for as many many people across all time zones to attend. So, go on, register, it’s free and both Andy and I are looking forward to sharing our trouble shooting tales from the trenches.
One night I was doing some maintenance on a Hyper-V cluster and I wanted to Pause and drain one of the nodes that was up next for some tender loving care. But I was greeted by some messages:
[Window Title] Resource Status
[Main Instruction] The requested operation cannot be completed because a resource has locked status.
[Content] The requested operation cannot be completed because a resource has locked status.
Strange, the cluster is up and running, none of the other nodes had issues and operational wise all VMs are happy as can be. So what’s up? Not to much in the error logs except for this one related to a backup. Aha …We fire up disk part and see some extra LUNs mounted + using “vssadmin list writers“ we find:
Bingo! Hello old “friend”, I know you! The Microsoft Hyper-V VSS Writer goes into an error state during the making of hardware snapshots of the LUNs due to almost or completely full partitions inside the virtual machines. Take a look at this blog post on what causes this and how to fix fit. As a result we can’t do live migrations anymore or Pause/Drain the node on which the hardware snapshots are being taken.
And yes, after fixing the disk space issue on the VM (a SDT who’s pumped the VM disks 99.999% full) the Hyper-V VSS writer get’s out of the error state and the hardware provider can do it’s thing. After the snapshots had completed everything was fine and I could continue with my maintenance.