Hyper-V Cluster Nodes Upgrade: Zero Down Time With Intel VT FlexMigration

Well the oldest Hyper-V cluster nodes are 3 + years old. They’ve been running Hyper-V clusters since RTM of Hyper-V for Windows 2008 RTM. Yes you needed to update the “beta” versions to the RTM version of Hyper-V that came later Smile Bit of a messy decision back then but all in all that experience was painless.

These nodes/clusters have been upgraded to W2KR2 Hyper-V clusters very soon after that SKU went RTM but now they have reached the end of their “Tier 1” production life. The need for more capacity (CPU, memory) was felt. Scaling out was not really an option. The cost of fiber channel cards is big enough but fiber channel switch ports need activation licenses and the cost for those border on legalized extortion.

So upgrading to more capable nodes was the standing order. Those nodes became DELL R810 servers. The entire node upgrade process itself is actually quite easy. You just live migrate the virtual machines over to clear a host that you then evict from the cluster. You recuperate the fiber channel HBAs to use in the new node that you than add to the cluster. You just rinse and repeat until you’re done with all nodes. Thank you Microsoft for the easy clustering experience in Windows 2008 (R2)! Those nodes now also have 10Gbps networking kit to work with (Intel X520 DA SPF+).

If you do your home work this process works very well. The cool thing there is not much to do on the SAN/HBA/Fiber Switch configuration side as you recuperate the HBA with their World Wide Names. You just need to updates some names/descriptions to represent the new nodes. The only thing to note is that the cluster validation wizard nags about inconsistencies in node configuration, service packs. That’s because the new nodes are installed with SP1 integrated as opposes to the original ones having been upgraded to SP1 etc.

The beauty is that by sticking to Intel CPUs we could live migrate the virtual machines between nodes having Intel E5430 2.66Ghz CPUs (5400-series "Harpertown") and those having the new X7560 2.27Ghz CPUs (Nehalem EX “Beckton”). There was no need to use the “Allow migration to a virtual machine with a different processor” option.  Intel’s investment (and ours) in VT FlexMigration is paying of as we had a zero down time upgrade process thanks to this.


You can read more about Intel VT FlexMigration here

And in case you’re wondering. Those PE2950 III are getting a second life. Believe it or not there are software vendors that don’t have application live cycle management, Virtualization support or roadmaps to support. So some hardware comes in handy to transplant those servers when needed. Yes it’s 2011 and we’re still dealing with that crap in the cloud era. I do hope the vendors of those application get the message or management cuts the rope and lets them fall.