Concluding My Summit, Conference & Community Engagements for 2014

After Redmond (MVP Global Summit 2014), which was a great experience, I flew to Berlin to attend and speak at the Microsoft Technical Summit 2014 on “What’s New In Windows Server 2012 R2 Clustering”. Germany has a seriously engaged ITPro & Dev scene, that’s for sure, and the session room was packed! Afterwards some interesting questions popped up in the hallways. That’s great, as questions really make us think about technologies and solutions from other viewpoints and perspectives.

image

After Berlin I was off to Experts Live 2014 in Ede (The Netherlands) where I presented on “The Capable & Scalable Cloud OS”. The talk went well and I had a great crowd attending, with whom I had some great chats after the session.

image

That concluded the third leg of my international road tour where I invest in myself, the community & the people I work with. Never ever stop learning :-). Normally this also concludes my traveling schedule for 2014 unless I’m needed/requested somewhere to help out. Being an MVP is about sharing in the community. The only way to prosper is to share the knowledge, the experience and the wealth. It provides for a healthy ecosystem from which we all reap the benefits. This should be promoted and facilitated. There is too much expertise & knowledge not being leveraged because it’s economically unfeasible, and that’s a waste when people are screaming for IT skills. In a war for talent, any waste is surely very counterproductive?

Golden Nuggets: Windows Server 2012 R2 Failover Cluster CSV Placement Policy

Some enhancements only become truly evident to people when they see them in action. For many features this means something needs to go wrong before they kick in. Others are more visible during normal operations. This is the case with the CSV enhancements in Windows Server 2012 R2 Failover Clustering.

One golden nugget here is the CSV placement policy (which really shines in combination with SOFS/Storage Spaces). This will spread ownership of the CSVs amongst the cluster nodes to ensure a balanced distribution. In a failover cluster, one node is the “coordinator node” (owner) for a CSV. The coordinator node owns the physical disk resource that is associated with a logical unit (LUN). All I/O operations for the file system on that LUN go through the coordinator node. In previous versions there is no automatic rebalancing of coordinator node assignment. This means that all LUNs could potentially be owned by the same node. In Storage Spaces & SOFS scenarios this becomes even more important.

The benefits

  • It helps all nodes carry their share of the workload as it load balances the disk I/O.
  • Failovers of CSV owners are potentially quicker and more predictable/consistent as an even distribution ensures that no one node owns a disproportionate number of CSVs.
  • When a node loses storage access, the number of CSVs that go into redirected mode is potentially lower as they are evenly distributed. In an unbalanced cluster it could be all of them in a worst-case scenario.
  • When using SOFS with Storage Spaces it makes sure the Storage Spaces Ownership is distributed fairly.

When does it happen

  • Each time a node leaves or joins the cluster. This means you don’t need to intervene manually or via PowerShell to get an even distribution. This goes both for nodes leaving the cluster and for new nodes being added. The new node will get a CSV assigned if there is a surplus on one of the existing nodes.
  • The process also works when you start a failover cluster after it has been shut down.

When customers see this in action (it’s most obvious when they add a node, as then they are normally watching) they generally smile as the cluster does its job getting the best possible results out of their hardware.
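
To make the idea concrete, here is a small Python sketch of what an even redistribution of CSV ownership can look like when a node joins or leaves. It is a toy model written for illustration only: the function name, the node and CSV names and the tie-breaking rules are all assumptions, and it is not the algorithm the cluster service actually uses.

```python
def rebalance(owners, nodes):
    """Toy model: spread CSV ownership evenly over the current nodes.

    owners: {csv_name: owner_node} before the membership change.
    nodes:  node names that are in the cluster after the change.
    Returns a new {csv_name: owner_node} mapping. This only illustrates
    the idea of balancing; it is NOT the real placement logic.
    """
    nodes = sorted(set(nodes))
    if not nodes:
        raise ValueError("at least one node is required")

    placement = {}
    load = {node: 0 for node in nodes}

    def least_loaded():
        return min(nodes, key=lambda n: (load[n], n))

    def most_loaded():
        return max(nodes, key=lambda n: (load[n], n))

    # CSVs whose owner is still a cluster member stay where they are.
    for csv in sorted(owners):
        if owners[csv] in load:
            placement[csv] = owners[csv]
            load[owners[csv]] += 1

    # CSVs whose owner left the cluster go to the least loaded node.
    for csv in sorted(owners):
        if csv not in placement:
            target = least_loaded()
            placement[csv] = target
            load[target] += 1

    # Move surplus CSVs until no node owns more than one CSV above any other.
    while load[most_loaded()] - load[least_loaded()] > 1:
        source, target = most_loaded(), least_loaded()
        csv = max(c for c, n in placement.items() if n == source)
        placement[csv] = target
        load[source] -= 1
        load[target] += 1

    return placement


if __name__ == "__main__":
    # Hypothetical lab: node A owns all four CSVs, then node B joins.
    before = {"CSV1": "A", "CSV2": "A", "CSV3": "A", "CSV4": "A"}
    print(rebalance(before, ["A", "B"]))
```

The sketch moves as few CSVs as it can, which is a design choice of this example rather than a documented behaviour of the cluster.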

Windows Server 2012 R2 Clustering brings improved CSV diagnosability

Cluster Shared Volumes (CSVs) can go into different redirected access modes for several reasons. A lot of people get (or got) worried when they see “redirected access” in the GUI. Most of the time, however, this is due to normal operations such as backups or maintenance (defragmentation), and not only to losing disk access.

To cut down on unneeded troubleshooting, which sometimes led to real issues, calls to MSFT support and so forth, this state is hidden from the Failover Clustering GUI in Windows Server 2012 R2. OK, so goal achieved, but how do we now troubleshoot and view redirected access that might indicate the presence of real issues? The answer to that is the Get-ClusterSharedVolumeState PowerShell cmdlet. It displays the state of the CSVs on a per node basis for a cluster. You’ll see the type of the I/O (Direct, File System Redirected or Block Redirected), whether the CSV is completely unavailable, as well as the reason.

This is what the output looks like on a two node cluster where node A has lost its storage path or paths (MPIO) to the CSV. You’ll see that both CSVs are in redirected access. Not only that, but you can see what type (block redirected) and why (no disk connectivity).
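
As a rough illustration of how that per node state could feed monitoring, the Python sketch below separates redirected access you asked for from redirected access you did not. Everything in it is an assumption made for the example: the records are hand-written sample data, the dictionary keys and value spellings (such as "UserRequest" and "NoDiskConnectivity") only mimic what the cmdlet reports, and treating a user request as the sole harmless reason is a simplification.

```python
# Reasons this example treats as expected (an assumption, adjust to taste).
EXPECTED_REASONS = {"UserRequest"}


def triage(records):
    """Return the records that deserve a closer look.

    Each record is a dict with the hypothetical keys 'Name', 'Node',
    'StateInfo', 'FileSystemRedirectedIOReason' and
    'BlockRedirectedIOReason'.
    """
    findings = []
    for record in records:
        state = record["StateInfo"]
        if state == "Direct":
            continue  # direct I/O is the healthy, normal case
        if state == "BlockRedirected":
            reason = record.get("BlockRedirectedIOReason")
        elif state == "FileSystemRedirected":
            reason = record.get("FileSystemRedirectedIOReason")
        else:
            reason = None  # e.g. unavailable: always worth a look
        if reason not in EXPECTED_REASONS:
            findings.append((record["Name"], record["Node"], state, reason))
    return findings


if __name__ == "__main__":
    # Hand-written sample data, not real cmdlet output.
    sample = [
        {"Name": "CSV1", "Node": "NodeA", "StateInfo": "BlockRedirected",
         "BlockRedirectedIOReason": "NoDiskConnectivity"},
        {"Name": "CSV1", "Node": "NodeB", "StateInfo": "Direct"},
        {"Name": "CSV2", "Node": "NodeB", "StateInfo": "FileSystemRedirected",
         "FileSystemRedirectedIOReason": "UserRequest"},
    ]
    for finding in triage(sample):
        print(finding)
```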

image

Pretty neat and clear. I love this functionality, by the way, and it’s why I’m leveraging 10Gbps Ethernet extensively to make sure that CSV traffic gets the bandwidth & latency it needs to handle what it has to. Once you realize it leverages SMB3, which provides SMB Multichannel and SMB Direct, you know it will get the job done for you in your time of need.

While this is happening, this is what you’ll see in the GUI:

image

Nothing is going on … or so it would seem, so a bit of monitoring and alerting would be of use here. The good news is that finding out what’s up is very straightforward now.

There is still one case where you’ll see in the GUI that the CSV is in redirected access mode, and that is when you’ve put it there yourself via the GUI

image

or via PowerShell for maintenance reasons.

image

As you can see, the icon has changed to a networked disk one and it states “Redirected Access”. With Get-ClusterSharedVolumeState the output looks like this.

image

You’ll always see warnings in the event logs.

image

So monitor those with SCOM or another tool that suits your taste and you’ll be in good shape to react when it’s needed. And you now know how to find out what’s going on.

First experiences with a rolling cluster upgrade of a lab Hyper-V Cluster (Technical Preview)

Introduction

In vNext we have gotten a long-awaited & very welcome new capability: rolling cluster upgrades. For the Hyper-V role this is a 100% zero-downtime experience. The only step that will require some downtime is the upgrade of the virtual machine configuration files to vNext (version 5 to 6), as the VM has to be shut down for this.

How to

The process for a rolling upgrade is so straightforward I’ll just give you a quick bullet list of the first part of the process:

  • Evacuate the workload from the cluster node you’re going to upgrade
  • Evict the node to upgrade to vNext from the cluster
  • Upgrade the node (in-place upgrades are not supported, but in your lab you can get away with it)
  • Add the upgraded node to the cluster
  • Rinse & repeat until all nodes have been upgraded (that can take a while with larger clusters)

Please note that all administration you do on a cluster in mixed mode should be done from a node running vNext or a system running Windows 10 with the vNext RSAT installed.

Once you’ve upgraded all nodes in the cluster, the situation you’re in is basically that you’re running a Windows Server vNext Hyper-V cluster at cluster functional level 8 (W2K12R2). The next step is to upgrade to 9, which is vNext. No, there is no 10 yet in server ;-)

You do this by executing the Update-ClusterFunctionalLevel cmdlet. This is an online process. Again, do this from a node running vNext or a system running Windows 10 with the vNext RSAT installed. Note that this is where you commit to the vNext level for the cluster. That’s where you want to go, but you get to decide when. Once you’ve done this you can’t go back to W2K12R2. As long as you’re running cluster functional level 8, however, you can reverse the entire process. Talk about having options! I like having options, just ask Carsten Rachfahl (@hypervserver), he’ll tell you it’s one of my mantras.
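
The commit point described above can be captured in a tiny Python model. It is only a sketch of the rules as I describe them in this post: the class and method names are made up for the example, the levels 8 and 9 are the ones mentioned in the text, and the check that every node must run vNext before the level can be raised is my reading of the process rather than something the model verifies against a real cluster.

```python
class RollingUpgradeModel:
    """Toy model of a rolling cluster upgrade (hypothetical names)."""

    W2K12R2 = 8  # cluster functional level of Windows Server 2012 R2
    VNEXT = 9    # cluster functional level of vNext

    def __init__(self, node_names):
        # Every node starts out on W2K12R2.
        self.nodes = {name: self.W2K12R2 for name in node_names}
        self.functional_level = self.W2K12R2

    @property
    def mixed_mode(self):
        """True while the nodes run different OS versions."""
        return len(set(self.nodes.values())) > 1

    def upgrade_node(self, name):
        """Evacuate, evict, reinstall and re-add a node (all abstracted away)."""
        self.nodes[name] = self.VNEXT

    def roll_back_node(self, name):
        """Going back is only possible while the cluster is still at level 8."""
        if self.functional_level != self.W2K12R2:
            raise RuntimeError("functional level already raised, no way back")
        self.nodes[name] = self.W2K12R2

    def update_functional_level(self):
        """The point of no return: requires all nodes to run vNext."""
        if any(version != self.VNEXT for version in self.nodes.values()):
            raise RuntimeError("not all nodes have been upgraded yet")
        self.functional_level = self.VNEXT


if __name__ == "__main__":
    lab = RollingUpgradeModel(["node1", "node2"])
    lab.upgrade_node("node1")
    print("mixed mode:", lab.mixed_mode)
    lab.upgrade_node("node2")
    lab.update_functional_level()
    print("functional level:", lab.functional_level)
```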

image

When this goes well you can easily check the cluster functional level as follows:

image

When this is done you can upgrade the VM configuration by running the Update-VMConfigurationVersion cmdlet. This is an offline process where the VMs you’re updating have to be shut down. You can do this for just one VM, all of them or anything in between. This is when you decide you’re committing to all the goodness vNext brings you. But the fact that you have some time before you need to do it means you can easily get those machines to run smoothly on a W2K12R2 cluster in case you need to roll back. Remember, options are good!

Doing so updates the VM version from 5 to 6 and enables new Hyper-V features (hit F5 a lot or reopen Hyper-V Manager to see the value change).

image

image

Note: If in the lab you’re running some VMs on a cluster node that are not highly available (i.e. they’re not clustered), they cannot be updated until the cluster functional level has been upgraded to version 9.