Windows Server 2012 R2 Cluster Reset Recent Events With PowerShell

I blogged before about the fact that since Windows Server 2012 we have the ability to reset the recent events shown, so that the state of the cluster is squeaky clean with no warnings or errors. You can read up on this in Windows Server 2012 Cluster Reset Recent Events Feature.

You can also do this in PowerShell like in the example below:
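A minimal sketch of what that can look like, assuming the FailoverClusters module is available. The function name Reset-ClusterRecentEvents and the cluster name "MyCluster" are placeholders for illustration, not part of any module:

```powershell
# Hypothetical helper; $Cluster is the object returned by Get-Cluster.
function Reset-ClusterRecentEvents {
    param([Parameter(Mandatory = $true)] $Cluster)

    # Events older than this timestamp are no longer shown as "recent".
    $Cluster.RecentEventsResetTime = Get-Date

    # Update() is reported to exist on Windows Server 2012 R2 only, so call
    # it only when the object actually exposes such a method.
    if ($Cluster.PSObject.Methods['Update']) { $Cluster.Update() }

    # Emit the value as the object currently reports it.
    $Cluster.RecentEventsResetTime
}

# Usage against a real cluster:
# $MyCluster = Get-Cluster -Name "MyCluster"
# Reset-ClusterRecentEvents -Cluster $MyCluster
```

Passing the cluster object in, rather than calling Get-Cluster inside the function, keeps the connection you already have and makes the helper easy to try against a stand-in object.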

As you may notice, the RecentEventsResetTime is displayed in UTC when read from the cluster after connecting to it. Right after you set it, it displays the time in your local time zone, right up until you connect to the cluster again. We demonstrate this in the 2 screenshots below (I’m at GMT+1).

image

image

This comes in handy when writing test, comparison & demo scripts. Often you do things with the network that cause connectivity to be lost, such as when a NIC gets reset (disabled/enabled). Also, when something fails as part of the demo or test scripts, it’s nice to start the rerun or the next part of the demo/test with a clean cluster GUI when you’re showcasing stuff. Unfortunately an already open GUI doesn’t refresh this setting if the reset is not done in the GUI, so you need to open a new one. For scripting you don’t have this issue.

EDIT: In Windows Server 2012 R2 you can use $MyCluster.Update() to reflect the new value of RecentEventsResetTime in UTC without having to reconnect to the cluster. In Windows Server 2012 this Update method isn’t available, but the refresh seems to happen automatically.

Preventing Live Migration Over SMB Starving CSV Traffic in Windows Server 2012 R2 with Set-SmbBandwidthLimit

One of the big changes in Windows Server 2012 R2 is that all types of Live Migration can now leverage SMB 3.0 if the right conditions are met. That means that Multichannel & SMB Direct (RDMA) come into play more often and simultaneously. Shared Nothing Live Migration & certain forms of Storage Live Migration are often a lot more planned due to their nature, so one can mitigate the risk by planning. Good old standard Live Migration of virtual machines, however, is often less planned. It can be triggered by Cluster Aware Updating, by evacuating a host for hardware maintenance, or by Dynamic Optimization, which means it’s often automated as well. As we have demonstrated many times, Live Migration can (easily) fill 20Gbps of bandwidth. If you are sharing 2*10Gbps NICs for multiple purposes like CSV, LM, etc., Quality of Service (QoS) comes into play. There are many ways to achieve this, but in our example here I’ll be using DCB for SMB Direct with RoCE.

New-NetQosPolicy "CSV" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 4
Enable-NetQosFlowControl -Priority 4
New-NetQosTrafficClass "CSV" -Priority 4 -Algorithm ETS -Bandwidth 40
Enable-NetAdapterQos -InterfaceAlias "SLOT41-CSV1+LM2"
Enable-NetAdapterQos -InterfaceAlias "SLOT42-LM1+CSV2"
Set-NetQosDcbxSetting -Willing $False

Now as you can see I leverage 2*10Gbps NICs, non-teamed as I want RDMA. I have failover/redundancy/bandwidth aggregation thanks to SMB 3.0. This works like a charm. But when leveraging Live Migration over SMB in Windows Server 2012 R2 we note that the LM traffic also goes over port 445 and as such is dealt with by the same QoS policy on the server & in the switches (DCB/PFC/ETS). So when both CSV & LM are going on, how does one prevent LM from starving CSV traffic, for example? Especially in Scale-Out File Server scenarios this could be a real issue.

The Solution

To prevent LM traffic & CSV traffic from hogging all the SMB bandwidth and ruining the SOFS party, Microsoft introduced some new capabilities in Windows Server 2012 R2. In the SMBShare module you’ll find:

  • Set-SmbBandwidthLimit
  • Get-SmbBandwidthLimit
  • Remove-SmbBandwidthLimit

image

To use this you’ll need to install the Feature called SMB Bandwidth Limit via Server Manager or using PowerShell:  Add-WindowsFeature FS-SMBBW

You can limit SMB bandwidth for three categories: VirtualMachine (storage IO to a SOFS), LiveMigration & Default (all the rest). In the example below we cap Live Migration at roughly 8Gbps.

Set-SmbBandwidthLimit -Category LiveMigration -BytesPerSecond 1000MB
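The arithmetic behind that value is worth spelling out, because PowerShell's MB suffix is binary (1MB = 1,048,576 bytes) while link speeds are quoted in decimal bits per second. The sketch below uses a hypothetical helper, ConvertTo-BytesPerSecond, which is not part of the SMBShare module:

```powershell
# Hypothetical helper: convert a target rate in Gbps (decimal, as NIC
# speeds are quoted) into a bytes-per-second figure.
function ConvertTo-BytesPerSecond {
    param([Parameter(Mandatory = $true)][double] $Gbps)
    # bits per second divided by 8 gives bytes per second
    [uint64][math]::Round($Gbps * 1e9 / 8)
}

ConvertTo-BytesPerSecond -Gbps 8   # 1000000000
1000MB                             # 1048576000
1250MB                             # 1310720000

# Usage on a host with the SMB Bandwidth Limit feature installed:
# Set-SmbBandwidthLimit -Category LiveMigration -BytesPerSecond (ConvertTo-BytesPerSecond -Gbps 8)
```

So 1000MB works out to about 8.39Gbps and 1250MB to about 10.49Gbps, which is close enough to the round figures quoted here for capping purposes.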

So there you go, we can prevent Live Migration from hogging all the bandwidth. Jose Barreto mentions this capability in his recent blog post on Windows Server 2012 R2 Storage: Step-by-step with Storage Spaces, SMB Scale-Out and Shared VHDX (Virtual). But what about Fibre Channel or iSCSI environments? It might not be the total killer there that it is in a SOFS scenario, but still. As it turns out, Set-SmbBandwidthLimit also works in those scenarios. I was put on the wrong track by thinking it was only for SOFS scenarios, but my fellow MVP Carsten Rachfahl kindly reminded me of my own mantra “Trust but verify” and as a result I can confirm it even works to cap Live Migration traffic over SMB that leverages RDMA (RoCE). So don’t let the PowerShell module name (SMBShare) fool you, it’s about all SMB traffic within the categories.

So without a limit, LM can use all the bandwidth (2*10Gbps):

image

With Set-SmbBandwidthLimit -Category LiveMigration -BytesPerSecond 1250MB you can see we max out at roughly 10Gbps (2*5Gbps).
image

Some Remarks

I’d love to see a minimum bandwidth implementation of this (that could include a safety buffer for spikes in CSV traffic with SOFS). The hard cap might lead to some wasted bandwidth. In other scenarios you could still get into trouble. What if you have 2*10Gbps available and capped Live Migration traffic at 16Gbps, but one of those NICs dies on you? With one NIC gone, the cap exceeds the remaining 10Gbps and you’re potentially in trouble until the NIC has been replaced. OK, this is not a daily occurrence & depending on your environment & setup this is more or less of a potential issue.
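One way to soften the dead NIC scenario would be to derive the cap from the links that are actually up, instead of hard-coding it. The sketch below only shows the arithmetic; Get-LiveMigrationCap is a hypothetical helper, the 80% share is an arbitrary example, and feeding it real link speeds (for instance from Get-NetAdapter) is left as an assumption:

```powershell
# Hypothetical helper: compute a Live Migration cap in bytes per second as
# a percentage of the combined speed of the links that are currently up.
function Get-LiveMigrationCap {
    param(
        # Speeds in bits per second, one entry per link that is up.
        [Parameter(Mandatory = $true)][double[]] $LinkSpeedsBps,
        [ValidateRange(1, 100)][int] $Percent = 80
    )
    $totalBits = ($LinkSpeedsBps | Measure-Object -Sum).Sum
    # Take the percentage, then convert bits to bytes.
    [uint64][math]::Floor($totalBits * $Percent / 100 / 8)
}

Get-LiveMigrationCap -LinkSpeedsBps 10e9, 10e9   # 2000000000 (16Gbps)
Get-LiveMigrationCap -LinkSpeedsBps 10e9         # 1000000000 (8Gbps)
```

With both NICs up this gives the 16Gbps cap from the example above; with one NIC gone it drops to 8Gbps, leaving headroom for CSV traffic on the surviving link.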

PowerShell: Monitoring DrainStatus of a Hyper-V Host & The Time Limited Value of Information In Beta & RC Era Blogs

I was writing some small PowerShell scripts to pause and resume Hyper-V cluster hosts and I wanted to monitor the progress of draining the virtual machines off the node when pausing it. I found this nice blog post about Draining Nodes for Planned Maintenance with Windows Server 2012, discussing this subject and providing us with the properties to do just that.

It seems we have two common properties at our disposal: NodeDrainStatus and NodeDrainTarget.

image

So I set to work, but I just didn’t manage to read those properties. It was like they didn’t exist. So I pinged Jeff Wouters, who happens to use PowerShell for just about anything, and asked him if it was me being stupid and missing the obvious. Well, it turned out I was missing the obvious for sure, as those properties do not exist. Jeff told me to double check using:

Get-ClusterNode MyNode -cluster MyCluster | Select-Object -Property *

Guess what, it’s not NodeDrainStatus and NodeDrainTarget but DrainStatus and DrainTarget.

image

What put me off here was the following example in the same blog post:

Get-ClusterResourceType "Virtual Machine" | Get-ClusterParameter NodeDrainMoveTypeThreshold

That should have been a dead giveaway, as we’ve been using MoveTypeThreshold a lot in recent months and there is no NodeDrain in that name either. But it just didn’t register. By the way, you don’t need to create the property either, it already exists. I guess this code was valid with some version (Beta?) but not anymore. You can just get and set the property like this:

Get-ClusterResourceType "Virtual Machine" -Cluster MyCluster | Get-ClusterParameter MoveTypeThreshold

Get-ClusterResourceType "Virtual Machine" -Cluster MyCluster | Set-ClusterParameter MoveTypeThreshold 2000

So, lessons learned: trust but verify. Don’t forget that a lot of things in IT have a time-limited value. Make sure to look at the date of what you’re reading and at which pre-RTM version of the product the information applies to.

To conclude here’s the PowerShell snippet I used to monitor the draining process.

Which outputs

image
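A minimal sketch of such a monitoring loop, built around the DrainStatus property discussed above. The function name Wait-NodeDrain, the polling interval and the timeout are assumptions for illustration, and the status is read through a script block so the loop itself stays independent of the cluster cmdlets:

```powershell
# Hypothetical helper: poll a drain status until it reaches an end state.
function Wait-NodeDrain {
    param(
        # Script block that returns the node's current DrainStatus.
        [Parameter(Mandatory = $true)][scriptblock] $GetStatus,
        [int] $PollMilliseconds = 2000,
        [int] $TimeoutSeconds = 1800
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    do {
        $status = [string](& $GetStatus)
        Write-Host ("{0:HH:mm:ss}  DrainStatus: {1}" -f (Get-Date), $status)
        # Completed and Failed are the two end states of a drain.
        if ($status -eq 'Completed' -or $status -eq 'Failed') { return $status }
        Start-Sleep -Milliseconds $PollMilliseconds
    } while ((Get-Date) -lt $deadline)
    'TimedOut'
}

# Usage against a real cluster (MyNode and MyCluster are placeholders):
# Suspend-ClusterNode -Name MyNode -Cluster MyCluster -Drain
# Wait-NodeDrain -GetStatus { (Get-ClusterNode -Name MyNode -Cluster MyCluster).DrainStatus }
```

The timeout is there so a stuck drain doesn't leave the script polling forever.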

Monitoring Startup, Shutdown and Restart of a Virtual Machine With PowerShell 3.0

While scripting some maintenance PowerShell scripts for Hyper-V guests I felt the need for a more accurate way to monitor the startup of a virtual machine. Pings or telnet to a known open port don’t do the job accurately enough, as I want to know when CTRL+ALT+DEL appears on the screen. So I pinged Jeff Wouters, who told me I could monitor Get-VM -Name DC01 | Get-VMIntegrationService to detect when PrimaryStatusDescription goes to “OK”.

Now when you look at the Integration services there are 5 of them.

image

Which one is the best to use for our purpose? Well, I tested them out and after some experimenting with the various services I concluded that the PrimaryStatusDescription of the Key-Value Pair Exchange service works best for this purpose. All others become available a bit too soon in the process of starting a VM, which seems logical.

Monitor a starting virtual machine

So how to use this in a script? Well, here’s a snippet to monitor the boot process of a guest.
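A minimal sketch of such a loop, assuming the Hyper-V module is loaded. The function name Wait-VMStartup is hypothetical, and the polling interval and timeout are illustrative defaults:

```powershell
# Hypothetical helper: wait until the Key-Value Pair Exchange integration
# service of a VM reports "OK", showing a cycling progress bar meanwhile.
function Wait-VMStartup {
    param(
        [Parameter(Mandatory = $true)][string] $VMName,
        [int] $PollMilliseconds = 1000,
        [int] $TimeoutSeconds = 600
    )
    $activity = "Starting $VMName"
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    $tick = 0
    while ((Get-Date) -lt $deadline) {
        $kvp = Get-VM -Name $VMName | Get-VMIntegrationService |
            Where-Object { $_.Name -eq 'Key-Value Pair Exchange' }
        if ($kvp.PrimaryStatusDescription -eq 'OK') {
            Write-Progress -Activity $activity -Completed
            return $true
        }
        # No real percentage is known, so cycle 10..100, 0, 10, ... to use
        # the progress bar as an activity indicator.
        $tick = ($tick + 10) % 110
        Write-Progress -Activity $activity -Status 'Waiting for Key-Value Pair Exchange' -PercentComplete $tick
        Start-Sleep -Milliseconds $PollMilliseconds
    }
    Write-Progress -Activity $activity -Completed
    $false   # timed out before the service reported OK
}

# Usage (DC01 is a placeholder VM name):
# Start-VM -Name DC01
# Wait-VMStartup -VMName DC01
```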

 

Monitor a stopping virtual machine

Likewise, sometimes we want to monitor a VM shutting down, which is the same code as above but with reversed logic.

Monitor a restarting virtual machine

When in a PowerShell script you want to monitor progress of a virtual machine restarting you can combine both. You monitor shutdown and you monitor startup.
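As a rough sketch of that combination, again assuming the Hyper-V module: one polling function takes the wanted state as a parameter, so the shutdown phase is simply the startup check reversed. The function names Test-VMKvpReady, Wait-VMKvpState and Wait-VMRestart are hypothetical, and the timeouts are illustrative:

```powershell
# Hypothetical helper: $true when Key-Value Pair Exchange reports "OK".
function Test-VMKvpReady {
    param([Parameter(Mandatory = $true)][string] $VMName)
    $kvp = Get-VM -Name $VMName | Get-VMIntegrationService |
        Where-Object { $_.Name -eq 'Key-Value Pair Exchange' }
    $kvp.PrimaryStatusDescription -eq 'OK'
}

# Hypothetical helper: poll until readiness equals $Ready or time runs out.
function Wait-VMKvpState {
    param(
        [Parameter(Mandatory = $true)][string] $VMName,
        [Parameter(Mandatory = $true)][bool] $Ready,
        [string] $Activity = 'Waiting for VM',
        [int] $PollMilliseconds = 1000,
        [int] $TimeoutSeconds = 600
    )
    $deadline = (Get-Date).AddSeconds($TimeoutSeconds)
    $tick = 0
    while ((Get-Date) -lt $deadline) {
        if ((Test-VMKvpReady -VMName $VMName) -eq $Ready) {
            Write-Progress -Activity $Activity -Completed
            return $true
        }
        # Cycle the bar; there is no real percentage to report.
        $tick = ($tick + 10) % 110
        Write-Progress -Activity $Activity -PercentComplete $tick
        Start-Sleep -Milliseconds $PollMilliseconds
    }
    Write-Progress -Activity $Activity -Completed
    $false
}

# Hypothetical helper: watch the shutdown phase first, then the startup.
function Wait-VMRestart {
    param(
        [Parameter(Mandatory = $true)][string] $VMName,
        [int] $PollMilliseconds = 1000
    )
    $down = Wait-VMKvpState -VMName $VMName -Ready $false -Activity "Shutting down $VMName" -PollMilliseconds $PollMilliseconds
    if (-not $down) { return $false }
    Wait-VMKvpState -VMName $VMName -Ready $true -Activity "Starting $VMName" -PollMilliseconds $PollMilliseconds
}

# Usage: trigger the restart of DC01 (a placeholder name) however you
# normally do, then call:
# Wait-VMRestart -VMName DC01
```

One caveat with this approach: if the guest goes down and comes back faster than the polling interval, the shutdown phase can be missed, which is why both phases carry a timeout.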

Note that in all the above snippets I’ve thrown in some logic to use the progress bar as an activity bar, as I know of no way to calculate a real % done in a startup, shutdown or restart process. It looks something like this in ISE

image

or like this in a PowerShell prompt

image