Live Migration over SMB Direct leaves more CPU cycles for Virtual RSS (vRSS) in Windows Server 2012 R2

I recently (January 22nd 2014) gave a WebCast presentation for the Dutch Windows Management User Group (@WMUG_NL) in which I made the case for using SMB Direct with Live Migration to save CPU cycles other (VM) workloads. There are several areas where the CPU cycles are better spent but I used vRSS to show case one scenario.

We’re using a 2 node Windows Server 2012 R2 Hyper-V cluster on Dell PowerEdge R720 servers with Mellanox ConnectX-3 (CSV  &  live migration) and Intel X520-DA (Hyper-V switch), all 10Gbps.

This is what a CPU bottleneck looks like that can be solved by using vRSS in Windows Server 2012 R2.image

The host machines are Hyper Threading enabled. The virtual switch is attached to a switch independent NIC team with dynamic mode. In this setup it’s normal that the sending VM is leveraging both members while the receiving VM traffic is coming in over one member of the host team.

No let’s enable vRSS in the VM and see what this does for this picture.image

Pretty impressive isn’t it. DidierTest03 is the sending VM running on host A and DidierTest04 is the receiving VM that has vRSS enabled and is running on Host B. For vRSS you need both hosts and VMs to run Windows Server 2012 or Windows 8.1. You can see the load is spread across 7 vCPUs in the VM. DidierTest04 has 8 vCPUs. I configured vRSS in the VM to be able to use 7 vCPUs and leave vCPU 0, the default one, alone to handle those workloads.

image

Given multiple Logical CPUs & vCPUs we can get line speed with 10Gbps inside a virtual machine. This, ladies and gentlemen is a thing of beauty.

Now tell me, if you have business related needs for those CPU cycles why would you not offload the work that needs to be done for live migration to the NIC via SMB direct? This is about getting maximum VM density, performance & ROI form your infrastructure, whilst saving on servers, power and cooling. When you see the smile on your clients or bosses face, just say “you’re welcome” and smile back Open-mouthed smile.

E2EVC Rome 2013–Attending & Speaking

I’m attending and speaking at the E2EVC (Experts2Experts Virtualization Conference) in Rome, November 1-3.

I’ll be doing a talk on networking features in Windows Server 2012 R2 & Hyper-V. The good, the bad & the ugly. And no, no SDN talk. It’s about the other stuff.

“Session Topic: Networking Options For Virtualization in Windows Server 2012 R2
Short description: In Windows 2012 R2 there are many networking options available to optimize for both speed & redundancy. Let’s talk about some of them and see where and when they can help. The biggest problem is thinking you need  all of them or wanting all of them. Others are knowledge& complexity. Join us for a chalk & talk discussing all this. Some fellow MVPs will be there and we all have different experiences in different environments.
Presenter: Didier Van Hoye, Microsoft MVP”

It will be an interactive chalk & talk based on testing & experiences with these features & demos if the internet holds up.

image

It’s a suburb non-commercial, virtualization community Event. It brings the best real life  virtualization experts together to exchange knowledge and to establish new connections. Lots of presentations, Master Classes and discussions going on both the attendees, vendors product teams & independent experts.

Now this is a conference where marketing & marchitecture is shot down fast & hard. But always in a friendly way. These people are all working in the sector and they have to keep it real. They have a business & a livelihood that depends on them delivering results, not fairy tales & management pleasing BS. These people tell you what you need to know, not what you want to hear.

Look at the turnout for this event, this twitter list reads as the “Who’s Who In Virtualization”: @pcrampton, @joe_elway, @shawnbass, @ThomasMaurer, @Andrea_Mauro, @hansdeleenheer, @drtritsch, @WorkingHardInIT, @HelgeKlein, @JimMoyle, @neilspellings, @andyjmorgan, @KBaggerman, @CarlWebster, @bsonposh, @gilwood_cs, @E2EVC, @KristianNese, @_POPPELGAARD, @barryschiffer, @RemkoWeijnen, @IngmarVerheij, @WilcovanBragt, @david_obrien, @Microspecialist, @virtualfat, @LFoverskov, @virtuEs_IT, @stibakke, @granttiller, @Easi123, @ChrisJMarks, @arbeijer, @znackattack, @ShaunRitchie_UK, @Rob_Aarts, @StefanKoell, @EHouben, @crachfahl, @JeroenTielen, @plompr, @Gkunst, @drmiru, @espenbe, @marcdrinkwater, @DocsMortar, @AlBayliss, @wedelit, @pzykomAtle, @LoDani, @fborozan, @JeffWouters, @mrpickford, @smspasscode, @schose, @rvanderkruk, @TimmBrochhaus, @HansMinnee, @JZanten, @Sargola, @JaspervanWesten, @airdeca_nl, @danielBuonocore, @PeppelT, @TondeVreede, @pcortis, @ConorScolard, @CarstenDreyer, @arnaud_pain, @RoyTextor, @saschazimmer, @abstrask, @loopern, @PulseITch, @joarleithe, clarecoops9, @rfolmer, @wimoortgiesen  …

Here’s the conference agenda: http://www.e2evc.com/home/Agenda.aspx

See you there Smile and I’m looking forward to seeing my community buddies!

Windows Server 2012 R2 Virtual RSS (vRSS) In Action

The Need For Speed

My clients or employers are normally into lots of data, large files &  number crunching. That means they like CPU power, memory, bandwidth & IOPS. That doesn’t mean they want to spend fortunes but it does mean I got a standing order to get every last ounce of optimization and efficiency out of commodity hardware.

Today that means we prefer to work with DELL generation 12 servers like the R620 & R720 which are the best possible value for money you can get. They deliver all features in box at no extra cost / licensing crap and they can be fine tuned and optimized for top performance. Windows Server 2012 R2 can handle loads higher than the hardware today can throw at it so you want all you can get without breaking the bank. Add Intel X520/540, Mellanox (RoCE) or Cheslio (iWarp) 10Gbps cards and you’re ready to rumble with some nice PowerConnect 81XX Series or even the Force10 S4810 10Gbps switches.

CPU Power is not an issue, just bring money. You have 8-12 core CPUs. You can scale up to 4 socket systems & scale out as well. Licensing is probably your biggest concern here due to cost.

Memory?  DDR3 is readily available and compared to other costs cheap. DDR4 is on the way. For max performance you possibly won’t do Dynamic Memory to avoid a NUMA hit but otherwise it helps with efficiencies.

Network. We’ll you’ve seen/heard/read me on topics like 10Gbps, RDMA/SMB Direct, (Dynamic VMQ) but today we’ll look at virtual RSS. In Windows Server 2008 R2 VMQ helped us beyond the scalability issues of a single core on the host having to handle all interrupts for the traffic going the virtual machines. In Windows 2012 that got optimized with Dynamic VMQ. For workloads that need the absolute best performance & lowest latency we got SR-IOV support. That works fine but it has some potentially important draw backs on bot the manageability (NIC teaming needs to be done inside the guest) & security front (it bypasses the virtual switch, so no ACLs or NVGRE).

Today with Windows 2012 R2 we have a new optimization for network traffic in a virtual environment. It’s “Virtual RSS” or vRSS. I tell you it’s sweet and you’d do well to investigate this and enable it in your environment especially if, like we, you move lots of data around & virtualization is default. We don’t do physical unless for very strict reasons, i.e. it cannot be virtualized. Otherwise … no excuses, the economics and benefits are just to good not to do so. It’s not as supper low latency as SR-IOV but depending on your needs you might never notice and … it plays nicely with all other network features in Hyper-V.

Show me vRSS already!

Inside the guest we see vRSS in action as multiple cores are put in to action to handle the interrupts of incoming traffic. This takes away the single vCPU bottle neck.

image

And the result is a sweet 17.2Gbps from VM1 to VM2. For the sharp eyed ones amongst you, they are on the same host so yes in this case we get top bandwidth as the traffic doesn’t have to go across the wire over the 2*10Gbps NIC team but stays within the same host or better, vSwitch.

image

The GUI is very friendly and suggest I can go to 100Gbps networking hardware Smile Well, not yet I’m afraid, but I’ve taken note Winking smile.

Here’s what it looks like when the sending and receiving VM are on different hosts and the vSwitch is connected to a 2*10Gbps team, switch independent, dynamic mode. You “only” get 10Gbps as the team can send on all members but receive on only one. But that’s still fine.

image

So how this look like on the host?

image

Here you see Dynamic VMQ in action. To prevent one core of becoming overloaded the host puts more of them into service. This depends on the load and it’s dynamic. Hence Dynamic VMQ. Where VMQ was great it still a limiting factor as you used to be tied to one core / VM.

This means that our network traffic processing is no longer limited by the OS or better the use of a single core, both inside the VM and on the host. Our bottleneck now is the maximum throughput the NICs can deliver. In our 2 member NIC team that’s 20Gbps max under the right circumstances. Yup. Line speed. Need more? Throw 40Gbps NICs at the problem or even 100Gbps pretty soon. Windows Server 2012 R2 is ready for this.

Sharing the wealth

Now to make sure a bunch of these VMs on your cluster don’t starve the rest of the VM population with their greed for and lust after bandwidth you have the option of using Bandwidth management on the VM NIC.

image

This is one of the cases where this option can be very useful.

VM mobility without boundaries

Also consider this capability and this type of high workloads combined while leveraging SMB Direct. This offloads the processing of CSV traffic,  Live Migration, Shared Nothing Live Migration and under certain conditions Storage Live Migration to the NIC by leveraging RDMA.  In other words it doesn’t tax your host CPUs. This means you can have these kinds of network traffic loads going on and still live migrate at will. Scalable VM mobility anyone? You’ll understand what tremendous network loads the combination Windows Server 2012 R2 Hyper-V host & guests can handle.

It’s virtualization, not magic

OK, time for a little reality check, just in case it’s needed. Virtualization is technology, not magic. For all of you “thinking” they can push 20GBps into 20 VMs simultaneously on single host over just a single Team of 2 *10Gbps … ah well, get real Smile OK?

Teaser

I can talk about the benefits of vRSS all day and show you some more screenshots but I’m working on “vRSS, The Movie’”.  Perhaps even a sequel already.

Preventing Live Migration Over SMB Starving CSV Traffic in Windows Server 2012 R2 with Set-SmbBandwidthLimit

One of the big changes in Windows Server 2012 R2 is that all types of Live Migration can now leverage SMB 3.0 if the right conditions are met. That means that Multichannel & SMB Direct (RDMA) come in to play more often and simultaneously. Shared Nothing Live Migration & certain forms of Storage Live Migration are often a lot more planned due to their nature. So one can mitigate the risk by planning.  Good old standard Live Migration of virtual machines however is often less planned. It can be done via Cluster Aware Updating, to evacuate a host for hardware maintenance, via Dynamic optimization. This means it’s often automated as well. As we have demonstrated many times Live Migration can (easily) fill 20Gbps of bandwidth. If you are sharing 2*10Gbps NICs for multiple purposes like CSV, LM, etc. Quality of Service (QoS) comes in to play. There are many ways to achieve this but in our example here I’ll be using DCB  for SMB Direct with RoCE.

New-NetQosPolicy “CSV” –NetDirectPortMatchCondition 445 -PriorityValue8021Action 4
Enable-NetQosFlowControl –Priority 4
New-NetQoSTrafficClass "CSV" -Priority 4 -Algorithm ETS -Bandwidth 40
Enable-NetAdapterQos –InterfaceAlias SLOT41-CSV1+LM2
Enable-NetAdapterQos –InterfaceAlias SLOT42-LM1+CSV2
Set-NetQosDcbxSetting –willing $False

Now as you can see I leverage 2*10Gbps NIC, non teamed as I want RDMA. I have Failover/redundancy/bandwidth aggregation thanks to SMB 3.0. This works like a charm. But when leveraging Live Migration over SMB in Windows Server 2012 R2 we note that the LM traffic also goes over port 445 and as such is dealt with by the same QoS policy on the server & in the switches (DCB/PFC/ETS). So when both CSV & LM are going one how does one prevent LM form starving CSV traffic for example? Especially in Scale Out File Server Scenario’s this could be a real issue.

The Solution

To prevent LM traffic & CSV traffic from hogging all the SMB bandwidth ruining the SOFS party in R2 Microsoft introduced some new capabilities in Windows Server 2012 R2. In the SMBShare module you’ll find:

  • Set-SmbBandwidthLimit
  • Get-SmbBandwidthLimit
  • Remove-SmbBandwidthLimit

image

To use this you’ll need to install the Feature called SMB Bandwidth Limit via Server Manager or using PowerShell:  Add-WindowsFeature FS-SMBBW

You can limit SMB bandwidth for Virtual machine (Storage IO to a SOFS), Live Migration & Default (all the rest).  In the below example we set it to 8Gbps maximum.

Set-SmbBandwidthLimit -Category LiveMigration -BytesPerSecond 1000MB

So there you go, we can prevent Live Migration from hogging all the bandwidth. Jose Baretto mentions this capability on his recent blog post on Windows Server 2012 R2 Storage: Step-by-step with Storage Spaces, SMB Scale-Out and Shared VHDX (Virtual). But what about Fibre Channel or iSCSI environments?  It might not be the total killer there as in SOFS scenario but still. As it turns out the Set-SmbBandwidthLimit also works in those scenarios. I was put on the wrong track by thinking it was only for SOFS scenarios but my fellow MVP Carsten Rachfahl kindly reminded me of my own mantra “Trust but verify” and as a result, I can confirm it even works to cap off Live Migration traffic over SMB that leverages RDMA (RoCE). So don’t let the PowerShell module name (SMBShare) fool you, it’s about all SMB traffic within the categories.

So without limit LM can use all bandwidth (2*10Gbps)

image

With Set-SmbBandwidthLimit -Category LiveMigration -BytesPerSecond 1250MB you can see we max out at 10Gbps (2*5Gbps).
image

Some Remarks

I’d love to see a minimum bandwidth implementation of this (that could include safety buffer for spikes in CSV traffic with SOFS). The hard cap limit might lead to some wasted bandwidth. In other scenarios you could still get into trouble. What if you have 2*10Gbps available but one of those dies on you and you capped Live Migration Traffic at 16Gbps. With one NIC gone you’re potentially in trouble until the NIC has been replaced. OK, this is not a daily occurrence & depending on you environment & setup this is less or more of a potential issue.