Live Migration Can Benefit From Jumbo Frames

Does live migration benefit from Jumbo frames? This question always comes back, so I’ll just blog it here again even though I have mentioned it as part of other blog posts. Yes it does! How do I know? Because I’ve tested and used it with Windows Server 2008 R2, 2012 & 2012 R2. Why? Because I have a couple of mantras:

  • Assumptions are the mother of all fuckups
  • Assume makes an ASS out of U and ME
  • Trust but verify

What can I say? I have been doing 10Gbps for Live Migration with Hyper-V for a while now. And let me tell you my experience with an otherwise completely optimized server (mainly BIOS performance settings): Jumbo frames will help you with up to 20% more bandwidth use.

And thanks to Windows Server 2012 R2 supporting SMB for live migration we can very nicely visualize this with 2*10Gbps NICs, not teamed, used by live migration leveraging SMB Multichannel. On one of the 10Gbps NICs we enable Jumbo Frames, on the other one we do not. We then live migrate a large memory VM back and forth. Now you tell me which one is which.
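As a rough back-of-the-envelope sketch (not a measurement), the framing overhead alone can be worked out as below. It assumes full-sized frames, TCP over IPv4 without header options, and standard Ethernet per-frame overhead. Those are illustrative assumptions, not details of the test setup.

```python
# Per-frame overhead on the wire for standard Ethernet, in bytes:
# preamble + SFD (8) + Ethernet header (14) + FCS (4) + inter-frame gap (12).
ETHERNET_OVERHEAD = 8 + 14 + 4 + 12
# IPv4 (20) + TCP (20) headers without options: an assumption for this sketch.
IP_TCP_HEADERS = 20 + 20


def payload_efficiency(mtu: int) -> float:
    """Fraction of the bytes on the wire that is TCP payload, for full-sized frames."""
    payload = mtu - IP_TCP_HEADERS
    on_wire = mtu + ETHERNET_OVERHEAD
    return payload / on_wire


standard = payload_efficiency(1500)
jumbo = payload_efficiency(9000)
print(f"MTU 1500: {standard:.1%} payload")
print(f"MTU 9000: {jumbo:.1%} payload")
print(f"Gain from framing alone: {jumbo / standard - 1:.1%}")
```

Framing accounts for only a few percent here, so whatever else shows up in practice most likely comes from handling far fewer packets per second.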

image

Now enable Jumbo frames on both 10Gbps NICs and again we live migrate the large memory VM back and forth. More bandwidth used, faster live migration.

image

I can’t make it any clearer. No, jumbo frames will not kill your performance unless you have them messed up end to end. Don’t worry if you have a cheaper switch where you can only enable them switch wide instead of per port. The switch is a pass through. So unless you set frame sizes on the sending/receiving hosts that the switch in between can’t handle, it will work, even for traffic without jumbo frames, and without heaven falling down on your head :-). Configure it correctly, test it, and you’ll see.
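The "end to end" part can be sketched in a few lines. The usable MTU of a path is the smallest MTU of any hop, and the largest ping payload that passes unfragmented is that MTU minus the IPv4 and ICMP headers. The hop values below are hypothetical, purely for illustration.

```python
IPV4_HEADER = 20  # bytes, no options
ICMP_HEADER = 8   # bytes


def path_mtu(hop_mtus):
    """The usable MTU end to end is the smallest MTU of any hop."""
    return min(hop_mtus)


def max_ping_payload(mtu):
    """Largest ICMP echo payload that fits in one unfragmented IPv4 packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


# Hypothetical paths: sending host, switch, receiving host.
good_path = [9000, 9216, 9000]
bad_path = [9000, 1500, 9000]  # switch left at a 1500 byte MTU

for name, hops in (("good", good_path), ("bad", bad_path)):
    mtu = path_mtu(hops)
    print(f"{name}: path MTU {mtu}, max unfragmented ping payload {max_ping_payload(mtu)}")
```

On Windows this is typically verified with something like `ping -f -l 8972 <target>`, where `-f` sets "don't fragment" and `-l` sets the payload size.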

Enabling Jumbo Frames Inside Virtual Machines Enhances Throughput & Reduces CPU Load

Let’s play a bit with a Windows Server 2012 R2 Hyper-V cluster with 2*10Gbps Intel X520 teamed switch independent and in dynamic mode, which is optimal for DVMQ. On these NICs we enabled Jumbo frames (and on the switches of course). That team is used to create a virtual switch for consumption by the virtual machines. The switches used are 2*DELL PowerConnect 8132F. So we have full fault tolerance. But the important thing to note is that this is commodity, quality hardware that we can leverage for great results.

Now we’ll compare 2 scenarios. In both tests a sending VM will try to saturate a receiving VM’s network bandwidth. Due to how this NIC teaming setup works, that’s about 10Gbps in a two member 2*10Gbps team. We work around this by leaving both VMs on the same host, so the traffic doesn’t need to pass across the wire. The VMs have VMQ & vRSS enabled with the host team members having VMQ enabled.

Jumbo frames disabled inside the VM (Both VMs on same host)

Without Jumbo Frames enabled in the guest VM, and all other things being equal, the very best we can achieve non-sustained is 21Gbps (average +/- 17Gbps) receiving traffic in a VM. Not bad, not bad at all.

image

In the picture you can see the host with DVMQ doing its job at the left while vRSS is at work in the VM. Pretty clear.

Jumbo frames enabled inside the VM

Now, let’s enable Jumbo frames in the VM. Fire up PowerShell or use the GUI.

image

Don’t forget to do this on both the sending and the receiving VM :-). Here we get +/- 30Gbps receiving traffic inside of the VM. A nice improvement, isn’t it? Not just that, but we consume less CPU resources as well! Sweet :-)

image

Useful Power at your fingertips or just showing off?

vRSS & DVMQ are among the many scalability & performance improvements in Windows Server 2012 R2. And yes, Jumbo Frames inside the VM do make a difference, but in a 10Gbps environment it’s not that “in your face” visible. The 10Gbps limit of a single NIC team member becomes your bottleneck. But it DOES help to reduce CPU cycles in that case. Just look at the two screenshots below.

No Jumbo frames in VM (sending & receiving VMs on different hosts)

image

We’re consuming 7% CPU resources on the host and 15% in the VM.

Jumbo Frames in VM (sending & receiving VMs on different hosts)

image

We’ve dropped down to 3% on the host and to 9% inside the VM. Every bit helps, I say!
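A plausible reason for the lower CPU load is simply that there are far fewer frames to process at the same line rate. The sketch below estimates frames per second on a 10Gbps link for both frame sizes, assuming full-sized frames and standard Ethernet per-frame overhead, and then expresses the CPU figures quoted above as relative drops. It is an illustration, not a model of how Hyper-V spends its cycles.

```python
# Preamble/SFD 8 + Ethernet header 14 + FCS 4 + inter-frame gap 12, in bytes.
ETHERNET_OVERHEAD = 38


def frames_per_second(link_gbps, mtu):
    """Full-sized frames per second needed to fill the link."""
    bits_per_frame = (mtu + ETHERNET_OVERHEAD) * 8
    return link_gbps * 1e9 / bits_per_frame


def relative_drop(before, after):
    """Relative reduction going from `before` to `after`."""
    return (before - after) / before


std_fps = frames_per_second(10, 1500)
jumbo_fps = frames_per_second(10, 9000)
print(f"MTU 1500: {std_fps:,.0f} frames/s")
print(f"MTU 9000: {jumbo_fps:,.0f} frames/s ({std_fps / jumbo_fps:.1f}x fewer)")

# CPU percentages taken from the two screenshots described in the text.
print(f"Host CPU: 7% -> 3% is a {relative_drop(7, 3):.0%} drop")
print(f"VM CPU: 15% -> 9% is a {relative_drop(15, 9):.0%} drop")
```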

How far can we push this?

Again, if you take the NIC team bottleneck out of the way you can see some serious differences. Take a look at the screenshot below: that’s 36.2Gbps inside of a VM, courtesy of vRSS, during some other experiments. Tallyho!

image

So let’s face it, I guess we’ll need some faster memory (DDR4) and multiple 40Gbps/100Gbps cards to see what the limits of Windows Server 2012 R2 are, or to find out if we have reached them. Right now the operating system is giving hardware people a run for their money.

Also note that if you have a number of VMs doing a lot of network IO you’ll be using quite a number of CPU cycles. While vRSS & DVMQ make this scale, you might want to consider leveraging SMB Direct for the various flavors of live migration, as this will definitely help you out on that front because the NIC will do the heavy lifting.

Reality Check

But perhaps we also need a little reality check. While 100Gbps and DDR4 are very nice, you might not need them for your current needs. When the environment is built right you’ll find that your apps are usually your limiting factor before the hardware, let alone Windows Server 2012 R2. So why is knowing this important? Well, I verify Microsoft’s claims so I can talk from experience and not just from what I read in a Microsoft presentation. Secondly, you can trust that your investment in Windows Server 2012 R2 is going to carry you long and far. It’s future proofed and that’s good for you, both for when your needs grow exponentially and for the longevity of your environment. Third, we can leverage this to virtualize high throughput environments and get the best possible results and ROI.

Also, please, please test & find out, verify what settings suit your environment best and don’t just blindly enable stuff. Good luck!

E2EVC Rome 2013–Attending & Speaking

I’m attending and speaking at the E2EVC (Experts2Experts Virtualization Conference) in Rome, November 1-3.

I’ll be doing a talk on networking features in Windows Server 2012 R2 & Hyper-V. The good, the bad & the ugly. And no, no SDN talk. It’s about the other stuff.

“Session Topic: Networking Options For Virtualization in Windows Server 2012 R2
Short description: In Windows 2012 R2 there are many networking options available to optimize for both speed & redundancy. Let’s talk about some of them and see where and when they can help. The biggest problem is thinking you need all of them or wanting all of them. Others are knowledge & complexity. Join us for a chalk & talk discussing all this. Some fellow MVPs will be there and we all have different experiences in different environments.
Presenter: Didier Van Hoye, Microsoft MVP”

It will be an interactive chalk & talk based on testing & experiences with these features & demos if the internet holds up.

image

It’s a superb non-commercial virtualization community event. It brings the best real life virtualization experts together to exchange knowledge and to establish new connections. Lots of presentations, Master Classes and discussions are going on between the attendees, vendor product teams & independent experts.

Now this is a conference where marketing & marchitecture are shot down fast & hard. But always in a friendly way. These people are all working in the sector and they have to keep it real. They have a business & a livelihood that depends on them delivering results, not fairy tales & management pleasing BS. These people tell you what you need to know, not what you want to hear.

Look at the turnout for this event, this Twitter list reads as the “Who’s Who In Virtualization”: @pcrampton, @joe_elway, @shawnbass, @ThomasMaurer, @Andrea_Mauro, @hansdeleenheer, @drtritsch, @WorkingHardInIT, @HelgeKlein, @JimMoyle, @neilspellings, @andyjmorgan, @KBaggerman, @CarlWebster, @bsonposh, @gilwood_cs, @E2EVC, @KristianNese, @_POPPELGAARD, @barryschiffer, @RemkoWeijnen, @IngmarVerheij, @WilcovanBragt, @david_obrien, @Microspecialist, @virtualfat, @LFoverskov, @virtuEs_IT, @stibakke, @granttiller, @Easi123, @ChrisJMarks, @arbeijer, @znackattack, @ShaunRitchie_UK, @Rob_Aarts, @StefanKoell, @EHouben, @crachfahl, @JeroenTielen, @plompr, @Gkunst, @drmiru, @espenbe, @marcdrinkwater, @DocsMortar, @AlBayliss, @wedelit, @pzykomAtle, @LoDani, @fborozan, @JeffWouters, @mrpickford, @smspasscode, @schose, @rvanderkruk, @TimmBrochhaus, @HansMinnee, @JZanten, @Sargola, @JaspervanWesten, @airdeca_nl, @danielBuonocore, @PeppelT, @TondeVreede, @pcortis, @ConorScolard, @CarstenDreyer, @arnaud_pain, @RoyTextor, @saschazimmer, @abstrask, @loopern, @PulseITch, @joarleithe, @clarecoops9, @rfolmer, @wimoortgiesen …

Here’s the conference agenda: http://www.e2evc.com/home/Agenda.aspx

See you there :-) I’m looking forward to seeing my community buddies!

Windows Server 2012 R2 Virtual RSS (vRSS) In Action

The Need For Speed

My clients or employers are normally into lots of data, large files & number crunching. That means they like CPU power, memory, bandwidth & IOPS. That doesn’t mean they want to spend fortunes, but it does mean I’ve got a standing order to get every last ounce of optimization and efficiency out of commodity hardware.

Today that means we prefer to work with DELL generation 12 servers like the R620 & R720, which are the best possible value for money you can get. They deliver all features in box at no extra cost / licensing crap and they can be fine tuned and optimized for top performance. Windows Server 2012 R2 can handle loads higher than the hardware today can throw at it, so you want all you can get without breaking the bank. Add Intel X520/540, Mellanox (RoCE) or Chelsio (iWARP) 10Gbps cards and you’re ready to rumble with some nice PowerConnect 81XX Series or even the Force10 S4810 10Gbps switches.

CPU Power is not an issue, just bring money. You have 8-12 core CPUs. You can scale up to 4 socket systems & scale out as well. Licensing is probably your biggest concern here due to cost.

Memory? DDR3 is readily available and cheap compared to other costs. DDR4 is on the way. For max performance you possibly won’t do Dynamic Memory, to avoid a NUMA hit, but otherwise it helps with efficiencies.

Network. Well, you’ve seen/heard/read me on topics like 10Gbps, RDMA/SMB Direct and (Dynamic) VMQ, but today we’ll look at virtual RSS. In Windows Server 2008 R2, VMQ helped us beyond the scalability issues of a single core on the host having to handle all interrupts for the traffic going to the virtual machines. In Windows Server 2012 that got optimized with Dynamic VMQ. For workloads that need the absolute best performance & lowest latency we got SR-IOV support. That works fine, but it has some potentially important drawbacks on both the manageability front (NIC teaming needs to be done inside the guest) & the security front (it bypasses the virtual switch, so no ACLs or NVGRE).

Today with Windows Server 2012 R2 we have a new optimization for network traffic in a virtual environment. It’s “Virtual RSS” or vRSS. I tell you it’s sweet and you’d do well to investigate this and enable it in your environment, especially if, like us, you move lots of data around & virtualization is the default. We don’t do physical unless for very strict reasons, i.e. it cannot be virtualized. Otherwise … no excuses, the economics and benefits are just too good not to do so. It’s not as super low latency as SR-IOV, but depending on your needs you might never notice and … it plays nicely with all other network features in Hyper-V.

Show me vRSS already!

Inside the guest we see vRSS in action as multiple cores are put into action to handle the interrupts of incoming traffic. This takes away the single vCPU bottleneck.

image

And the result is a sweet 17.2Gbps from VM1 to VM2. For the sharp eyed ones amongst you, they are on the same host so yes in this case we get top bandwidth as the traffic doesn’t have to go across the wire over the 2*10Gbps NIC team but stays within the same host or better, vSwitch.

image

The GUI is very friendly and suggests I can go to 100Gbps networking hardware :-) Well, not yet I’m afraid, but I’ve taken note ;-).

Here’s what it looks like when the sending and receiving VM are on different hosts and the vSwitch is connected to a 2*10Gbps team, switch independent, dynamic mode. You “only” get 10Gbps as the team can send on all members but receive on only one. But that’s still fine.

image

So what does this look like on the host?

image

Here you see Dynamic VMQ in action. To prevent one core from becoming overloaded the host puts more of them into service. This depends on the load and it’s dynamic. Hence Dynamic VMQ. While VMQ was great, it was still a limiting factor as you used to be tied to one core / VM.

This means that our network traffic processing is no longer limited by the OS or better the use of a single core, both inside the VM and on the host. Our bottleneck now is the maximum throughput the NICs can deliver. In our 2 member NIC team that’s 20Gbps max under the right circumstances. Yup. Line speed. Need more? Throw 40Gbps NICs at the problem or even 100Gbps pretty soon. Windows Server 2012 R2 is ready for this.
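The general idea behind RSS-style spreading can be sketched as follows: hash each flow’s addresses and ports, use the hash to index an indirection table, and let the table entry pick the core. This is a simplified illustration only. `zlib.crc32` is a stand-in for the hash the NIC and OS actually use, and the table size and flow values are hypothetical.

```python
import zlib


def pick_core(flow, cores, table_size=128):
    """Map a flow (src_ip, src_port, dst_ip, dst_port) to a core index."""
    key = "|".join(str(part) for part in flow).encode()
    flow_hash = zlib.crc32(key)  # stand-in hash, NOT the real RSS hash
    # Indirection table: each slot points at one of the available cores.
    indirection_table = [slot % cores for slot in range(table_size)]
    return indirection_table[flow_hash % table_size]


# Hypothetical flows that differ only in source port.
flows = [("10.0.0.1", 50000 + n, "10.0.0.2", 5001) for n in range(8)]
for flow in flows:
    print(f"source port {flow[1]} -> core {pick_core(flow, cores=4)}")
```

A given flow always lands on the same core, which keeps its packets in order, while different flows can land on different cores. That is also why a single stream does not automatically spread across every core.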

Sharing the wealth

Now to make sure a bunch of these VMs on your cluster don’t starve the rest of the VM population with their greed for and lust after bandwidth you have the option of using Bandwidth management on the VM NIC.

image

This is one of the cases where this option can be very useful.

VM mobility without boundaries

Also consider this capability and these types of heavy workloads combined while leveraging SMB Direct. This offloads the processing of CSV traffic, Live Migration, Shared Nothing Live Migration and, under certain conditions, Storage Live Migration to the NIC by leveraging RDMA. In other words it doesn’t tax your host CPUs. This means you can have these kinds of network traffic loads going on and still live migrate at will. Scalable VM mobility anyone? You’ll understand what tremendous network loads the combination of Windows Server 2012 R2 Hyper-V host & guests can handle.

It’s virtualization, not magic

OK, time for a little reality check, just in case it’s needed. Virtualization is technology, not magic. For all of you “thinking” you can push 20Gbps into 20 VMs simultaneously on a single host over just a single team of 2*10Gbps … ah well, get real :-) OK?
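The arithmetic behind that reality check is short. The sketch below assumes an ideal, perfectly even split of the team’s aggregate bandwidth with no protocol overhead, which is more generous than real life, and it ignores the receive-on-one-member behavior of the team mentioned earlier.

```python
def team_capacity_gbps(members, member_gbps):
    """Aggregate bandwidth of the team under ideal conditions."""
    return members * member_gbps


def fair_share_gbps(members, member_gbps, vm_count):
    """Best-case bandwidth per VM if all VMs are busy at the same time."""
    return team_capacity_gbps(members, member_gbps) / vm_count


capacity = team_capacity_gbps(members=2, member_gbps=10)
wanted = 20 * 20  # 20 VMs each wanting 20Gbps
share = fair_share_gbps(members=2, member_gbps=10, vm_count=20)

print(f"Team capacity: {capacity}Gbps, demand: {wanted}Gbps")
print(f"Best-case share per VM: {share}Gbps")
```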

Teaser

I can talk about the benefits of vRSS all day and show you some more screenshots, but I’m working on “vRSS, The Movie”. Perhaps even a sequel already.