Musings On Switch Embedded Teaming, SMB Direct and QoS in Windows Server 2016 Hyper-V

When you have been reading up on what’s new in Windows Server 2016 Hyper-V networking you probably read about Switch Embedded Teaming (SET). Basically this takes the concept of teaming and has this done by the vSwitch. Which means you don’t have to team at the host level. The big benefit that this opens up is the RDMA can be leveraged on vNICs. With host based teaming the RDMA capabilities of your NICs are no longer exposed, i.e. you can’t leverage RDMA. Now this has become possible and that’s pretty big.

clip_image001

With the rise of 10, 25, 40, 50 and 100 Gbps NICs and switches the lure to go fully converged becomes even louder. Given the fact that we now don’t lose RDMA capabilities to the vNICs exposed to the host that call sounds only louder to many.  But wait, there’s even more to lure us to a fully converged solution, the fact that we now do no longer lose RSS on those vNICs! All good news.

I have written an entire whitepaper on convergence and it benefits, drawback, risks & rewards. I will not repeat all that here. One point I need to make that lossless traffic and QoS are paramount to the success of fully converged networking. After all we don’t want lossy storage traffic and we need to assure adequate bandwidth for all our types of traffic. For now, in Technical Preview 3 we have support for Software Defined Networking (SDN) QoS.

What does that mean in regards to what we already use today? There is no support for native QoS  and vSwitch QoS in Windows Server 2016 TPv3. There is however the  mention of DCB (PFC/ETS ), which is hardware QoS in the TechNet docs on Remote Direct Memory Access (RDMA) and Switch Embedded Teaming (SET). Cool!

But wait a minute. When we look at all kinds of traffic in a converged Hyper-V environment we see CSV (storage traffic), live migration (all variations), backups over SMB3 all potentially leveraging SMB Direct. Due to the features and capabilities in SMB3 I like that. Don’t get me wrong about that. But it also worries me a bit when it comes to handling QoS on the hardware side of things.

In DCB Priority Flow Control (PFC) is the lossless part, Enhanced Transmission Selection (ETS) is the minimum bandwidth QoS part. But how do we leverage ETS when all types of traffic use SMB Direct. On the host it all gets tagged with the same priority. ETS works by tagging different priorities to different workloads and assuring minimal bandwidths out of a total of 100% without reserving it for a workload if it doesn’t need it. Here’s a blog post on ETS with a demo video DCB ETS Demo with SMB Direct over RoCE (RDMA .

Does this mean a SDN QoS only approach to deal with the various type of SMB Direct traffic or do they have some aces up their sleeves?

This isn’t a new “concern” I have but with SET and the sustained push for convergence it does has the potential to become an issue. We already have the SMB bandwidth limitation feature for live migration. That what is used to prevent LM starving CSV traffic when needed. See Preventing Live Migration Over SMB Starving CSV Traffic in Windows Server 2012 R2 with Set-SmbBandwidthLimit.

Now in real life I have rarely, if ever, seen a hard need for this. But it’s there to make sure you have something when needed. It hasn’t caused me issues yet, but I’m a performance & scale first, in “a non-economies of scale” world compared to hosters. As such convergence is a tool I use with moderation. My testing when traffic competes without ETS is that they all get part of the cake but not super predictable/ consistent. SMB bandwidth limitation is a bit of a “bolted on” solution => you can see the perf counters push down the bandwidth in an epic struggle to contain it, but as said it’s a struggle, not a nice flat line.

Also Set-SmbBandwidthLimit is not a percentage, but hard max bandwidth limit, so when you lose a SET member the math is off and you could be in trouble fast. Perhaps it’s these categories that could or will be used but it doesn’t seem like the most elegant solution/approach. That with ever more traffic leveraging SMB Direct make me ever more curious. Some switches offer up to 4 lossless queues now so perhaps that’s the way to go leveraging more priorities … Interesting stuff! My preferred and easiest QoS tool, get even bigger pipes, is an approach convergence and evolution of network needs keeps pushing over. Anyway, I’ll be very interested to see how this is dealt with. For now I’ll conclude my musings On Switch Embedded Teaming, SMB Direct and QoS in Windows Server 2016 Hyper-V

E2EVC 2015 Berlin SMB Direct Slide Deck

I attended and presented at E2EVC 2015 in Berlin from June 12th to June 14th. The networking was a blast. No “marchitecure” bull shit or vendor fairy tales what so ever and lots of very open discussions on the realities we’re seeing and facing in virtualization and cloud. Most account managers and esoteric presales would die a painful (but fast) death in this environment.

image

One session was with my Hyper-V Amigo buddy Carsten Rachfahl and was pure demo extravaganza, so no slides. My own session was “SMB Direct – The Secret Decoder Ring” and was an attempt to position this technology what by looking at the why and where followed by the how by who and when.

image

I hope a lot of people had at least a better understanding of SMB Direct, RDMA and DCB. The second aim was to take away the fear many people have of this tech by showcasing it in short demos. Time constraints where a challenge so it was not a 200 level session.

Please download the presentation here if interested.

Enjoy. If you have any concerns or questions, ask, and I’ll try to answer.

Presenting at ITProceed 2015 & E2EVC 2015 Berlin on SMB Direct

You cannot afford to ignore SMB3 and it’s capabilities related to storage traffic such as multichannel, RDMA and encryption. SMB Direct over RoCE seems to have a bright future as it continuous to evolve and improve in Windows Server 2106. The need for DCB (PFC and optionally ETS) intimidates some people. But it should not.

I’ll be putting SMB Direct & RoCE into perspective at ITPROCEED | Welcome to THE IT Pro Event of the year! and #E2EVC E2EVC 2015 Berlin, June 12-14, 2015 Berlin, Germany, sharing experiences, tips and demos!  Come see PFC & ETS in action and learn what it can do for you. Storage vendors should most certainly consider supporting all features of SMB 3 natively as a competitive advantage. So Join me for the talk “SMB Direct – The Secret Decoder Ring”.

All these talks are at extremely affordable community driven events to make sure you can attend. The sessions are given by speakers who do this for the community (speakers and attendees do this in their own time and pay for their our own travel/expenses) and who work with these technology in real life and provide feedback to vendors on the issues or opportunities we see. This makes the sessions very interesting and anything but marketing, slide ware or sales pitches. See you there!

Hyper-V Amigos Showcast Episode 9 – RDMA, RoCE, PFC and ETS

Just before Carsten Rachfahl and I left for Microsoft Ignite we recorded episode 9 of the Hyper-V Amigo Showcast. In this episode we’ll discuss SMB Direct over RoCE (RDMA over Converged Ethernet) which requires lossless Ethernet.

image

Data Center Bridging is the way to achieve this. It has four standards, PFC (802.1Qbb), ETS (802.1Qaz), CN (802.1Qau) and DCBx, but only two are important to us now.Priority Flow Control (PFC) is mandatory

image

and Enhanced Transmission Selection is optional (but very handy depending on your environment).

image

If you need more information on this start with these blogs on the subject. But without further delay here’s Hyper-V Amigos Showcast Episode 9 – RDMA, RoCE, PFC and ETS