SMB 3.0 Multichannel Auto Configuration In Action With RDMA / SMB Direct

Most of you might remember this slide by Jose Barreto on SMB Multichannel auto configuration from one of his many presentations:

  • Auto configuration looks at NIC type/speed => Same NICs are used for RDMA/Multichannel (doesn’t mix 10Gbps/1Gbps, RDMA/non-RDMA)
  • Let the algorithms work before you decide to intervene
  • Choose adapters wisely for their function

You can fine-tune things if and when needed (only do this when it's really necessary), but first let's look at this feature in action.

So let's look at this in real life. For this test we have 2 * Intel X520 DA 10Gbps ports using 10.10.180.8X/24 IP addresses and 2 * Mellanox 10Gbps RDMA adapters with 10.10.180.9X/24 IP addresses. No teaming is involved, just multiple NIC ports. Do note that these IP addresses are on a different subnet than the LAN of the servers. Basically only the servers can communicate over them: they have no gateway, no DNS servers and as such are not registered in DNS either (life is easy for simple file sharing).

[Screenshot: the four 10Gbps NIC ports (2 x Intel X520 DA, 2 x Mellanox RDMA) and their IP configuration]
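
If you want to check what Multichannel has to choose from before you start copying, the in-box SMB and NetAdapter cmdlets will show you per-interface speed and RSS/RDMA capability. A minimal sketch, nothing in it is specific to this lab:

# What the SMB client sees per interface: speed, RSS and RDMA capability.
Get-SmbClientNetworkInterface

# The same view from the file server side of the connection.
Get-SmbServerNetworkInterface

# Raw adapter view: which NICs expose RDMA and whether it is enabled.
Get-NetAdapterRdma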

Let's try to copy a 50GB fixed VHDX file from server1 to server2 using the DNS name of the target host (pixelated in the screenshots), meaning it resolves via DNS to the LAN IP address 10.10.100.92/16. In the screenshot below you see that the two RDMA capable cards are put into action. The servers are not using the 1Gbps LAN connection. Multichannel looked at the options:

  • A 1Gbps RSS capable link
  • Two 10Gbps RSS capable links
  • Two 10Gbps RDMA capable links

Multichannel concluded the RDMA cards are the best ones available, and as we have two of those it uses both. In other words, it works just as described.

[Screenshot: the file copy running over both Mellanox RDMA NICs while the 1Gbps LAN connection stays idle]
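
You can also verify the selection from PowerShell while the copy is running; the multichannel connection output shows, per selected interface pair, whether the client and server NICs are RSS and RDMA capable. A quick sketch:

# List the SMB connections to the target server.
Get-SmbConnection

# Show the interface pairs Multichannel selected for those connections.
Get-SmbMultichannelConnection | Format-List *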

Even if we try to bypass DNS and copy the files explicitly via the IP address (10.10.180.84) assigned to one of the Intel X520 DA cards, Multichannel detects that it has two better, RDMA capable cards available and, as you can see, it uses the same NICs as in the demo before. Nifty, isn't it?

[Screenshot: the copy addressed to the X520 DA IP address still running over the two RDMA NICs]

If you want to see the other NICs in action, we can disable the Mellanox card and then Multichannel will choose the two X520 DA cards. That's fine for testing, but in real life you need a better solution when you have to manually define which NICs can be used. This is done using PowerShell (take a look at Jose Barreto's blog post The basics of SMB PowerShell, a feature of Windows Server 2012 and SMB 3.0 for more info).

New-SmbMultichannelConstraint -ServerName SERVER2 -InterfaceAlias "SLOT 6 Port 1", "SLOT 6 Port 2"

This tells the client (Server1 in our case) that it may only use these two NICs, which in this example are the two Intel X520 DA 10Gbps cards, to access SERVER2. So basically you configure/tell the client what to use for SMB 3.0 traffic to a certain server. Note the difference in send/receive traffic between RDMA and native 10Gbps.
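
If you're unsure what to pass to -InterfaceAlias, it is simply the adapter name as Windows displays it (the "SLOT 6 Port x" names above are just what this lab's X520 DA ports happen to be called). A quick way to look them up:

# Run on the SMB client; the Name column is what -InterfaceAlias expects.
Get-NetAdapter | Format-Table Name, InterfaceDescription, LinkSpeed, Status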

On Server1 (the client) you see this:

[Screenshot: traffic on Server1 flowing only over the two constrained X520 DA NICs]

On Server2 (the server) you see this:

[Screenshot: the corresponding traffic on Server2]

This is indeed the constraint we set up, as we can verify with:

Get-SmbMultichannelConstraint

[Screenshot: Get-SmbMultichannelConstraint listing the constraint]

We’re done playing so let’s clean up all the constraints:

Get-SmbMultichannelConstraint | Remove-SmbMultichannelConstraint

[Screenshot: no constraints remain after the cleanup]

Seeing this technology, it's now up to the storage industry to provide the needed capacity and IOPS in a lot more affordable way. Storage Spaces has knocked on your door; that was the wake-up call. In an environment where we throw lots of data around, we just love SMB 3.0.

Design Considerations For Converged Networking On A Budget With Switch Independent Teaming In Windows Server 2012 Hyper-V

Last Friday I was working on some Windows Server 2012 Hyper-V networking designs and investigating the benefits & drawbacks of each. Some other fellow MVPs were also working on designs in that area and some interesting questions & answers came up (thank you, Hans Vredevoort, for starting the discussion!).

You might have read that for low cost, high value 10Gbps network solutions I find the switch independent scenarios very interesting, as they keep complexity and costs low while optimizing value & flexibility in many scenarios. Talk about great ROI!

So now let's apply this to one of my (current) favorite converged networking designs for Windows Server 2012 Hyper-V: two dual-NIC LBFO teams. One is used for virtual machine traffic and one for other network traffic such as cluster/CSV/management/backup traffic. You could even add storage traffic to that, but in this particular design storage was provided by Fibre Channel HBAs. Also note that with teaming we forgo RDMA and SR-IOV.

For the VM traffic the decision is rather easy. We go for Switch Independent with Hyper-V Port mode. Look at Windows Server 2012 NIC Teaming (LBFO) Deployment and Management to read why. The exceptions mentioned there do not come into play here and we get great virtual machine density this way. With lower density, 2-4 teamed 1Gbps ports will also do.

But what about the team we use for the other network traffic? Do we use Address Hash or Hyper-V Port mode? Or, better put, do we use native teaming with tNICs, as shown below, where we can use DCB or Windows QoS?

[Diagram: native NIC team with tNICs for the non-VM traffic, using DCB or Windows QoS]

Well, one drawback of Address Hash in a switch independent setup is that only one team member will be used for incoming traffic. QoS with DCB and policies isn't that easy for a system admin either, and the hardware is more expensive.

So could we use a virtual switch here as well with QoS defined on the Hyper-V switch?

[Diagram: Hyper-V switch with host vNICs and QoS defined on the switch, on top of the switch independent team]

Well, as it turns out, in this scenario we might be better off using a Hyper-V switch with Hyper-V Port mode on this switch independent team as well. This reaps some really nice benefits compared to using a native NIC team with Address Hash mode:

  • You get a nice distribution of the different vNICs' send/receive traffic, with each vNIC pinned to a single member of the NIC team. This way we don't end up in a scenario where only one NIC of the team is used for incoming traffic. The result is a better balance between incoming and outgoing traffic, as long as none of it exceeds the capability of one team member.
  • Easy to define QoS via the Hyper-V switch even when you don't have network gear that supports QoS via DCB etc. (see the sketch after this list).
  • Simplicity of switch configuration (complexity can be an enemy of high availability & your budget).
  • Compared to a single team of dual 10Gbps ports you can get a lot higher VM density, even when the VMs have rather intensive network traffic, and the non-VM traffic gets a lot of bandwidth as well.
  • Works with the cheaper line of 10Gbps switches
  • Great TCO & ROI
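
To make this concrete, here is a minimal PowerShell sketch of what such a converged setup could look like on Windows Server 2012. The team, switch and vNIC names, the member NIC names and the bandwidth weights are illustrative assumptions, not a prescription:

# Team for VM traffic: switch independent with Hyper-V Port load balancing.
New-NetLbfoTeam -Name "TeamVM" -TeamMembers "NIC1","NIC2" -TeamingMode SwitchIndependent -LoadBalancingAlgorithm HyperVPort

# Team for the other (host) traffic, also switch independent with Hyper-V Port.
New-NetLbfoTeam -Name "TeamHost" -TeamMembers "NIC3","NIC4" -TeamingMode SwitchIndependent -LoadBalancingAlgorithm HyperVPort

# A Hyper-V switch per team, with weight based QoS so we don't depend on DCB capable gear.
New-VMSwitch -Name "vSwitchVM" -NetAdapterName "TeamVM" -MinimumBandwidthMode Weight -AllowManagementOS $false
New-VMSwitch -Name "vSwitchHost" -NetAdapterName "TeamHost" -MinimumBandwidthMode Weight -AllowManagementOS $false

# Host vNICs on the second switch, each with a minimum bandwidth weight.
Add-VMNetworkAdapter -ManagementOS -Name "Management" -SwitchName "vSwitchHost"
Add-VMNetworkAdapter -ManagementOS -Name "CSV" -SwitchName "vSwitchHost"
Add-VMNetworkAdapter -ManagementOS -Name "LiveMigration" -SwitchName "vSwitchHost"
Set-VMNetworkAdapter -ManagementOS -Name "Management" -MinimumBandwidthWeight 10
Set-VMNetworkAdapter -ManagementOS -Name "CSV" -MinimumBandwidthWeight 40
Set-VMNetworkAdapter -ManagementOS -Name "LiveMigration" -MinimumBandwidthWeight 30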

With a dual 10Gbps team you're ready to roll. It's all software defined, making the switches just easy-to-use providers of connectivity. For smaller environments this is all that's needed. In larger networks more complex configurations might be needed higher up the stack, but for the Hyper-V / cloud admin things can stay very easy and under their control. The network guys only need to deal with their realm of responsibility and not with the demands of virtualization administration directly.

I'm not saying DCB, LACP or switch dependent teaming is bad, far from it. But the cost and complexity scare some people, while they might not even need them. With the concept above they could benefit tremendously from moving to 10Gbps in a really cheap and easy fashion. That's hard (and silly) to ignore. Don't over-engineer it, don't IBM it and don't go for a server rack PhD in complex configurations. Don't think you need to use DCB, SR-IOV, etc. in every environment just because you can or because you want to look awesome. Unless you have a real need for the benefits those offer, you can get simplicity, performance, redundancy and QoS in a very cost effective way. What's not to like? If you worry about LACP etc., consider this: switch independent mode allows firmware upgrades with nearly no service downtime, unlike stacking. It's been working very well for us and avoids the expense & complexity of vPC, VLT and the likes. Life is good.

Windows Server 2012 NIC Teaming Mode “Independent” Offers Great Value

There, I said it. In switching, just like in real life, being independent often beats the alternatives. In switching the alternative would be stacking. Windows Server 2012 NIC teaming in switch independent, active-active mode makes this possible. And if you do want or need stacking for link aggregation (i.e. more bandwidth), you might go the extra mile and opt for vPC (Virtual Port Channel a la Cisco) or VLT (Virtual Link Trunking a la Force10 - DELL).

What, have you gone nuts? Nope. Windows Server 2012 NIC teaming gives us great redundancy with even cheaper 10Gbps switches.

What I hate about stacking is that the stack goes down during a firmware upgrade; no redundancy there. Also, on the cheaper switches it often costs a lot of 10Gbps ports (no dedicated stacking ports). The only way to work around this is to design your infrastructure so you can evacuate the nodes in that rack, so that upgrading the stack doesn't affect the services. That's nice if you can do it, but it's also rather labor intensive. If you can't evacuate a rack (which has effectively become your "unit of upgrade") and you can't afford the vPC or VLT kind of redundant switch configuration, you might be better off running your 10Gbps switches independently and leveraging Windows Server 2012 NIC teaming in switch independent, active-active mode. The only reason not to do so would be the need for the bandwidth aggregation in all possible scenarios that only LACP/Static teaming can provide, but in that case I really prefer vPC or VLT.
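
For what it's worth, active-active is the default behavior when you create such a team without designating a standby member. A hedged sketch (the team and NIC names are made up):

# Both members stay active; switch independent mode needs no LACP or stacking on the switches.
New-NetLbfoTeam -Name "Team10G" -TeamMembers "10G-A","10G-B" -TeamingMode SwitchIndependent

# Verify the team and check that both members are active (no standby adapter was configured).
Get-NetLbfoTeam -Name "Team10G"
Get-NetLbfoTeamMember -Team "Team10G"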

Independent 10Gbps Switches

Benefits:

  • Cheaper 10Gbps switches
  • No potential loss of 10Gbps ports for stacking
  • Switch redundancy in all scenarios if the cluster networking is set up correctly
  • Switch configuration is very simple

Drawbacks:

  • You won't get more than 10Gbps of aggregated bandwidth in every possible NIC teaming scenario

Stacked 10Gbps Switches

Benefits:

  • Stacking is available with cheaper 10Gbps switches (often at the cost of some 10Gbps ports)
  • Switch redundancy (but not during firmware upgrades)
  • Get 20Gbps aggregated bandwidth in any scenario

Drawbacks:

  • Potential loss of 10Gbps ports
  • Firmware upgrades bring down the stack
  • Potentially more "complex" switch configuration

vPC or VLT 10Gbps Switches

Benefits:

  • 100% Switch redundancy
  • Get > 10Gbps aggregated bandwidth in any possible NIC team scenario

Drawbacks:

  • More expensive switches
  • More "complex" switch configuration

So, all in all, if you come to the conclusion that 10Gbps is a big pipe that will serve your needs and that aggregating those pipes via teaming is not needed, you might be better off with cheaper 10Gbps switches, leveraging Windows Server 2012 NIC teaming in switch independent, active-active mode. You optimize the 10Gbps port count as well. It's cheap, it reduces complexity and it doesn't stop you from leveraging Multichannel/RDMA.

So right now I'm either in favor of switch independent 10Gbps networking or I go all out for a vPC (Virtual Port Channel a la Cisco) or VLT (Virtual Link Trunking a la Force10 - DELL) like setup and forgo stacking altogether. As said, if you're willing and able to evacuate all the nodes on a stack/rack you can work around the drawback; in the diagram below the colors in the racks indicate nodes of the same clusters. But that's not always possible, and while it sounds like a great idea, I'm not convinced.

[Diagram: racks with stacked switches; the colors indicate nodes belonging to the same clusters]

When the shit hits the fan … you need as little to worry about as possible. And yes, I know firmware upgrades are supposed to be easy, planned events. But then there is reality and sometimes it bites, especially when you cannot evacuate the workload until you've resolved a networking issue with a firmware upgrade. Choose your poison wisely.

Exploring Hyper-V Virtual Switch Port Mirroring

Windows Server 2012 brings us many new capabilities and one of those is port mirroring. You can now configure the virtual machine NIC (vNIC) whose traffic you want to monitor as the source in the Advanced Features of the network adapter settings. The vNIC of the virtual machine where you'll run a network sniffer, like Network Monitor or Wireshark, is set to "Destination". It's pretty much that simple to set up. Easy enough.

On the vNIC of the VM whose traffic you want to monitor, go to Settings, Network Adapter (choose the correct one), Advanced Features, and select "Source" as the Mirroring mode. In this example we're going to monitor traffic to and from the guest Columbia.

On the destination VM we have a dedicated vNIC set up called "Sniffie".

On the guest VM Pegasus, where we'll capture the network traffic via that dedicated vNIC ("Sniffie"), we set the vNIC (virtual port) to "Destination" as the Mirroring mode.
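
If you'd rather script this than click through the GUI, the same source/destination settings can be applied with the Hyper-V PowerShell module. A minimal sketch using the VM and vNIC names from this example, run on the Hyper-V host:

# Mirror all traffic of Columbia's vNIC(s) ...
Get-VMNetworkAdapter -VMName Columbia | Set-VMNetworkAdapter -PortMirroring Source

# ... to the dedicated "Sniffie" vNIC on Pegasus, where the sniffer runs.
Get-VMNetworkAdapter -VMName Pegasus -Name "Sniffie" | Set-VMNetworkAdapter -PortMirroring Destination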

So now let's start pinging a host (ping -t crusader) on our source VM Columbia.

Now take a look at the destination vNIC on virtual machine Pegasus where we're capturing the traffic; the "Sniffie" NIC there is set to "Destination" as the Mirroring mode. Look at the ICMP echo reply from 192.168.2.32 (the Crusader host). Columbia, at 192.168.2.122, is sending out the ICMP echo requests.

Pretty cool!

Some Technicalities

So, deep down under the hood, it's the switch extension capabilities of the Hyper-V virtual switch that are being leveraged to achieve port mirroring. This is just one of the many functionalities that the Hyper-V extensible switch enables. The extensible switch itself uses port ACLs to set a rule that forwards traffic from one virtual port to another virtual port. For practical purposes, translate "virtual port" to a vNIC in a VM and this maps to what we showed above. While it's good to know that port ACLs are what the extensible switch uses to enable all kinds of advanced features like port mirroring, you don't need to worry about the details to use it.

Things to note

Initially many of us assumed that we'd be able to mirror the traffic from a virtual port to a port on a physical switch. This is not the case. Basically, in box, a source VM mirrors its network traffic from one or more virtual ports (vNICs) to one or more virtual ports (vNICs) of a destination VM.

You can send many sources to one destination; that's fine. You could also define multiple destinations on the same host, but that's not really wise or practical as far as I can see. All in all, you set it up when needed on the source VM and you keep a destination VM with a sniffer around for the capturing.

Also keep in mind that all this works within the boundaries of the same host. This means that if you want to monitor a VM's network traffic when it moves across nodes in a cluster, you'll have to have a "destination" virtual machine on each host. When a source VM is live migrated, it will then mirror its traffic to that local destination VM. That works.
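
A hedged sketch of that cluster scenario: set the source once on the VM you want to watch (the setting is part of the vNIC configuration and travels with the VM), and make sure every node has a local capture VM with its vNIC set to Destination. The "Sniffer*" naming convention for those capture VMs, and running this from a node where the FailoverClusters and Hyper-V modules are available, are assumptions of mine:

# Set the source once; the vNIC setting moves with the VM during live migration.
Get-VMNetworkAdapter -VMName Columbia | Set-VMNetworkAdapter -PortMirroring Source

# On every cluster node, point the local capture VM's vNIC(s) to Destination.
Get-ClusterNode | ForEach-Object {
    Get-VM -ComputerName $_.Name -Name "Sniffer*" |
        Get-VMNetworkAdapter |
        Set-VMNetworkAdapter -PortMirroring Destination
}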

You could try to live migrate the source & destination VMs together to the same host, but this is not feasible in real life. For one, the capture doesn't survive a live migration, as your sniffer loses connectivity to the virtual port / vNIC.

Don’t be too disappointed about this. Port mirroring is not meant to be a permanent situation that you need to keep highly available anyway, bar some special environments/needs.

While it is true that out of the box you can't do things like sending the mirrored traffic from a guest's vNIC/virtual port to a physical switch port where you attach your network sniffer laptop, if you add the Cisco Nexus 1000V it replaces the Microsoft in-box forwarding extension and then it's up to Cisco's implementation to determine what you can or can't do. As this is right up their alley, the Cisco Nexus 1000V can mirror traffic sent between virtual machines by sending ERSPAN to an external Cisco Catalyst switch. I have not had the pleasure of working with this yet.

Anyway, I hope this helps to explain things a little. Happy sniffing, and don't get yourself into trouble: follow the rules.