SMB Direct: Choosing A Flavor

I often get asked what to buy for implementing SMB Direct. It’s a non trivial question actually and I’m not an expert, nor do I play one on TV.  All joking aside, it’s a classical consulting answer: it depends. I don’t do free consulting in a blog post, even if that was possible, as there are many factors such as the characteristics and futures of your organization. There’s also a lot of FUD & marketing flying around. Basically in real life you only have two vendors: Cheslio (iWarp) and Mellanox (Roce/Infiniband). Hard to say which one is best. You make the best choice for your company and you live with it.

There is talk about other vendors joining the SMB Direct market. But it seems to be taking a while. This is not that strange. I’ve understood that in the early days of this century iWarp got a pretty bad reputation due to the many issues around it. Apparently offloading the TCP/IP stack to the NIC, which is what iWarp does is not an easy endeavor. Intel had some old Net card a couple of years ago but has gotten out of the game. Perhaps they’ll step back in but that might very well take a couple of years.

Other vendors like Broadcom, Emulex & QLogic might be working on solutions but I’m not holding my breath. Broadcom has DCB and has been hinting at RDMA in it’s NICs for many years but as of the writing of this post there is nothing functional out there yet. But bar the slowness (is complexity slowing the process?) it will be very interesting to see what they’ll choose: RoCE or iWarp. That choice might be the most public statement we’ll ever see about what technology seems like the best bet for these companies. But be careful, I have seen technology choices based on working/living with design choices at at another level due to constrictions in hardware & software that are no longer true today. So don’t just do blindly what others do.

Infiniband will remain a bit more of a niche I think and my guess is that RoCE is the big bet of Mellanox for the long term. 10Gbps and higher Ethernet switches are sold to everyone in the world. Infiniband, not so much. Does that make it a bad choice? Nope, it all depends. Just like FC is not a bad choice for everyone today, it depends.

Your options today

The options you have today to do SMB Direct are rather limited and bound to the different flavors and their vendor. Yes vendor not vendors.

  1. iWarp: Chelsio
  2. RoCE: Mellanox (v2 of RoCE has brought routability into the game, which counters one of iWarps biggest advantages, next to operational ease but the no fuss about DCB story might not be 100% correct, the question is if this matters, after all many people do well with iSCSI which is easy but has performance limits).
  3. Infiniband: Mellanox (Qlogic was the only other remaining one, but Intel bought it form them. I have never ever seen Intel Infiniband in the wild.

Note: You can do iWarp (and even RoCE in theory) without DCB but in all realistic high traffic situations you’ll want to implement PFC to keep the experience and results good under load. Especially the ports connecting to the SOFS nodes could other wise potentially drop packets. iWarp, being TCP/IP, will handle dropped packets but possibly at the cost of deteriorated performance. With RoCE you’re basically toast if you lose packets, it should be losses. I’m not too convinced that pure offloaded TCP/IP scales. Let’s face it, what was the big deal about lossless iSCSI => DCB Smile I would really love to see Demartek testing these things out for us.

If you have a smaller environment, no need for routing and minimal politics I have seen companies select Infiniband which per Gbps is very cheap. Lots of people have chosen iWarp due to it simplicity (which they heavily market) and routability. The popularity however has dropped due to prices hikes that came with increased demand and no competition. RoCE  is popular (I see it the most) and affordable but for this one you MUST do at least PFC. DCB support on switches is not an issue, even budget friendly DELL PowerConnect N4000 series supports it as did it’s predecessor the PC8100 series. Meaning if you have bought switches in the past 24 months and did your home work you’re good to go. Are routability and distance important? Well perhaps not that much today but as the trend in networking is heading for layer 3 down to the rack which will be more acceptable when we see a lot of the workload goodness in hypervisors (Live Migration, vMotion,yes there is work being done on that) being lit up in layer 3 it might become a key feature.

More Tips On Dealing With Removing Short File Names When Migrating To a SMB3 Transparent Failover File Server Cluster

You might have read my blog posts on the capabilities and the process of migrating to a Transparent Failover File Server. If not, here they are:

These are a good read with some advice from real world experience and in this post I’ll offer some more tips. I’ve discussed the need to disable and get rid of short file names in my blog and offered other tips to prepare for your migration and get your file share LUNs in tip top, modern shape. But what if you run into short file name issues where you can seem to get rid of them?

Well here’s 3 more things to check:

1) Get rid of the shadow copies used for Previous Versions

The reason you’d better get rid of them is that they can also contain short files names & way to long path or file names. We don’t want them to ruin the party so we remove them all by disabling shadow copies on the LUNs to be copied. We can enable them again once the LUN is up and running in the new file cluster.

2) The logs indicate there are short file names you don’t have access to

If the NFTS permissions on the folder & file structure are OK you should not have to much problems bar some files being locked by being in use. Rerunning the fsutil command prior to migrating with the server service stopped will prevent any connectivity and use of file shares by people ignoring the request to log of or shut down their clients or automated jobs that otherwise keep accessing them.

But you might still get some indications in the log file(s) that state you can remove certain file names.

image

There is the good old trick of running your command under SYSTEM. That those the job! That helps get rid of short file name instances of folders where you normally don’t get access to. If system has rights you’ll be fine whether it’s a system folder or not.To do this the Sysinternals tools come in handy once again. You can launch a command prompt running under the NT AUTHORITYSYSTEM account using psexec.exe by running the following from a elevated command prompt:

psexec -i -s cmd.exe or psexec  -s cmd.exe

image

The-s switch runs the remote process in the System account. Psexec temporarily installs a service "psexec running psexesvc.exe" on the remote computer (or locally if that’s what you doing) which is removed when the app or process that’s running is closed. It’s obvious now I hope why you need an elevated command prompt to run this command.

Now should you do this by default? Nope. Just when you need to and as always have a realistic backup plan, a way to recover when things go south.

3) Anti virus sometime prevents the removal of short file names

Disable Anti-Virus, sometimes it holds a temporary entry in the registry for the file involved. At least that’s what I’ve seen as a transient issue in some of the large number of logs I gathered. Yeah, I ran a lot of fsutil against large NTFS volumes. What can I say. Due diligence pays off!

4) Run ChkDsk

Just make sure the volume is healthy and no repairs are needed. If your migrating from and older file server there might be outstanding issues and a check disk on volumes with lot’s of files take time. Some of the ones I’ve dealt with had more that 2 million files on a 2TB LUN and it it can take 24 hours. Fun when you have 10 LUNs :-/