Preliminary Results With Live Migration Over RDMA Speed & Useful Number Of NICs

Introduction

With Windows Server 2012 R2 (Preview) we can leverage SMB to do Live Migrations. That means we can now offload the process to the NICs if they support RDMA, save on CPU cycles and potentially get VMs moved a lot faster without impacting the performance of running VMs on the involved hosts. Perhaps it’s even faster than over TCP/IP. Sounds great, so let’s do some testing.

  • We have a dual port 10Gbps Mellanox RDMA card (RoCE) in each host. One pair of ports is interconnected via a direct attach cable. The other pair is connected over a Force10 S4810 switch. We’re using the in-box Windows Server 2012 R2 Preview drivers for everything, as we have found that other drivers do not install properly (or at all) on this release and cause issues.
  • We are using one VM running Windows 2012 RTM with upgraded Integration Services components. This VM has 4 vCPUs and 55GB of fixed memory assigned. For this purpose we had no workload running in the VM. The servers are standard DELL PowerEdge R720 kit running the Windows Server 2012 R2 Preview bits.

Results

No Performance tweaking

Live Migration over RDMA in action. Here we are using one 10Gbps RoCE RDMA NIC port, moving the VM via the port that goes over the S4810 switch.

As you can see the entire process took 74 seconds. RDMA did not kick in until 19 seconds had passed since the start.

The CPU load remains low, which is where you’ll find the biggest benefit of RDMA with live migrations.

Now let’s put two RDMA RoCE ports into play and see what that does for us. We now Live Migrate the 55GB memory VM in 52-54 seconds. Not bad. Again we saw over 20 seconds pass before RDMA kicked in.

Again we see that CPU usage remains low. This is just a quick screenshot. On a Hyper-V node you’ll need to dive into Performance Monitor to get some real info.

Let’s repeat this exercise and see what happens if we move the traffic over the NIC ports that are directly attached. That will give us an indication of whether the switch configuration is a factor. Configuring RoCE DCB features like PFC/ETS is not exactly a well documented process at the moment and often I feel like a magician’s apprentice.

Once more we see that it takes about 20 seconds for RDMA to kick in and that the time rises to 79 seconds. It actually fluctuates between 74 and 79 seconds.

The CPU load was low again. So both paths seem to perform comparably.

Live Migrations over SMB seem to run faster using two RDMA ports, but not twice as fast. These are the preview bits so nothing is definitive yet. And sorry, I cannot do 40Gbps or 56Gbps InfiniBand tests. Unless you want to donate the gear and pay for the power, time & reporting.
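
As a rough sanity check on these timings, here is a minimal Python sketch that converts them into average throughput. It is back-of-the-envelope arithmetic, not a measurement: it assumes the 55GB of VM memory crosses the wire exactly once, that 1 GB is 10^9 bytes, and that protocol overhead can be ignored. The function names are made up for this illustration.

```python
# Rough average throughput for the untuned runs, using the timings reported above.
# Assumptions (not from the original measurements): the VM's 55 GB of memory is
# transferred exactly once, 1 GB = 10**9 bytes, and protocol overhead is ignored.

VM_MEMORY_GB = 55
PORT_SPEED_GBPS = 10


def effective_gbps(seconds: float, memory_gb: float = VM_MEMORY_GB) -> float:
    """Average throughput in Gbps over the whole migration."""
    return memory_gb * 8 / seconds


def utilization(seconds: float, ports: int) -> float:
    """Average throughput as a fraction of the aggregate line rate."""
    return effective_gbps(seconds) / (ports * PORT_SPEED_GBPS)


# (label, total migration time in seconds, number of RDMA ports used)
runs = [
    ("1 port via S4810 switch", 74, 1),
    ("2 ports, fastest run", 52, 2),
    ("2 ports, slowest run", 54, 2),
]

for label, seconds, ports in runs:
    print(f"{label}: {effective_gbps(seconds):.2f} Gbps average, "
          f"{utilization(seconds, ports):.0%} of line rate")
```

Keep in mind that these averages include the roughly 20 seconds before RDMA kicks in, so the rate during the RDMA phase itself is presumably higher than what this prints.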

Max Performance Tweaking

As my readers very well know I tweak my nodes for best performance. The savings of energy (power, cooling) have to come from making the most out of every node and shutting them down when not needed (Dynamic Optimization/Power Optimization in System Center). I still have a standing order to take away any physical limitations possible for the business.

While Windows Server 2012 (R2) has made tremendous strides toward better use of the available bandwidth of 10Gbps pipes out of the box, I still dive into the BIOS to turn off the C/C1E states and set the CPU Power Management and Memory Frequency to Maximum performance. Have a look at this blog post Still Need To Optimizing Power Settings On DELL 12th Generation Servers For Lightning Fast Hyper-V Live Migrations? on how to do this with DELL Generation 12 Servers. It also contains a link to the guidance for older generations.

As you can imagine I was quite interested to see if the settings affect RDMA as well. So let’s have a look with these settings here:

image

One RDMA NIC used (Mellanox, RoCE, 10Gbps)

54 seconds for that 55GB memory (fixed) VM. We also note that the delay of 19-20 seconds before RDMA kicks in has dropped to 3-4 seconds, which is quite interesting. Basically this makes it as fast as 2 RDMA NICs without performance tweaking.

Two RDMA NICs used (Mellanox, RoCE, 10Gbps)

30 seconds flat, and repeatably so, for that 55GB memory (fixed) VM. Again we note that the delay before RDMA kicks in has dropped from 19-20 seconds to 3-4 seconds. So this is about 45% better than without the power optimization.
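
To make the comparison explicit, here is a small Python sketch of the arithmetic behind these percentages. It only uses the timings quoted in this post. Which untuned run to take as the baseline is my assumption: the "about 45%" figure matches comparing the 30 second result with the slower 54 second two-port run. The helper names are hypothetical.

```python
# Relative improvement from the BIOS power tweaks, using the timings reported above.
# Assumption: baselines are the untuned runs (74 s for one port over the switch,
# 52-54 s for two ports); tuned results are 54 s (one port) and 30 s (two ports).


def reduction(before_s: float, after_s: float) -> float:
    """Fraction of migration time saved by the tweak."""
    return (before_s - after_s) / before_s


def speedup(before_s: float, after_s: float) -> float:
    """How many times faster the tuned run is."""
    return before_s / after_s


# (label, untuned seconds, tuned seconds)
comparisons = [
    ("1 port", 74, 54),
    ("2 ports, vs. fastest untuned run", 52, 30),
    ("2 ports, vs. slowest untuned run", 54, 30),
]

for label, before_s, after_s in comparisons:
    print(f"{label}: {reduction(before_s, after_s):.1%} less time, "
          f"{speedup(before_s, after_s):.2f}x faster")
```

The same two functions can be reused once TCP/IP, Multichannel and Compression timings are available for the same VM.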

What is the CPU doing during all this? Well, taking care of the VM load, not spending cycles on network interrupts. Again, this is a quick screenshot. On a Hyper-V node you’ll need to dive into Performance Monitor to get some real info.

By now you must all be eager to see how this compares against Live Migration over TCP/IP, Multichannel and with Compression. That’s material for other blogs.

Why am I doing this?

We need to get the most out of every € or $ we spend. It’s not that we don’t have any cash left, but why buy more servers & higher end gear to get better results when the answer lies in the correct configuration & better choices when designing a solution? It’s going to be a while before this knowledge becomes mainstream and widely available. Years probably, so why wait? It takes time to experiment but the results & ROI are great. Why spend another 50.000 to 100.000 Euro on servers, 10Gbps cards & switch ports if you don’t need to? Count the cost to host, power & cool them and you’ll see that this time is an investment. You could also decide to leverage the cloud, but wasting VM cycles there is also money you have better uses for, so testing will be needed there as well.

Configuring Performance Options for Live Migration In Windows Server 2012 R2 Preview

New Options For Optimizing Live Migrations

In Windows Server 2012 R2 we have a whole range of options to optimize Live Migration for our environment and needs. Next to the new default (Compression) we can now also leverage SMB 3.0 (Multichannel, RDMA) for all forms of Live Migration and not just for Shared Nothing Live Migration (see Shared Nothing Live Migration Leverages SMB 3.0 Under the Hood) or Storage Live Migration when both the source and the target are SMB 3.0 storage.

TCP/IP

Here you can use one NIC or a NIC Team for bandwidth aggregation for live migration (see Teamed NIC Live Migrations Between Two Hosts In Windows Server 2012 Do Use All Members). This is the process you have known in Windows Server 2012. You can select multiple NICs or even Teams of NICs, but only one of those (one NIC or one Team) will be used. The other(s) will only be used when the first one is not available.

Compression

This option leverages spare CPU cycles to compress the memory contents of the virtual machines being migrated. Only then are they sent over the wire via a TCP/IP connection. This speeds up the Live Migration process. The compression is CPU load aware, so it will only use idle cycles to protect the workload on the hosts. This is the default setting in Hyper-V running on Windows Server 2012 R2 Preview.

SMB

This setting will leverage two SMB 3.0 features: Multichannel and, if supported by and for the NICs involved, RDMA.

  • SMB Direct (RDMA) will be used when the network adapters of both the source and destination servers have Remote Direct Memory Access (RDMA) capabilities enabled.
  • SMB Multichannel will automatically detect and use multiple connections when a proper SMB Multichannel configuration is identified.

Where to set these options?

In Hyper-V Manager go to “Hyper-V Settings” in the Actions pane.

Expand the Live Migrations node under Server in the left pane (click the “+”) and select “Advanced Features”.

Select the desired option under “Performance Options”.

Happy testing!

 

EDIT: Aidan Finn posted the PowerShell commands to configure the performance options in Configuring WS2012 R2 Hyper-V Live Migration Performance Options Using PowerShell. The MVP community at work & it rocks.

DELL PowerEdge VRTX Has Potential Beyond ROBO As a Scale Out File Server Building Block

The PowerEdge VRTX

I went to have a chat with Dell at TechEd 2013 Europe in Madrid. The VRTX was launched during DELL Enterprise Forum in early June 2013. This concept packs a punch and I encourage you to go look at the VRTX (pronounced as “Vertex”) in more detail here. It’s a very quiet setup which can be hooked up to standard power. It’s pretty energy efficient when you consider the power of the VRTX. And the entire setup surely packs a lot of punch at an attractive price point.

It can serve perfectly for Remote Office / Branch Office (ROBO) deployments but has many more use cases as it’s very versatile. In my humble opinion DELL’s latest form factor could be used for some very nice scale out scenarios. It’s near perfect for a Windows Server 2012 Scale Out File Server (SOFS) building block. While smaller ones can be built using 1Gbps, the future just needs 10Gbps networking.

10Gbps, RDMA (iWarp, RoCE)

That’s the first thing I missed and the first thing I was told would arrive very soon. So I’m very happy with that. With sufficient 10Gbps ports to servers and iWarp or RoCE RDMA capable NICs (they’re cheap enough compared to ordinary 10Gbps cards not to have to leave that capability out) we have all we need to function as a powerful building block for the Scale Out File Server model with Windows Server 2012 (R2), where the CSV network becomes the storage network leveraging redirected IO. For this concept look at this picture from a presentation a year ago. SMB 3.0 Multichannel and RDMA make this possible.

image

While back then I drew the SOFS building blocks out of R720 & MD1200 hardware, the VRTX could fit in there perfectly!

Storage Options

Today DELL uses their implementation of Clustered PCI RAID for shared storage, which is supported since Windows Server 2012. This is great. For the moment it’s a non redundant setup but a redundant one is in the works I’m told. Nice, but think positive, redirected IO (block level) over SMB 3.0 would save our storage IO even today. It would be a very wise and great addition to the capabilities of this building block to add the option & support for Storage Spaces. This would make the Scale Out File Server concept shine with the VRTX.

Why? Well, it would give us the following benefits in the storage layer, and the VHDX format in Hyper-V can take advantage of them:

  • Deduplication
  • Thin Provisioning
  • Management Delegation
  • UNMAP
  • Write Cache
  • Full benefits of ReFS on storage spaces for data protection
  • Automatic Data Tiering with commodity SSD (ever cheaper & bigger) and SAS disks perhaps even the Near Line ones (less power & cooling, great capacity)
  • Potential for JBOD redundancy

Look at that feature set people, in box, delivered by Windows. Sweet! Combine this with 10Gbps networking and DELL not only has a SOFS building block in their portfolio, it also offers significant storage features in this package. I for one would like them to do so and not miss out on this opportunity to offer even more capabilities in an attractive price package. Dell could be the very first OEM to grab this new market opportunity by supporting the scale out approach and outmaneuver their competitors.

Anything Else?

Combine such a building block as described above with their unmatched logistical force for distribution and support, and this will be a hit and a prime choice for Windows shops. They already have the 10Gbps networking gear & features (DCB) in the PowerConnect 81XX & Force10 S4810 switches. It could be an unbeatable price / capabilities / feature combo that would sell very well.

If we go for SOFS we might need more storage in a single building block with a 4 node cluster. Extensibility might be nice for this. More not just in capacity, mind you: I still need to work out the IOPS the available configurations can give us.

TechEd 2013 Revelations for Storage Vendors as the Future of Storage lies With Windows 2012 R2

Imagine you’re a storage vendor until a few years ago. Raking in the big money with profit margins unseen by any other hardware in the past decade and living it up in dreams along the Las Vegas Boulevard like there is no tomorrow. To describe your days only a continuous “WEEEEEEEEEEEEEE” will suffice.

Trying to make it through the economic recession with fewer Ferraris has been tough enough. Then in August 2012 Windows Server 2012 RTMs and introduces Storage Spaces, SMB 3.0 and Hyper-V Replica. You dismiss those as toy solutions while the demos of a few 100.000 to over a million IOPS on the cheap with a couple of Windows boxes and some alternative storage configurations pop up left and right. Not even a year later Windows Server 2012 R2 is unveiled and guess what? The picture below is what your future dreams as a storage vendor could start to look like more and more every day while an ice cold voice sends shivers down your spine.

clip_image003

“And I looked, and behold a pale horse: and his name that sat on him was Death, and Hell followed with him.”

OK, the theatrics above got your attention I hope. If Microsoft keeps up this pace traditional OEM storage vendors will need to improve their value offerings. My advice to all OEMs is to embrace SMB 3.0 & Storage Spaces. If you’re not going to help and deliver it to your customers, someone else will. Sure, it might eat at the profit margins of some of your current offerings. But perhaps those are too expensive for what they need to deliver, and people buy them only because there are no alternatives. Or perhaps they just don’t buy anything as the economics are out of whack. Well, alternatives have arrived and more than that. This also paves the path for projects that were previously economically unfeasible. So that’s a whole new market to explore. Will the OEM vendors act & do what’s right? I hope so. They have the distribution & support channels already in place. It’s not a threat, it’s an opportunity! Change is upon us.

What do we have in front of us today?

  • Read Cache? We got it, it’s called CSV Cache.
  • Write cache? We got it, shared SSDs in Storage Spaces.
  • Storage Tiering? We got it in Storage Spaces.
  • Extremely great data protection even against bit rot and on the fly repairs of corrupt data without missing a beat. Let me introduce you to ReFS in combination with Storage Spaces now available for clustering & CSVs.
  • Affordable storage both in capacity and performance … again meet Storage Spaces.
  • UNMAP to the storage level. Storage Spaces has this already in Windows Server 2012.
  • Controllers? Are there still SAN vendors not using SAS for storage connectivity between disk bays and controllers?
  • Host connectivity? RDMA baby. iWarp, RoCE, Infiniband. That PCI 3 slot better move on to 4 if it doesn’t want to melt under the IOPS …
  • Storage fabric? Hello 10Gbps (and better) at a fraction of the cost of ridiculously expensive Fibre Channel switches and at amazingly better performance.
  • Easy to provision and manage storage? SMB 3.0 shares.
  • Scale up & scale out? SMB 3.0 SOFS & the CSV network.
  • Protection against disk bay failure? Yes, Storage Spaces has this & it’s not space inefficient either. Heck, some SAN vendors don’t even offer this.
  • Delegation capabilities of storage administration? Check!
  • Easy in guest clustering? Yes via SMB3.0 but now also shared VHDX! That’s a biggie people!
  • Hyper-V Replication = free, cheap, effective and easy
  • Total VM mobility in the data center so SAN based solutions become less important. We’ve broken out of the storage silos.

You can’t seriously mean the “Windoze Server” can replace a custom designed SAN?

Let’s say that it’s true and it isn’t as optimized as a dedicated storage appliance. So what, add another 10 commodity SSD units at the cost of one OEM SSD and make your storage fly. Windows Server 2012 can handle the IOPS, the CPU cycles and the memory demands in both capacity and speed, together with a network performance that scales beyond what most people need. I’ve talked about this before in Some Thoughts Buying State Of The Art Storage Solutions Anno 2012. The hardware is a commodity today. What if Windows can and does the software part? That will wake a storage vendor up in the morning!

Whilst not perfect yet, all Microsoft has to do is develop Hyper-V Replica further. Together with developing snapshotting & replication capabilities in Storage Spaces this would make for a very cost effective and complete solution for backups & disaster recovery. Cheaper & cheaper 10Gbps makes this feasible. SAN vendors today have another bonus left, ODX. How long will that last? ASICs, you say? Cool gear, but at what cost when parallelism & x64 8 core CPUs are the standard and very cheap? My bet is that Microsoft will not stop here but come back to throw some dirt on a part of the classic storage world’s coffin in vNext. Listen, I know about the fancy replication mechanisms but in a virtualized data center the mobility of VMs over the network is a fact. 10Gbps, 40Gbps, RDMA & Multichannel in SMB 3.0 put this in our hands. Next to that, application level replication is gaining more and more traction and many apps are providing high availability in a “shared nothing” fashion (SQL/Exchange with their database availability groups, Hyper-V, R-DFS, …). The need for the storage to provide replication for many scenarios is diminishing. Alternatives are here. Less visible than Microsoft, but there are others who know there are better economies to storage (see http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/).

The days when storage vendors offered 85% discounts on hopelessly overpriced storage and still made a killing and a Las Vegas trip are ending. Partners and resellers who just grab 8% of that (and hence benefit from overselling as much as possible) will learn, just like with servers and switches, that they can’t keep milking that cash cow forever. They need to add true and tangible value. I’ve said it before: too many VARs have left out the VA for too long now. Hint: the more they state they are not box movers the bigger the risk that they are. True advisors are discussing solutions & designs. We need that money to invest in our dynamic “cloud” like data centers, where the ROI is better. Trust me, no one will starve to death because of this, we’ll all still make a living. SANs are not dead. But their role & position is changing. The storage market is in flux right now and I’m very interested in what will happen over the next years.

Am I a consultant trying to sell Windows Server 2012 R2 & System Center? No, I’m a customer. The kind you can’t sell to that easily. It’s my money & livelihood on the line and I demand Windows Server 2012 (R2) solutions to get me the best bang for the buck. Will you deliver them and make money by adding value or do you want to stay in the denial phase? Ladies & Gentlemen storage vendors, this is your wake-up call. If you really want to know for whom the bell is tolling, it tolls for thee. There will be a reckoning and either you’ll embrace these new technologies to serve your customers or they’ll get their needs served elsewhere. Banking on customers to be and remain clueless is risky. The interest in Storage Spaces is out there and it’s growing fast. I know several people actively working on solutions & projects.

clip_image005

clip_image007

You like what you see? Sure IOPS are not the end game and a bit of a “simplistic” way to look at storage performance but that goes for all marketing spin from all vendors.

Can anyone ruin this party? Yes, Microsoft themselves perhaps, if they focus too much on delivering this technology only to the hosting and cloud providers. If on the other hand they make sure there are feasible, realistic and easy channels to get it into the hands of “on premise” customers all over the globe, it will work. Established OEMs could be that channel but by the looks of it they’re in denial and might cling to the past hoping things won’t change. That would be a big mistake, as embracing this trend will open up new opportunities, not just threaten existing models. The Asia Pacific is just one region that is full of eager businesses with no vested interests in keeping the status quo. Perhaps this is something to consider? And for the record I do buy and use SANs (high-end, mid-market, or simple shared storage). Why? It depends on the needs & the budget. Storage Spaces can help balance those even better.

Is this too risky? No, start small and gain experience with it. It won’t break the bank but might deliver great benefits. And if not ... there are a lot of storage options out there, don’t worry. So go on.