Teamed NIC Live Migrations Between Two Hosts In Windows Server 2012 Do Use All Members

Posted on June 30, 2013 by workinghardinit

Introduction

Between this blog NIC Teaming in Windows Server 2012 Brings Simple, Affordable Traffic Reliability and Load Balancing to your Cloud Workloads which states “TCP/IP can recover from missing or out-of-order packets. However, out-of-order packets seriously impact the throughput of the connection. Therefore, teaming solutions make every effort to keep all the packets associated with a single TCP stream on a single NIC so as to minimize the possibility of out-of-order packet delivery. So, if your traffic load comprises of a single TCP stream (such as a Hyper-V live migration), then having four 1Gb/s NICs in an LACP team will still only deliver 1 Gb/s of bandwidth since all the traffic from that live migration will use one NIC in the team. However, if you do several simultaneous live migrations to multiple destinations, resulting in multiple TCP streams, then the streams will be distributed amongst the teamed NICs” and other information out their such as support forum replies it is dictated that when you live migrate between two nodes in a cluster only one stream is active and you will never exceed the bandwidth of a single team member. When running some simple tests with a 10Gbps NIC team this seems true. We also know that you can consume near to all of the aggregated bandwidth of the members in a NIC Team for live migration if you these conditions are met:

1. The Live Migrations must not all be destined for the same remote machine. Live migration will only use one TCP stream between any pair of hosts. Since both Windows NIC Teaming and the adjacent switch will not spread traffic from a single stream across multiple interfaces live migration between host A and host B, no matter how many VMs you’re migrating, will only use one NIC’s bandwidth.

2. You must use Address Hash (TCP ports) for the NIC Teaming. Hyper-V Port mode will put all the outbound traffic, in this case, on a single NIC.

When we look at these conditions and compare them to the behavior we expect from the various forms of NIC teaming in Windows 2012 this is a bit surprising as one might expect all member to be involved. So let’s take a look at some of the different NIC Teaming setups.

Any form of NIC teaming with Hyper-V Port Mode

This one is easy as condition 2 above is very much true. In all my testing with any NIC team configuration in the Hyper-V Port mode traffic distribution algorithms I have not been able to exceed 10Gbps. I have seen no difference between dependent static of LACP mode or switch independent (active-active) for this condition. As you can see in the screenshot below, the traffic maxes out at 10Gbps.

This is also demonstrated in the following screenshots taking with the resource manager where you can see only half of the bandwidth of the Team is being used.

Exceeding a single NIC team member’s bandwidth when migrating between 2 nodes

The first condition of the previous heading doesn’t seem true. In some easy testing with a low number of virtual machines and not too much memory assigned you never exceed the bandwidth of one 10Gbps NIC team member. So on the surface, with some quick testing it might seem that way.

But during testing on a 2 node cluster with dual port 10Gbps cards and I have found the following

Switch Dependent LACP and Static

Take a sufficient number of large memory virtual machines to exceed the capacity of a single 10Gbps pipe for a longer time (that way you’ll see it in the GUI).
Live migrate them all from host A to host B (“Pause” with “Drain Roles” or “select all” + “Move”)
Note that with a 2 node cluster there is no possibility to Live Migrate to multiple nodes simultaneous. It’s A to or B or B to A or both at the same time.

Basically it didn’t take long to see well over 10Gbpsbeing used. So the information out there seems to be wrong. Yes we can leverage the aggregated bandwidth when we migrate from host A to host B as long as we have enough memory assigned to the VMs and we migrate a sufficient number of them. Switch dependent teaming, whether it is static or LACP does its job as you would expect.

Let’s think about this. The number of VMs you need to lie migrate to see > 10Gbpss used is not fixed in stone. Could it be that there is some intelligence in the Live Migration algorithm where it decides to set up multiple streams when a certain number of virtual machines with sufficient memory are migrated as the sorting is mitigated by the amount of bandwidth that can leveraged? Perhaps he VMMS.EXE kicks off more streams when needed/beneficial? Further experimenting indicates that this is not the case. All you need is > 1 VM being live migrated. When looking at this in task manager you do need them to be of sufficient memory size and/or migrate enough of them to make it visible. I have also tried playing with the number of allowed simultaneous live migrations to see if this has an effect but I did not find one (i.e. 4, 6 or 12).

It looks like it is more like one TCP/IP connection per Live Migration that is indeed tied to one NIC member. So when you live migrate VMS between two hosts you see one VM live migration go over 1 member and the other the other as static/LACP switch dependent teaming did does its job. When you do enough live migrations of large VMs simultaneously you see this in Task Manager as shown below. In this case as each VM live migration stream sticks to a NIC team member you do not need to worry about out of order packets impacting performance.

But to make sure and to prevent falling victim to the fall victim to the limits of the task manger GUI during testing this behavior we also used performance monitor to see what’s going on. This confirms we are indeed using both 10Gbps NIC team member on both the target and the source host server. This is even the case with 2 virtual machines Live Migration. As long as it’s more than one and the memory assigned is enough to make the live migration last long enough you can see it in Task Manager; otherwise it might miss it. Performance Monitor however does not..

This is interesting and frankly a bit unexpected as the documentation on this subject is not reflecting this. However it IS in agreement with the NIC teaming documented behavior for other tan Live Migration traffic. We took a closer look however and can reproduce this over and over again. Again we tested both switch dependent static and LACP modes and we found the behavior to be the same.

Switch Independent with Address Hash

Let’s test Live Migration over switch independent teaming with Address Hash. Here we see that the source server sends on the two member of the NIC team but that the target server receives on only one. This is normal behavior for switch independent teaming. But from the documentation we expect that one member on the source server would send and one member on the target server would receive. Not so.

Basically with Windows Server 2012 this doesn’t give you any benefit for throughput. You are limited to the bandwidth of one member, i.e. 10Gbps.

Red is Total Bytes received on the target host. It’s clear only one member is being used. Green is Bytes Sent/Sec on the source server. As you can see both team members are involved. In a switch independent scenario the receiving side limits the throughput. This is in agreement the documented behavior of switch independent NIC teaming with Address hash.

Helpful documentation on this is Windows Server 2012 NIC Teaming (LBFO) Deployment and Management (A Guide to Windows Server 2012 NIC Teaming for the novice and the expert).

Hope this helps sort out some of the confusion.

Storage Spaces, Hyper-V VM Mobility & 10Gbps are Key Elements

Posted on June 28, 2013 by workinghardinit

At TechEd 2103 Europe I met some passionate people who’ve been working hard at deploying Storage Spaces on a large scale in their companies (many hundreds of TB). They have replaced their expensive SANs and getting better performance, better data protection leveraging ReFS at way cheaper pricing (licensing is a killer). Their life has also become easier due to the KISS principle and the fact that complex solutions have complex issues.

Storage Spaces are creating a buzz. Lot of interest, interactive discussions & paper napkin designs of concepts were being drawn up. It seems like more and more people are getting the message and don’t consider my talks over the past 18 months on this as weird any more. So you need some “out of the box” thinking to get “Cluster in a Box” as a building block for scale out storage solutions. I’ve already done away with SAN built on proprietary hardware, it’s all commodity gear. But t doesn’t stop there. Once you realize that this hardware gives you all you need the logical step is the software and that’s where Windows and Storage Spaces comes in. Yes sure, this is blasphemy in the storage world, so what? The world used to be flat once Winking smile

There is no singe universal solution. Just good to great ones. Don’t fall for the FUD big storage vendors fling around and don’t be man handled into a solution. The reality is that when you have what works for you, that’s when you’ve got it right. So look into Storage Spaces and all related technologies. Once you wrap your head around the concepts, you start to understand the why an how of it all a lot better.

Whatever the nature and the size of your environments and needs, there are options or even combinations of options out there that can help you. You can buy completely configured building blocks (cluster in a box) or even complete Fast Track solutions based on this concept with Hyper-V and the System Center stack. You can also build the solution your self. Combine this with total VM mobility within the data center in combination with continuous availability. Outside of the data center Hyper–V Replica and application based shared nothing high availability help failover fast with out noticeable or minimal service interruption. All this without the “Kings Ransom” prices you have to pay for proprietary solutions. If you’re not interested yet I bet your business and especially your CFO is! All you got to do is manage risk instead of letting fear rule you.

How crazy am I with in this view? Some people actually asked me that combined with the question if anyone is really considering using this is real life. Well I met a lot of like minded people who are putting into practice here at TechEd. And knowing my Hyper-V, Clustering, Networking and storage very well, discussing these subjects and Storage Spaces with other attendees I’ve had a job/consulting offer just about every day. Not too bad as far as real life interest in this goes (and I remember being ridiculed in 2008 for being an early adaptor of Hyper-V Nerd smile ).

The storage world is in flux sending waves through the way we deliver our services. Surf the waves and enjoy the ride Smile .

DELL PowerEdge VRTX Has Potential Beyond ROBO As a Scale Out File Server Building Block

Posted on June 25, 2013 by workinghardinit

The PowerEdge VRTX

I went to have a chat with Dell at TechEd 2013 Europe in Madrid. The VRTX was launched during DELL Enterprise Forum early June 2013 this concept packs a punch and I encourage you to go look at the VRTX (pronounced as “Vertex”) in more detail here. It’s a very quite setup which can be hooked up to standard power. Pretty energy efficient when you consider the power of the VRTX. And the entire setup surely packs a lot of punch at an attractive price point.

It can serve perfectly for Remote Office / Branch Office (ROBO) deployments but has many more use cases as it’s very versatile. In my humble opinion DELL’s latest form factor could be used for some very nice scale out scenarios. It’s near perfect for a Windows Server 2012 Scale Out File Server (SOFS) building block. While smaller ones can be build using 1Gbps the future just needs 10Gbps networking.

10Gbps, RDMA (iWarp, RoCE)

That’s the first thing I missed and the first thing I was told that would arrive very soon. So I’m very happy with that. With sufficient 10Gbps ports to servers and iWarp or RoCE RDMA capable NICs (there’s cheap enough compared to ordinary 10Gbps cards not to have to leave that capability out) we have all we need to function as powerful building block for the Scale Out File Server model with Windows Server 2012 (R2) where the CSV network becomes the storage network leveraging redirected IO. For this concept look at this picture from a presentation a year ago. SMB 3.0 Multichannel and RDMA make this possible.

While then I drew the SOFS building blocks out of R720 7 & MD1200 hardware, the VRTX could fit in there perfectly!

Storage Options

Today DELL uses their implementation of Clustered PCI Raid for shared storage which is supported since Windows Server 2012. This is great. For the moment it’s a non redundant setup but a redundant one is in the works I’m told. Nice, but think positive, redirected IO (block level) over SMB 3.0 would save our storage IO even today. It would be a very wise and great addition to the capabilities of this building block to add the option & support for Storage Spaces. This would make the Scale Out File Server concept shine with the VRTX.

Why? Well I would give us following benefits in the storage layer and the VHDX format in Hyper-V can take benefits :

Deduplication
Thin Provisioning
Management Delegation
UNMAP
Write Cache
Full benefits of ReFS on storage spaces for data protection
Automatic Data Tiering with commodity SSD (ever cheaper & bigger) and SAS disks perhaps even the Near Line ones (less power & cooling, great capacity)
Potential for JBOD redundancy

Look at that feature set people, in box, delivered by Windows. Sweet! Combine this with 10Gbps networking and DELL has not only a SOFS building block in their port folio, it also offers significant storage features in this package. I for one would like them to do so and not miss out on this opportunity to offer even more capabilities in an attractive price package. Dell could be the very first OEM to grab this new market opportunity by supporting the scale out approach and out maneuver their competitors

Anything Else?

Combine such a building block as described above with their unmatched logistical force for distribution and support this will be a hit a a prime choice for Windows shops. They already have the 10Gbps networking gear & features (DCB) in the PowerConnect 81XX & Force10 S4810 switches. It could be an unbeatable price / capabilities / feature combo that would sell very well.

If we go for SOFS we might need more storage in a single building block with a 4 node cluster. Extensibility might be nice for this. More not just as in capacity but I need to work out the IOPS the available configurations can give us.

TechEd 2013 Revelations for Storage Vendors as the Future of Storage lies With Windows 2012 R2

Posted on June 21, 2013 by workinghardinit

Imagine you’re a storage vendor until a few years ago. Racking in the big money with profit margins unseen by any other hardware in the past decade and living it up in dreams along the Las Vegas Boulevard like there is no tomorrow. To describe your days only a continuous “WEEEEEEEEEEEEEE” will suffice.

Trying to make it through the economic recession with less Ferraris has been tough enough. Then in August 2012 Windows Server 2012 RTMs and introduces Storage Spaces, SMB 3.0 and Hyper-V Replica. You dismiss those as toy solutions while the demos of a few 100.000 to > million IOPS on the cheap with a couple of Windows boxes and some alternative storage configurations pop up left and right. Not even a year later Windows Server 2012 R2 is unveiled and guess what? The picture below is what your future dreams as a storage vendor could start to look like more and more every day while an ice cold voice sends shivers down your spine.

“And I looked, and behold a pale horse: and his name that sat on him was Death, and Hell followed with him.”

OK, the theatrics above got your attention I hope. If Microsoft keeps up this pace traditional OEM storage vendors will need to improve their value offerings. My advice to all OEMs is to embrace SMB3.0 & Storage Spaces. If you’re not going to help and deliver it to your customers, someone else will. Sure it might eat at the profit margins of some of your current offerings. But perhaps those are too expensive for what they need to deliver, but people buy them as there are no alternatives. Or perhaps they just don’t buy anything as the economics are out of whack. Well alternatives have arrived and more than that. This also paves the path for projects that were previously economically unfeasible. So that’s a whole new market to explore. Will the OEM vendors act & do what’s right? I hope so. They have the distribution & support channels already in place. It’s not a treat it’s an opportunity! Change is upon us.

What do we have in front of us today?

Read Cache? We got it, it’s called CSV Cache.
Write cache? We got it, shared SSDs in Storage spaces
Storage Tiering? We got it in Storage Spaces
Extremely great data protection even against bit rot and on the fly repairs of corrupt data without missing a beat. Let me introduce you to ReFS in combination with Storage Spaces now available for clustering & CSVs.
Affordable storage both in capacity and performance … again meet storage spaces.
UNMAP to the storage level. Storage Spaces has this already in Windows Server 2012
Controllers? Are there still SAN vendors not using SAS for storage connectivity between disk bays and controllers?
Host connectivity? RDMA baby. iWarp, RoCE, Infiniband. That PCI 3 slot better move on to 4 if it doesn’t want to melt under the IOPS …
Storage fabric? Hello 10Gbps (and better) at a fraction of the cost of ridiculously expensive Fiber Channel switches and at amazingly better performance.
Easy to provision and manage storage? SMB 3.0 shares.
Scale up & scale out? SMB 3.0 SOFS & the CSV network.
Protection against disk bay failure? Yes Storage Spaces has this & it’s not space inefficient either . Heck some SAN vendors don’t even offer this.
Delegation capabilities of storage administration? Check!
Easy in guest clustering? Yes via SMB3.0 but now also shared VHDX! That’s a biggie people!
Hyper-V Replication = free, cheap, effective and easy
Total VM mobility in the data center so SAN based solutions become less important. We’ve broken out of the storage silo’s …

You can’t seriously mean the “Windoze Server” can replace a custom designed SAN?

Let’s say that it’s true and it isn’t as optimized as a dedicated storage appliance. So what, add another 10 commodity SSD units at the cost of one OEM SSD and make your storage fly. Windows Server 2012 can handle the IOPS, the CPU cycles, memory demands in both capacity and speed together with a network performance that scales beyond what most people needs. I’ve talked about this before in Some Thoughts Buying State Of The Art Storage Solutions Anno 2012. The hardware is a commodity today. What if Windows can and does the software part? That will wake a storage vendor up in the morning!

Whilst not perfect yet, all Microsoft has to do is develop Hyper-V replica further. Together with developing snapshotting & replication capabilities in Storage Spaces this would make for a very cost effective and complete solution for backups & disaster recoveries. Cheaper & cheaper 10Gbps makes this feasible. SAN vendors today have another bonus left, ODX. How long will that last? ASIC you say. Cool gear but at what cost when parallelism & x64 8 core CPUs are the standard and very cheap. My bet is that Microsoft will not stop here but come back to throw some dirt on a part of classic storage world’s coffin in vNext. Listen, I know about the fancy replication mechanisms but in a virtualized data center the mobility of VM over the network is a fact. 10Gbps, 40Gbps, RDMA & Multichannel in SMB 3.0 puts this in our hands. Next to that the application level replication is gaining more and more traction and many apps are providing high availability in a “shared nothing“ fashion (SQL/Exchange with their database availability groups, Hyper-V, R-DFS, …). The need for the storage to provide replication for many scenarios is diminishing. Alternatives are here. Less visible than Microsoft but there are others who know there are better economies to storage http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/ & http://blog.backblaze.com/2011/07/20/petabytes-on-a-budget-v2-0revealing-more-secrets/.

The days when storage vendors offered 85% discounts on hopelessly overpriced storage and still make a killing and a Las Vegas trip are ending. Partners and resellers who just grab 8% of that (and hence benefits from overselling as much a possible) will learn just like with servers and switches they can’t keep milking that cash cow forever. They need to add true and tangible value. I’ve said it before to many VARs have left out the VA for too long now. Hint: the more they state they are not box movers the bigger the risk that they are. True advisors are discussing solutions & designs. We need that money to invest in our dynamic “cloud” like data centers, where the ROI is better. Trust me no one will starve to death because of this, we’ll all still make a living. SANs are not dead. But their role & position is changing. The storage market is in flux right now and I’m very interested in what will happen over the next years.

Am I a consultant trying to sell Windows Server 2012 R2 & System Center? No, I’m a customer. The kind you can’t sell to that easily. It’s my money & livelihood on the line and I demand Windows Server 2012 (R2) solutions to get me the best bang for the buck. Will you deliver them and make money by adding value or do you want to stay in the denial phase? Ladies & Gentleman storage vendors, this is your wake-up call. If you really want to know for whom the bell is tolling, it tolls for thee. There will be a reckoning and either you’ll embrace these new technologies to serve your customers or they’ll get their needs served elsewhere. Banking on customers to be and remain clueless is risky. The interest in Storage Spaces is out there and it’s growing fast. I know several people actively working on solutions & projects.

You like what you see? Sure IOPS are not the end game and a bit of a “simplistic” way to look at storage performance but that goes for all marketing spin from all vendors.

Can anyone ruin this party? Yes Microsoft themselves perhaps, if they focus too much on delivering this technology only to the hosting and cloud providers. If on the other hand they make sure there are feasible, realistic and easy channels to get it into the hands of “on premise” customers all over the globe, it will work. Established OEMs could be that channel but by the looks of it they’re in denial and might cling to the past hoping thing won’t change. That would be a big mistake as embracing this trend will open up new opportunities, not just threaten existing models. The Asia Pacific is just one region that is full of eager businesses with no vested interests in keeping the status quo. Perhaps this is something to consider? And for the record I do buy and use SANs (high-end, mid-market, or simple shared storage). Why? It depends on the needs & the budget. Storage Spaces can help balance those even better.

Is this too risky? No, start small and gain experience with it. It won’t break the bank but might deliver great benefits. And if not .. there are a lot of storage options out there, don’t worry. So go on Winking smile

Working Hard In IT

My view on IT from the trenches

Category Archives: Hyper-V