Shared Nothing Live Migration White Board Time – Scenario I

The Problem

Let’s say you are very happy with your SAN. You just love the snapshots, the thin provisioning, deduplication, automatic storage tiering, replication, ODX and the SMI-S support. Life is good! But you have one annoying issue. For example, to get the really crazy IOPS for your SQL Server 2012 DAG nodes you would have to buy 72 SSDs to add to your tier 1 storage in that SAN. That’s a lot of money if you know the price range of those. But perhaps you don’t even have SSDs in your SAN. To get the required amount of IOPS from your SAN with SAS or NL-SAS disks in the second and third storage tiers respectively, you would need to buy a ridiculous number of disks and, let’s face it, waste that capacity. Per IOPS that becomes a very expensive and unrealistic option.

Some SSD-only SAN vendors will happily sell you a SAN that addresses the high IOPS need to help out with that problem. After all, that is their niche, their unique selling point: fixing the IOPS bottlenecks of the big storage vendors where and when needed. This is a cheaper solution per IOPS than your standard SAN can deliver, but it’s still a lot of money, especially if you need more than a couple of terabytes of storage. Granted, they might give you some extra SAN functionality you are used to, but you might not need that.

Yes, I know there are people who say that when you have such needs you also have the matching budgets. Maybe, but what if you don’t? Or what if you do but you can put 500.000 € towards another need or goal? Your competitive advantage for pricing your products and winning customers might come from that budget ;-)

Creative Thinking or Nuts?

Let’s see if we can come up with a home grown solution based on Windows Server 2012 Hyper-V. If we can, this might solve your business need, save a ton of money and extend (or even save) the usefulness of your SAN in your environment. The latter is possible because you successfully eliminated the biggest disk IO from your SAN.

The Solution Scenario

So let’s build 3 Hyper-V hosts, non-clustered, each with its own local SAS based storage with commodity SSD drives. You can use either storage pools/spaces with a non-RAID SAS HBA or use a RAID SAS HBA with controller based virtual disks for this. If you’ve seen what Microsoft achieved with this during demos you know you can easily get to hundreds of thousands of IOPS. Let’s say you achieve half of what MSFT did in both IOPS and throughput. Let’s just put a number on it => that’s about 500.000 IOPS and 5GB/s. Now reduce that for the overhead of virtualization, the position of the moon and the fact that things turn out a bit less than expected. So let’s settle for 250.000 IOPS and 2.5GB/s. Anybody here who knows what these kinds of numbers would cost you with the big storage vendors’ SANs? Right, case closed. Don’t just look at the cost, put it into context and look at the value here. What does and can your SAN do and at what cost?
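To make the white board arithmetic above explicit, here is a minimal sketch in Python. The starting figures come straight from the text; the single 50% derating factor is just my reading of "overhead of virtualization, the position of the moon", and the helper name `derate` is made up for illustration.

```python
# Back-of-the-envelope sizing for one standalone host, as described above.
# Starting figures are the ones quoted in the text; the 0.5 factor is an
# assumed catch-all for virtualization overhead and general disappointment.

def derate(value: float, factor: float) -> float:
    """Scale a performance figure by a derating factor between 0 and 1."""
    if not 0 <= factor <= 1:
        raise ValueError("factor must be between 0 and 1")
    return value * factor


half_of_demo_iops = 500_000        # "about 500.000 IOPS"
half_of_demo_gb_per_s = 5.0        # "5GB/s"
safety_factor = 0.5                # assumption: cut everything in half again

planned_iops = derate(half_of_demo_iops, safety_factor)
planned_gb_per_s = derate(half_of_demo_gb_per_s, safety_factor)

print(f"Plan for roughly {planned_iops:,.0f} IOPS and {planned_gb_per_s} GB/s per host")
```

Plug in your own measured numbers and your own factor; the point is only that even a pessimistic estimate stays far above what most tiered SANs deliver per euro.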

OK, we lose some performance due to the virtualization overhead. But let’s face it. We can use SR-IOV to get the very best network performance. We have hundreds of thousands of IOPS. All the cores on the hosts are dedicated to a single virtual machine running a SQL Server DAG node and, bar 4 GB of RAM for the OS, we can give all the RAM in the hosts to the VM. This one VM to one host mapping delivers a tremendous amount of CPU, memory, network and storage capabilities to your SQL Server. This is because it gets exclusive use of the resources on the host, bar those that the host requires to function properly.

In this scenario it is the DAG that provides high availability to the SQL Server database. So we do not mind losing shared storage here.

Because we have virtualized the SQL Server you can leverage Shared Nothing Live Migration to move the virtual machines with SQL Server to the central storage of the SAN without downtime if the horsepower is no longer needed. That means that you might migrate another application to those standalone Hyper-V hosts. That could be a disk IO intensive application that is perhaps load balanced in some way, so you can have multiple virtual machines mapped to the hosts (1 to 1, many to one). You could even automate all of this and use the “Beast” as a dynamic resource based on temporal demands.

In the case of the SQL Server DAG you might opt to keep one DAG member on the SAN so it can be replicated and backed up via snapshot or whatever technology you are leveraging on that storage.

Extend to Other Use Cases

More scenarios are possible. You could build such a beast to be a Scale Out File Server, or use PCI RAID/Shared SAS, if you need shared storage to build a Hyper-V cluster when your apps require it for high availability.

The latter looks a lot like a cluster in a box actually. I don’t think we’ll see a lot of iSCSI in cluster in a box scenarios. SAS might be the big winner here for up to 4 nodes (without a “SAS switch”, which brings even “bigger” scenarios to life with zoning, high availability, active cables and up to 24Gbps of bandwidth per port).

Using a SOFS means that if you also use SMB 3.0 support with your central SAN you can leverage RDMA for shared nothing live migration, which could help out with potentially very large VHDs of your virtual SQL Servers.

Please note that the big game changer here compared to previous versions of Windows is Shared Nothing Live Migration. This means that you now have virtual machine mobility. High performance storage and the right connectivity (10Gbps, Teaming, possibly RDMA if using SMB 3.0 as source and target storage) mean we no longer mind having separate storage silos that much. This opens up new possibilities for alleviating IOPS issues. Just size this example to your scenarios & needs to think about what it can do for you.

Disclaimer: This is white board thinking & design, not a formal solution. But I cannot ignore the potential and possibilities. And for the critics: no, this doesn’t mean that I’m saying modern SANs don’t have a place anymore. Far from it, they solve a lot of needs in a lot of scenarios, but they do have some (very expensive) pain points.

Hyper-V Shared Nothing Live Migration In Windows Server 2012 – VM Mobility Rules

I see and hear some people shrug at the idea of Shared Nothing Live Migration, dismissing it as marginally useful. Some do state they’ll have it as well but that it’s not that valuable. Well I disagree totally. A lot of the time these remarks are due to a lack of understanding about how several technologies in the Microsoft stack work together. Combine this with tunnel vision and the fear of some vendors and you get a lot of FUD.

I advise you to look beyond the virtualization stack, to the issues that people who are building infrastructure for dynamic, flexible and * cloud data centers are dealing with.

Look, as “architects” we have to design & build for failure. We all know that it’s just a matter of time before things go BOINK. So we build in redundancy, some of it within a silo, some of it between silos. The two approaches complement each other. What this gives you is options, and everybody who knows me, especially those who work with me, has heard my mantras: “Assumptions are the mother of all F* Ups” and “Options, options, options”. Make sure you design & build in options. This way you can maneuver yourself out of a bad situation. Don’t ever assume you’re out of options, especially not when you put some in the design on purpose ;-). It’s also very useful beyond that, because a lot of you might agree with me that silos and forklift, downtime-inducing upgrades, migrations, transitions or replacements are expensive and bad. This is where Shared Nothing Live Migration comes into play. You gain mobility over silos. That silo might be a server, a cluster, storage or a mixture of them all.

With Shared Nothing Live Migration we can migrate virtual machines between those silos with nothing more than a network cable. This is huge, people. You are no longer trapped in that silo. In this context it provides you with all the options & flexibility mobility gives you, even if the technology itself is not about high availability.

Some very useful scenarios

  1. Migrate virtual machines from an old cluster to a new cluster without any downtime
  2. Migrate virtual machines from standalone Hyper-V hosts to a failover cluster without any downtime
  3. Migrate virtual machines from one standalone host to another one for maintenance, again without any downtime
  4. Choose different types of storage & Hyper-V deployment depending on IOPS, redundancy, availability and manageability needs. With Shared Nothing Live Migration you can be confident that you can move your virtual machine from one environment to the other when needs change. This is breaking the storage silo boundaries open, people! This is huge … think about it.

How it works

The details are for another post, but basically it is made possible by the combination of Live Storage Migration and Live Migration.

First the Storage is Live Migrated

After the Live Storage Migration is done the state of the virtual machines is copied and synchronized.
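The ordering of those two phases can be sketched as a toy model in Python. To be clear, this is not how Hyper-V implements it and it calls no Hyper-V API; the class, the function and all the host and storage names are made up, and the only thing it shows is that the storage moves first and the running state follows.

```python
from dataclasses import dataclass, field


@dataclass
class VirtualMachine:
    """Toy stand-in for a VM (hypothetical, not a Hyper-V object)."""
    name: str
    host: str
    storage: str
    steps: list[str] = field(default_factory=list)


def shared_nothing_live_migrate(vm: VirtualMachine, target_host: str,
                                target_storage: str) -> VirtualMachine:
    """Record the two phases in the order described above."""
    # Phase 1: Live Storage Migration, the virtual disks move while the VM runs.
    vm.steps.append(f"storage: {vm.storage} -> {target_storage}")
    vm.storage = target_storage
    # Phase 2: Live Migration, the VM state is copied and synchronized.
    vm.steps.append(f"state: {vm.host} -> {target_host}")
    vm.host = target_host
    return vm


# Hypothetical names for a move from the standalone "beast" back to the SAN.
vm = VirtualMachine(name="sql-node-1", host="beast-01", storage="local-ssd")
shared_nothing_live_migrate(vm, target_host="cluster-node-1",
                            target_storage="san-volume")
for step in vm.steps:
    print(step)
```

The only shared component between source and target in this model is the network between them, which is the whole point of the feature.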

This Is Mobility

I hear the competition shrug. It isn’t high availability. Well indeed, no one who understands the feature ever said it was. It’s virtual machine mobility. Look at the scenarios above and you’ll see that this ability could very well be a game changer in how we look at storage & design solutions.

Speed & Performance

What did we hear on this front: “it will be too slow to be really useful”. Really? Well let’s see:

  1. The world is converging to 10Gbps, and after that 40Gbps and up will come.
  2. In-box NIC Teaming with Windows Server 2012, which can provide more bandwidth.
  3. SMB 3.0 Multichannel. This provides multiple channels per connection, spreading the load over multiple CPUs.
  4. SMB Direct. Have you seen the speeds this achieves?

Before you state that this doesn’t work on Live Migration … as confirmed at TechEd 2012 Europe with Jose Barreto, this does work when both the source AND the target are SMB 3.0 shares. This means yet another reason to use SMB 3.0 shares for your Hyper-V storage needs! So unlike what Tad at vLimited keeps saying, unhindered by any knowledge, it is a very valuable feature and it can be extremely fast given the right connectivity and storage that can handle the IOPS. And no, the fact that it’s unbuffered doesn’t impact this too much. Test this by using xcopy/robocopy /J with a VHD over your infrastructure.
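To put a rough number on "too slow", here is a minimal Python sketch that estimates how long it takes to push a VHD over links of different speeds. The 500 GB VHD size and the 80% link efficiency are assumptions picked for illustration, not measurements, and the estimate ignores storage limits, teaming, Multichannel and RDMA effects.

```python
def transfer_seconds(size_gb: float, link_gbps: float,
                     efficiency: float = 0.8) -> float:
    """Estimate the copy time for size_gb gigabytes over a link_gbps link.

    efficiency is an assumed fraction of the line rate you actually get.
    """
    if size_gb < 0 or link_gbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, link > 0, efficiency in (0, 1]")
    size_gigabits = size_gb * 8          # bytes to bits
    return size_gigabits / (link_gbps * efficiency)


vhd_size_gb = 500                        # hypothetical VHD size
for link_gbps in (1, 10, 40):
    minutes = transfer_seconds(vhd_size_gb, link_gbps) / 60
    print(f"{link_gbps:>2} Gbps: about {minutes:.1f} minutes for {vhd_size_gb} GB")
```

Swap in your own VHD sizes and what your network really delivers; the xcopy/robocopy /J test mentioned above is the way to get that real figure.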

Even if you’re on a budget and cannot go for the RDMA NICs & SMB 3.0, you have several options to get very decent virtual machine mobility and not be stuck in a silo. And for those who want to leverage this feature to create an agile & mobile virtual environment, you have some very nice technologies available to optimize to your needs & budgets.

Conclusion

Virtual Machine mobility and storage mobility are very interesting features that provide for a previously unknown flexibility. Windows Server 2012 makes us rethink our storage approaches (I sure am) and I’m very interested in seeing how this will evolve.

Very Educational Microsoft TechEd 2012

Hello from TechEd 2012 Europe at the RAI in Amsterdam. I’ve been extremely busy attending sessions, talking to Microsoft employees and vendor engineers. We’ve had some very interesting discussions and I learned a lot and clarified even more. TechEd has once more proven to be an excellent investment of time and I have been able to get a lot of face time with the right people. To me this is important because that helps me tremendously when designing solutions. Sorry for the low quality pictures.

Bob Combs on stage at TechEd Europe 2012 educating us on NIC Teaming

You know my mantra “options, options, options”, as this is what gets you a way out of a pinch. However, a lot of options also means you need to make decisions, and not just when dealing with issues but also at design time. Knowledge and understanding are what help you make the correct or the best decision fast. Attending this conference with its tremendous networking opportunities provides a very nice and effective setting for passionate discussions and deep dives into scenarios. Challenging vendors, interacting with peers, throwing ideas out there and deep diving into the possibilities and drawbacks with each other is great and helps a lot to understand technologies better. You have to throw what you have learned out there and discuss it to test your understanding of the subject. Don’t be afraid to do so. We all don’t know things, get stuff wrong, etc. Don’t let fear stop you from interacting with your peers.

Ben Armstrong in action on Live Storage Migration

It is also great to meet up with my community buddies from all over the world again and I feel privileged to have the opportunity to attend these conferences. For me personally these are priceless and the value to my employers/clients is considerable. There is a tsunami of new technology in the Windows Server 2012 stack and learning to put these into context is both fun and useful. These are very interesting times in the Microsoft Infrastructure ecosystem so life is good!

Answer to Brad at TechEd Europe 2012 Keynote: Pessimists & Tad Don’t like Windows Server 2012

Brad is on stage for the opening keynote asking if the glass is half full or half empty. Well it depends on where you are in the ecosystem. For us the glass is half full and filling up fast.

Some people nag me about the fact that Windows Server 2012 is so different and that it’s wrong to turn the world upside down. Yes, it is different and new in many ways.  There are also many improvements to features that already exist. There is a lot to learn and understand. Why are some people so pessimistic?

Ever since I got my hands on the BUILD Developer Preview bits I have personally invested a lot of my time in Windows Server 2012. With the beta that only increased. Why? Well, that’s the way forward, because that’s where the improvements are. We can’t do tomorrow’s jobs and meet tomorrow’s demands with yesterday’s technology.

pessimistsbanner

The picture above is basically the pessimist’s view of the world. Enjoy your cuppa but I’m not joining you. Windows Server 2012 rocks and it’s going to do a whole lot for our industry and businesses. But wait a minute, I do understand why Tad is so pessimistic. But that’s about the future of vLimited and being stuck in the past. Listen Tad, you’d better empty that cup because this is where vLimited becomes history rather than writes it.

Does that mean I’ll be throwing away Windows 2008 R2? Nope. I expect to deal a lot with it in the next few years but I’m not going to build future infrastructure on the previous version. I will introduce Windows Server 2012 where and when we benefit from it. For me that is from the day the bits RTM. The benefits are so overwhelming we’d hurt ourselves by not doing it. Your mileage may vary. But don’t get stuck in the past. Here’s a link to your escape pod: Microsoft Virtual Machine Converter Solution Accelerator. I’m happy it’s here. That’s what people are asking me more and more about: how to move to Hyper-V.

But what’s with the negativism of some? Sure, people are still running Windows Server 2000/2003. Sometimes for good reasons, often for (very) bad ones. Are some going to go through all this again with people clinging to Windows 2008 R2? No doubt. Been there, seen it. Very predictable. Is Windows Server 2012 going to fail? No way. And what I’m seeing in Windows Server 2012 is great technology. Will it be perfect? No. I already have feature requests for vNext :-). But this is pushing the ball forward, this is ambitious in the best sense of that word. There will be bugs, there will be challenges and hiccups. That’s part of the business and the realities of life. But look at all that’s available in there. Don’t just read some industry press articles. Did you test it yourself already? Did you do any clustering? Tested all the new functionality in Hyper-V? The innovations in Live Migration options and networking? Looked at the amount of PowerShell support in there? Noticed the improvements in Active Directory, DHCP and other core infrastructure services? Have you used Windows Server 2012 at all yet? You didn’t look at SMB 3.0 and all the storage improvements in there, did you? Go talk to Jeff Woolsey, he’s passionate about it and for good reasons. Put in some effort, live a little, get out of your comfort zone and you’ll be going places. Don’t be a pessimist. Think positive or you’ll end up like Tad, who was the joke of the party at MMS 2012.
