Hyper-V Technical Preview Live Migration & Changing Static Memory Size

I have played with hot add & remove of static memory in a Hyper-V vNext virtual machine before and I love it. As I’m currently testing (and sometimes abusing) the Technical Preview a bit to see what breaks, I sometimes try silly things. This is one of them.

I took a Technical Preview VM with 45GB of memory, running in a Technical Preview Hyper-V cluster, and live migrated it. I then tried to change the memory size up and down during the live migration to see what happens, or at least to check that nothing goes “BOINK”. Well, not much: we get a notification that we’re being silly. So no failed migrations, no crashed or messed-up VMs or, even worse, hosts.

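If you want to reproduce the test yourself, a minimal sketch with the standard Hyper-V cmdlets could look like the lines below, run on the source node. The VM and host names are placeholders, and whether the refusal shows up as a notification in Hyper-V Manager or as a PowerShell error may differ per build.

```powershell
# Sketch only: "DEMO-VM" and "NODE2" are placeholder names for the VM and the
# destination cluster node. Run on the source node.
$migration = Start-Job { Move-VM -Name "DEMO-VM" -DestinationHost "NODE2" }

# While the live migration is in flight, try to resize the static memory up and down.
try {
    Set-VMMemory -VMName "DEMO-VM" -StartupBytes 50GB -ErrorAction Stop
    Set-VMMemory -VMName "DEMO-VM" -StartupBytes 40GB -ErrorAction Stop
}
catch {
    # On the Technical Preview the change is simply refused; the migration carries on.
    Write-Warning "Memory change refused during live migration: $($_.Exception.Message)"
}

Wait-Job $migration | Receive-Job
```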

It’s early days yet, but we’re getting a head start as there is a lot to test and that will only increase. The aim is to get a good understanding of the features, the capabilities and the behavior, to make sure we can leverage our existing infrastructures and Software Assurance benefits as fast as possible. Rolling cluster upgrades should certainly help us do that faster, with more ease and less risk. What are your plans for vNext? Are you getting a feel for it yet, or waiting for a more recent test version?

An MVP once more in 2015 – happy New Year from a renewed MVP

Happy New Year people! May 2015 bring you happiness, good health, and good jobs/projects/customers with real opportunities for growth & advancement. Don’t forget to step out of the office, away from the consoles once in a while, to enjoy the wonderful experiences and majestic views this world has to offer.


Being a January 1st MVP (my expertise is Hyper-V) means that every year on New Year’s Day I might get an e-mail informing me I have been renewed, or not. Prior to that, our MVP lead contacts us to make sure we have updated our community activities, and they decide whether we’re MVP material, or not. Today I received the e-mail awarding me the MVP award for 2015.


It remains a special feeling to receive the award. It’s recognition for what you’ve done, and it means I can keep enjoying the benefits that come with it: the MVP Global Summit and the interaction with the product groups at Microsoft. The summit is very valuable to me, and if I knew the dates I would book my flights and hotel right now.

Some people think it makes us fanboys, but I can assure you that’s not the case. Microsoft hears the great, the good, the bad and the ugly from us. And yes, they appreciate that, as they cannot and do not want to live in an ivory tower. They need feedback, and we’re part of the feedback loop. We MVPs are a good mix of customers, consultants, partners & businesses working with their technologies and helping the community make the best use of them. Microsoft puts it like this:

“The Microsoft Most Valuable Professional (MVP) Award is our way of saying thank you to exceptional, independent community leaders who share their passion, technical expertise, and real-world knowledge of Microsoft products with others.”

The fact that we are independent is an important factor here. It makes us a valuable pool of hands-on experience to mix in with other feedback channels. As Aidan Finn wrote in his blog post, Feedback Matters Once Again In Microsoft, it does indeed matter. Again? It always did, but they listen more and better now ;-). They don’t need an "echo chamber"; they value opinions, insights and experiences. The MVP award is for the things you’ve done and do. Sure, there is a code of conduct, but that doesn’t mean you cannot voice your concerns. "Independent" means that what we say doesn’t have to be sugar-coated marketing. Our value is in the fact that we help the community (their customers, partners and Microsoft itself) make better use of their solutions, and develop them further, based on our real-world experiences. Microsoft discusses that here.

It opens doors and creates opportunities, and for that I’m grateful as well. For my employers/customers it means that when you hire me, you get access not just to my skills and expertise but to the collective knowledge and experience of a global network of passionate experts with a proven track record of engagement, recognized internationally for it. Not too shabby, is it? ;-)

SMB Direct With RoCE in a Mixed Switches Environment

I’ve been setting up a number of Hyper-V clusters with Mellanox ConnectX-3 Pro dual-port 10Gbps Ethernet cards. These Mellanox cards provide a nice number of queues (128) for DVMQ and also give us RDMA/SMB Direct capabilities for CSV & live migration traffic.
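
A quick sanity check that the cards actually deliver what we want can be done with the in-box cmdlets below; the adapter names are just placeholders for your ConnectX-3 Pro ports.

```powershell
# Sanity check on the ConnectX-3 Pro ports ("RDMA1" and "RDMA2" are placeholder
# adapter names): how many VMQ receive queues do we get, and is RDMA enabled?
Get-NetAdapterVmq -Name "RDMA1", "RDMA2" |
    Format-Table Name, Enabled, NumberOfReceiveQueues -AutoSize

Get-NetAdapterRdma -Name "RDMA1", "RDMA2" |
    Format-Table Name, Enabled -AutoSize
```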

Mixed Switches Environments

Now, RoCE and DCB are a learning curve for all of us and not for the faint of heart. DCB configuration is non-trivial, certainly across multiple hops and different switches. Some say it’s to be avoided or can’t be done.

You can only get away with a single pair of (uniform) switches in smaller deployments. On top of that, I’m seeing more and more different types of switches being used to optimize value, so it’s not just a lab exercise to do this. Combine this with the fact that DCB is an unavoidable technology in networking, unless it gets replaced with something better and easier, and you might as well try and learn. So I did.

Well, right now I’m successfully seeing RoCE traffic going across cluster nodes spread over different racks in different rows at excellent speeds. The core switches are DELL Force10 S4810s and the rack switches are PowerConnect 8132Fs. By borrowing an approach from spine/leaf designs, this setup delivers bandwidth where they need it at a price point they can afford. They don’t need more expensive switches for the rack or the core, as these do support DCB and give the port count needed at the best price point. This isn’t supposed to be the pinnacle of non-blocking network design. Nope, but what’s available & affordable today, in your hands, is better than perfection tomorrow. On top of that, this is a functional learning experience for all involved.

We see some pause frames being sent once in a while and this doesn’t impact speed that much. It does guarantee lossless traffic, which is what we need for RoCE. When we live migrate 300GB worth of memory across the nodes in the different racks we get great results. It varies a bit depending on the load the switches & switch ports are under, but that’s to be expected.
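
Those pause frames come from Priority Flow Control. For reference, a minimal sketch of the host-side DCB settings for SMB Direct looks like this, assuming priority 3 is used for SMB traffic and the adapter names are placeholders; the matching PFC/ETS configuration on every switch in the path (the Force10 and PowerConnect side) is vendor specific and not shown.

```powershell
# Minimal host-side DCB sketch for RoCE/SMB Direct. Assumptions: priority 3 is used
# for SMB traffic and "RDMA1"/"RDMA2" are the RDMA-capable NICs. The matching
# PFC/ETS settings must also exist on every switch in the path.
Install-WindowsFeature Data-Center-Bridging

# Tag SMB Direct traffic (port 445) with priority 3.
New-NetQosPolicy "SMB" -NetDirectPortMatchCondition 445 -PriorityValue8021Action 3

# Only priority 3 gets Priority Flow Control, i.e. the lossless class.
Enable-NetQosFlowControl -Priority 3
Disable-NetQosFlowControl -Priority 0,1,2,4,5,6,7

# Reserve bandwidth for the SMB traffic class with ETS.
New-NetQosTrafficClass "SMB" -Priority 3 -BandwidthPercentage 50 -Algorithm ETS

# Don't let the switches overrule the host settings and apply QoS to the NICs.
Set-NetQosDcbxSetting -Willing $false
Enable-NetAdapterQos -Name "RDMA1", "RDMA2"
```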

Now, tests have shown us that we can live migrate just as fast with non-RDMA 10Gbps as we can with RDMA, leveraging “only” Multichannel. So why even bother? The name of the game is low latency and preserving CPU cycles for SQL Server or storage traffic over SMB3. Why? We could just buy more CPUs/cores. Great, easy & fast, right? But then SQL licensing comes into play and it becomes very expensive. Also, storage scenarios under heavy load are not where you want to drop packets.
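
For completeness, pointing live migration at SMB, so it can leverage SMB Direct and Multichannel, is a one-liner per host (the second line, bumping the number of simultaneous live migrations, is optional):

```powershell
# Let live migration use SMB so it can leverage SMB Direct & Multichannel.
Set-VMHost -VirtualMachineMigrationPerformanceOption SMB
# Optionally allow more simultaneous live migrations per host.
Set-VMHost -MaximumVirtualMachineMigrations 4
```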

Will this matter in your environment? Great question! It depends. Sometimes RDMA is needed/warranted, sometimes it isn’t. But the Mellanox cards are price competitive, so why not test and learn, right? That’s time well spent and prepares you for the future.

But what if it goes wrong? Ah well, if the nodes fail to connect over RDMA you still have Multichannel, and if the DCB stuff turns out not to be what you need or can handle, turn it off and you’ll be good.
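
Checking whether SMB actually picked up the RDMA path, and falling back if needed, can be done like this (adapter names are again placeholders):

```powershell
# Did SMB pick up the RDMA path, or did it fall back to plain Multichannel over TCP?
# Run this during a live migration or a large file copy between the nodes.
Get-SmbClientNetworkInterface |
    Format-Table FriendlyName, RdmaCapable, LinkSpeed -AutoSize
Get-SmbMultichannelConnection |
    Format-Table ServerName, Selected, ClientRdmaCapable -AutoSize

# And if RoCE/DCB turns out to be too much trouble, disabling RDMA on the NICs
# ("RDMA1"/"RDMA2" are placeholders) drops you back to Multichannel over TCP.
Disable-NetAdapterRdma -Name "RDMA1", "RDMA2"
```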

RoCE stuff to test: Routing

Some claim it can’t be done reliably. But hey, they said that for non-uniform switch environments too ;-). So will it all fall apart and will we need to standardize on iWARP in the future? Maybe, but isn’t DCB the technology used for lossless, high-performance environments (FCoE, but also iSCSI), so why would iWARP not need it? Sure, it works without it quite well. So does iSCSI, right, up to a point? I see these comments a lot more from virtualization admins who have a hard time doing DCB (I’m one, so I do sympathize) than from hardcore network engineers.

As I have RoCE cards, and they have become routable with the latest firmware and drivers, I’d love to try and see if I can make RoCE v2 or Routable RoCE work over different types of switches, but unless someone sponsors the hardware I can’t even start doing that. Anyway, lossless is the name of the game, whether it’s iWARP or RoCE. Who knows what we’ll be doing in 5 years? 100Gbps iWARP & iSCSI both covered by DCB vNext, while FC, FCoE, InfiniBand & RoCE have fallen into oblivion? We’ll see.