DVMQ In Windows 8 Hyper-V

VMQ or VMDq

To discuss Dynamic VMQ (DVMQ) we first need to talk about VMQ, or VMDq in Intel speak. VMQ lets the physical NIC create unique virtual network queues for each virtual machine (VM) on the host. These are used to pass network packets directly from the hypervisor to the VM. This reduces a lot of the CPU overhead on the host associated with network traffic, as it spreads the load over multiple cores. It’s the same sort of CPU overhead you might see with 10Gbps networking on a server that isn’t using RSS (see my previous blog post Know What Receive Side Scaling (RSS) Is For Better Decisions With Windows 8). Under high network traffic one core will hit 100% while the others are idle. This means you’ll never get more than 3Gbps to 4Gbps of bandwidth out of your 10Gbps card, as the CPU is the bottleneck.

VMQ leverages the NIC hardware for sorting, routing & packet filtering of the network packets from an external virtual machine network directly to virtual machines, and enables you to use 8Gbps or more of your 10Gbps bandwidth.

Now the number of queues isn’t unlimited and they are allocated to virtual machines on a first-come, first-served basis. So don’t enable this for machines without heavy network traffic; you’ll just waste queues. It is advised to use it only on those virtual machines with heavy inbound traffic because VMQ is most effective at improving receive-side performance. So use your queues where they make a difference.
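To make the first-come, first-served point concrete, here is a minimal Python sketch. It is a toy model, not the Hyper-V or NIC driver logic: the function name `allocate_queues` and the use of `None` to mean "no dedicated queue" are assumptions made purely for illustration.

```python
def allocate_queues(vms, queue_count):
    """Hand out a fixed pool of queues in request order (toy model).

    vms: VM names in the order they ask for a queue.
    queue_count: number of hardware queues the NIC offers (hypothetical value).
    Returns a dict mapping each VM to a queue index, or None when the pool is empty.
    """
    assigned = {}
    next_queue = 0
    for vm in vms:
        if next_queue < queue_count:
            assigned[vm] = next_queue   # this VM gets its own queue
            next_queue += 1
        else:
            assigned[vm] = None         # pool exhausted, no dedicated queue
    return assigned


if __name__ == "__main__":
    # Two idle VMs grab the only two queues before the busy VM asks for one.
    print(allocate_queues(["idle-1", "idle-2", "busy-sql"], queue_count=2))
```

In this model the busy VM ends up without a queue simply because it asked last, which is why it pays to enable VMQ only where the inbound traffic justifies it.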

If you want to see what VMQ is all about take a look at this video by Intel.

Intel VMDq Explanation

 

The video gives you a nice animated explanation of the benefits. You can think of it as providing the same benefits to the host as RSS does. VMQ also prevents one core being overloaded with interrupts due to high network IO and as such becoming the bottleneck blocking performance. This is important as you might otherwise end up buying more servers to handle certain loads. Sure, with 1Gbps networking modern CPUs can handle a very big load, but with 10Gbps becoming ever more common this issue is a lot more acute again than it used to be. That’s why you see RSS being enabled by default in Windows 2008 R2.

VMQ Coalescing – The Good & The Bad

There is a little dark side to VMQ. You’ve successfully relieved the bottleneck on the host for network filtering and sorting, but you now have a potential bottleneck where you need a CPU interrupt for every queue. The documentation states as follows:

The network adapter delivers interrupts to the Management Operating system for each VMQ on the processor based processor VMQ affinity. If the interrupts are spread across many processors, the number of interrupts delivered can grow substantially, until the overhead of interrupt handling can outweigh the benefit of using VMQ. To reduce the number of interrupts used, Microsoft has encouraged network adapter manufacturers to design for interrupt coalescing, also called shared interrupts. Using shared interrupts, the network adapter processes multiple queues with the same processor affinity in the same interrupt. This reduces the overall number of interrupts. At the time of this publication, all network adapters that support VMQ also support interrupt coalescing.
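The effect of shared interrupts described in that quote can be sketched in a few lines of Python. This is only an illustration of the counting argument, under the assumption that every queue with pending packets raises one interrupt unless coalescing groups queues by processor affinity; the function and parameter names are made up for this example.

```python
def interrupts_needed(queue_affinity, pending_queues, coalesce):
    """Count interrupts for one batch of queues with pending packets (toy model).

    queue_affinity: dict mapping queue id -> processor it is affinitized to.
    pending_queues: queue ids that currently have packets waiting.
    coalesce: when True, queues sharing a processor share a single interrupt.
    """
    if not coalesce:
        # One interrupt per queue that has work to do.
        return len(pending_queues)
    # One interrupt per distinct processor that has at least one pending queue.
    return len({queue_affinity[q] for q in pending_queues})


if __name__ == "__main__":
    affinity = {0: 0, 1: 0, 2: 1, 3: 1}  # four queues spread over two processors
    pending = [0, 1, 2, 3]
    print(interrupts_needed(affinity, pending, coalesce=False))
    print(interrupts_needed(affinity, pending, coalesce=True))
```

With four busy queues on two processors the model needs four interrupts without coalescing and two with it.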

Now coalescing works, but the configuration and the possible headaches it can give you are material for another blog post. It can be very tedious and you have to watch every action on your NIC and Virtual Switch configuration like a hawk or you’ll get registry values overwritten, value types changed and such. This leads to all kinds of issues, ranging from VMQ coalescing not working to your cluster going down the drain (worst case). The management of VMQ coalescing is laborious and as such error prone. This is not good. Combine that with the sometimes problematic NIC teaming and you get a lot of possible and confusing permutations where things can go wrong. Use it when you can handle the complexity or it might bite you.

Dynamic VMQ For A Better Future

Now bring on Dynamic VMQ (DVMQ). All I know about this is from the Build sessions and I’ll revisit this once I get to test it for real with the beta or release candidate. I really hope this is better documented and doesn’t come associated with the issues we’ve had with VMQ Coalescing. It brings the promise of easy and trouble-free VMQ where the load is evenly balanced among the cores and avoids the burden of too many interrupts. A sort of auto scaling, if you like, that optimizes queue handling & interrupts.

That means it can replace VMQ Coalescing and DVMQ will deal with this potential bottleneck on its own. Due to the issues I’ve had with coalescing I’m looking forward to that one. Take note that you should be able to live migrate virtual machines from a host with VMQ capabilities to a host that has none. You would lose the performance benefit, of course. I have no confirmation of this yet and, as mentioned, I’m waiting for the Beta bits to try it out. It’s like Live Migration between an SR-IOV enabled host and a non SR-IOV enabled host, which is confirmed as possible. On that front Microsoft seems to be doing a really good job, better than the competition.

Know What Receive Side Scaling (RSS) Is For Better Decisions With Windows 8

Introduction

As I mentioned in an introduction post, Thinking About Windows 8 Server & Hyper-V 3.0 Network Performance, there will be a lot of options and design decisions to be made in the networking area, especially with Hyper-V 3.0. When we’ll be discussing DVMQ (see DVMQ In Windows 8 Hyper-V), SR-IOV in Windows 8 (or VMQ/VMDq in Windows 2008 R2) and other network features with their benefits, drawbacks and requirements, it helps to know what Receive Side Scaling (RSS) is. Chances are you know it better than the other mentioned optimizations. After all it’s been around longer than VMQ or SR-IOV and it’s beneficial to workloads other than virtualization. So even if you’re a die hard “hardware only for my servers” kind of person you can already be familiar with it. Perhaps you even “dislike” it because when the Scalable Networking Pack came out for Windows 2003 it wasn’t such a trouble free & happy experience. This was due to incompatibilities with a lot of the NIC drivers and it wasn’t fixed very fast. This means the internet is loaded with posts on how to disable RSS & the offload settings on which it depends. This was done to get stability or performance back for application servers like Exchange and other applications or services.

The Case for RSS

But since Windows 2008 those days are over. RSS is a great technology that gets you a lot better usage out of your network bandwidth and your server. Not using RSS means that you’ll buy extra servers to handle the same workload. That wastes both energy and money. So how does RSS achieve this? Well, without RSS all the interrupts from a NIC go to the same CPU/core in multicore processors (Core 0). In Task Manager that looks not unlike the picture below:

image

Now for a while the increase in CPU power kept the negative effects at bay for a lot of us in the 1Gbps era. But now, with 10Gbps becoming more common every day, that’s no longer the case. That core will become the bottleneck as that poor logical CPU will be running at 100%, processing as many network interrupts as it can handle, while the other logical CPUs only have to deal with the other workloads. You might never see more than 3.5Gbps of bandwidth being used if you don’t use RSS. The CPU core just can’t keep up. When you use RSS the processing of those interrupts is distributed across all cores.
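The difference between "everything lands on Core 0" and "spread across all cores" can be illustrated with a small Python sketch. It is a simplification under stated assumptions: real NICs use their own hash function and lookup table, and CRC32 is used here only as a stand-in. The function names and the sample flows are made up for this example.

```python
import zlib
from collections import Counter


def pick_core(flow, core_count, rss_enabled):
    """Choose the core that processes a flow's receive traffic (toy model).

    flow: tuple of (source IP, source port, destination IP, destination port).
    Without RSS everything is handled by core 0. With RSS a hash of the flow
    selects the core, so packets of one flow always stay on the same core.
    """
    if not rss_enabled:
        return 0
    key = "|".join(str(part) for part in flow).encode()
    return zlib.crc32(key) % core_count  # CRC32 is a stand-in hash, an assumption


def core_load(flows, core_count, rss_enabled):
    """Count how many flows end up on each core."""
    return Counter(pick_core(f, core_count, rss_enabled) for f in flows)


if __name__ == "__main__":
    flows = [("10.0.0.1", 1000 + i, "10.0.0.2", 443) for i in range(256)]
    print("RSS off:", dict(core_load(flows, 4, rss_enabled=False)))
    print("RSS on: ", dict(core_load(flows, 4, rss_enabled=True)))
```

Hashing on the flow rather than handing out packets round robin keeps each connection on one core, which avoids reordering packets within a connection.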

With Windows 2008, Windows 2008 R2 and Windows 8, RSS is enabled by default in the operating system. Your NIC needs to support it, and in that case you’ll be able to disable or enable it. Often you’ll get some advanced features (illustrated below) with the better NICs on the market. You’ll be able to set the base processor, the number of processors to use, the number of queues etc. That way you can divide the cores up amongst multiple NICs and/or tie NICs to specific cores.

image

image

So you can get fancy and tweak the settings if needed for multi NIC systems. You can experiment to find the best setting for your needs, follow the vendor’s defaults (Intel for example has different workload profiles for their NICs) or read up on what particular applications require for best performance.
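As a rough illustration of dividing cores amongst NICs with a base processor and a processor count, here is a small Python sketch. The parameter names are hypothetical and do not correspond to any specific driver setting; actual option names and behaviour differ per vendor.

```python
def rss_core_range(base_processor, max_processors, total_cores):
    """Return the cores a NIC may use for RSS (toy model).

    base_processor: first core the NIC is allowed to use.
    max_processors: how many cores it may use from there on.
    total_cores: number of cores in the server; the range is clipped to it.
    """
    last = min(base_processor + max_processors, total_cores)
    return list(range(base_processor, last))


if __name__ == "__main__":
    # Two NICs in an 8 core server, each tied to its own half of the cores.
    print("NIC 1:", rss_core_range(base_processor=0, max_processors=4, total_cores=8))
    print("NIC 2:", rss_core_range(base_processor=4, max_processors=4, total_cores=8))
```

Giving each NIC its own non-overlapping range is one way to keep two busy adapters from competing for the same cores.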

Information On How To Make It Work

For more information on tweaking RSS take a look at the following document: http://msdn.microsoft.com/en-us/windows/hardware/gg463392. It holds a lot more information than just RSS in various scenarios, so it’s a useful document for more than just this.

Another good guide is the "Networking Deployment Guide: Deploying High-Speed Networking Features". Those docs are about Windows 2008 R2 but they do have good information on RSS.

If you notice that RSS is correctly configured but it doesn’t seem to work for you, it might be time to check up on the other adapter offloads like TCP Checksum Offload, Large Send Offload etc. These also get turned off a lot when troubleshooting performance or reliability issues, but RSS depends on them to work. If they are turned off, this could be the reason RSS is not working for you.

Some Thoughts Buying State Of The Art Storage Solutions Anno 2012

Introduction

I’ve been looking into storage intensively for some time. At first it was reconnaissance. You know, just looking at what exists in software & hardware solutions. In that phase we looked purely at functionality, as we found our current, more traditional SANs a dead end.

After that there was the evaluation of reliability, performance and support. We went hunting for both satisfied and unsatisfied customers, experiences etc. We also considered whether a pure software SAN on commodity hardware would do for us or whether we still need specialized hardware or at least the combination of specialized software on vendor certified and supported commodity hardware. Yes, even if you have been doing things a certain way for a long time and been successful with it, it pays to step back and evaluate if there are better ways of doing it. This prevents tunnel vision and creates awareness of what’s out there that you might have missed.

Then came the job of throwing out vendors who we thought couldn’t deliver what was needed and/or who have solutions that are great but just too expensive. After that came the ones whose culture or reputation was not suited for or compatible with our needs & culture. So that big list became (sort of) a long list, which eventually became a really short list.

There is a lot of reading, thinking, listening and discussing done during these phases, but I’m having fun as I like looking at this gear and dreaming of what we could do with it. But there are some things in the storage world that I found really annoying and odd.

Scaling Up or Scaling Out with High Availability On My Mind

All vendors, even the better ones in our humble opinion, have their strong and weak points. Otherwise they would not all exist. You’ll need to figure out which ones are a good or the best fit for your needs. So when a vendor writes or tells me that his product X is way above the others, and that the others’ product Z only competes with the lower end Y in his portfolio, I cringe. Storage is not that simple. On the other hand they sometimes overcomplicate straightforward functionality or operational needs if they don’t have a great solution for it. Some people in storage really have gotten trivializing the important and complicating the obvious down to an art. No brownie points for them!

One thing is for sure, when working on scalability AND high availability things become rather expensive. It’s a bit like the server world. Scale up versus scale out. Scaling up alone will not do for high availability except at very high costs. Then you have the scalability issue. There is only so much you can get out of one system and the last 20% becomes very expensive.

So, I admit, I’m scale out inclined. For one, you can fail over to multiple less expensive systems and if you have an “N+1” scalability model you can cope with the load even when losing a node. On top of that you can and will use this functionality in your normal operations. That means you know how it works and that it will work during a crisis. Work and train in the same manner as you will when the shit hits the fan. It’s the only way you’ll really be able to cope with a crisis. Remember, chances are you won’t excel in a crisis but will fall back to your lowest mastered skill set.
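The “N+1” idea boils down to a simple capacity check, sketched below in Python. The numbers and the function name are made up for illustration, and the model assumes identical nodes and a load that redistributes evenly across the survivors.

```python
def survives_node_loss(node_capacity, node_count, peak_load):
    """Check whether the remaining nodes can carry the peak load after losing one.

    node_capacity: load a single node can handle (any unit, e.g. IOPS).
    node_count: number of identical nodes in the scale out system.
    peak_load: the load the system must sustain, in the same unit.
    """
    return node_capacity * (node_count - 1) >= peak_load


if __name__ == "__main__":
    # Four nodes of 100 units each: 300 units still fit on the three survivors.
    print(survives_node_loss(node_capacity=100, node_count=4, peak_load=300))
    # 301 units no longer fit, so this setup is not N+1 for that load.
    print(survives_node_loss(node_capacity=100, node_count=4, peak_load=301))
```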

Oh by the way, if you do happen to operate a nuclear power plant or such please feel free to work both fronts for both scalability & reliability and then add some extra layers. Thanks!

Expensive Scale Up Solutions On Yesterday’s Hardware?

I cannot understand what has kept the storage boys back so long when it comes to exploiting modern processing power. Until recently they all still lived in the 32 bit world, running on hardware I wouldn’t give to the office temp. Now I’d be fine with that if the prices reflected that. But that isn’t the case.

Why did (does) it take ’em so long to move to x64? That’s been our standard server build since Windows 2003 for crying out loud and our clients have been x64 since the Vista rollout in 2007. It’s 2012, people. Yes, that’s the second decade of the 21st century.

What is holding the vendors back from using more cores? Realistically, if you look at what’s available today, it is painful to see that vendors are touting dual quad core controllers (finally, and with their software running x64) as their most recent achievement. Really, dual quad core, anno 2012? Should I be impressed?

What’s this magic limit of 2 controllers with so many vendors? Did they hard code a 2 in the controller software and lose the source code of that module?

On the other hand what’s the obsession with 4 or more controllers? We’re not all giant cloud providers and please note my ideas on scale out versus scale up earlier.

Why are some spending time and money on ASIC development for controllers? You can have a commodity motherboard with four sockets and 8, 10, 12 cores. Just buy them AND use them. Even the ones using commodity hardware (which is the way to go long term due to the fast pace and costs) don’t show that much love for lots of cores. It seems cheap and easy when you need a processor upgrade or motherboard upgrade. It’s not some small or miniature device where standard form factors won’t work. What is wrong in your controller software that you all seem to be so slow in going that route? You all talk about how advanced, high tech and future tech driven the storage industry is. Well, prove it. Use the 16 to 32 cores you can easily have today. Why? Because you can use the processing power and also because I promise you all one thing: that state of the art, newly released SAN of today is the old, obsolete junk we’ll think about replacing in 4 years’ time, so we might not be inclined to spend a fortune on it. Especially not when I have to do a forklift upgrade. Been there, done that and rather not do it again. Which brings us to the next point.

Flexibility, Modularity & Support

If you want to be thrown out of the building you just need to show even the slightest form of forklift upgrade for large or complex SAN environments. Don’t even think about selling me very expensive, highly scalable SANs with overrated and bureaucratic support. You know the kind, where the responsiveness in a crisis is 1/10 of what you get when an ordinary disk fails.

Flexibility & Modularity

Large and complex storage that costs a fortune and needs to be ripped out completely, and/or where upgrades over its lifetime are impossible or cost me an arm and a leg, is a no go. I need to be able to enhance the solution where it is needed and I must be able to do so without spending vast amounts of money on a system I’ll need to rip out within 18 months. It has to be more like a perpetual, modular upgrade model where over the years you can enhance and keep using what is still usable.

If that’s not possible and I don’t have too large or complex storage needs, I’d rather buy a cheap but functional SAN. Sure it doesn’t scale as well but at least I can throw it out for a newer one after 3 to 4 years. That means I can replace it long before I hit the scalability bottleneck, because it wasn’t that expensive. Or if I do hit that limit I’ll just buy another cheap one and add it to the mix to distribute the load. Sure that takes some effort but in the end I’m better and cheaper off than with expensive, complex, highly scalable solutions.

Support

To be brutally honest, some vendors read their own sales brochures too much and drank the Kool-Aid. They think their support processes are second to none and the best in the business. If they really believe that, they need to get out into the field and open up their eyes. If they just act like they mean it, they’ll soon find out when the money starts talking. It won’t talk to you.

Really, some of you have support processes that are only excellent and easy in your dreams. I’ll paraphrase a recent remark on this subject about a big vendor: “If vendor X’s support quality and level of responsiveness were only 10% of the quality of their hardware, buying them would be a no brainer”. Indeed, and as it stands that fact is a risk factor or even a show stopper.

Look, all systems will fail sooner or later. They will. End of story. Sure you might be lucky and never have an issue but that’s just that. We need to design and build for failure. A contract with promises is great for the lawyers. Those things combined with the law are their weapons on their battlefield. An SLA is great for managers & the business. These are the tools they need for due diligence and to check it off on the list of things to do. It’s CYA to a degree but that is a real aspect of doing business and being a manager. Fair enough. But for us, the guys and gals of ICT who are the boots on the ground, we need rock solid, easily accessible and fast support. Stuff fails, we design for that, we build for that. We don’t buy promises. We buy results. We don’t want bureaucratic support processes. I’ve seen some where the overhead is worse than the defect and the only aim is to close calls as fast as they can. We want a hot line and an activation code to bring down the best support we can as fast as we can when we need it. That’s what we are willing to pay real good money for. We don’t like a company that sends out evaluation forms after we replaced a failed disk with a replacement to get a good score. Not when that company fails to appropriately interpret a failure that brings the business down and ignores signals from the customer that things are not right. Customers don’t forget that, trust me on this one.

And before you think I’m arrogant: I fail as well. I make mistakes, I get sick, etc. That’s why we have colleagues and partners. Perfection is not of this world. So how do I cope with this? The same way as when we are designing an IT solution. Acknowledge that fact and work around it. Failure is not an option, people, it’s pretty much a certainty. That’s why we make backups of data and why we have backups for people. Shit happens.

The Goon Squad Versus Brothers In Arms

As a customer I never ever want to have to worry about where your interests are. So we pick our partners with care. Don’t be the guy that acts like a gangster in the racketeering business. You know they all dress pseudo upscale to hide the fact they’re crooks. We’re friends, we’re partners. Yeah sure, we’ll do grand things together but I need to lay down the money for their preferred solution that seems to be the same whatever the situation and environment.

Some sales guys can be really nice guys. Exactly how nice tends to depend on the size of your pockets. More specifically, the depth of your pockets and how well they are lined with gold coin is important here. One tip: don’t be like that. Look, we’re all in business or employed to make money, fair enough, really. But if you want me to be your client, fix my needs & concerns first. I don’t care how much more money some vendor partnerships make you or how easy it is to only have to know one storage solution. I’m paying you to help me and you’ll have to make your money in that arena. If you weigh partner kickbacks higher than our needs then I’ll introduce you to the door marked “EXIT”. It’s a one way door. If you do help to address our needs and requirements you’ll make good money.

The best advisors – and I think we have one – are those that remember where the money really comes from and whose references really matter out there. Those guys are our brothers in arms and we all want to come out of the process good, happy and ready to roll.

The Joy

The joy simply is great, modern, functional, reliable, modular, flexible, affordable and just plain awesome storage. What virtualization/private cloud/database/Exchange systems engineer would mind getting to work with that? No one, especially not when in case of serious issues the support & responsiveness prove to be rock solid. Now combine that with the Windows 8 & Hyper-V 3.0 goodness coming and I have a big smile on my face.

I’m Attending The MVP Summit 2012

I’ll be attending the MVP Summit 2012 in Redmond from the 27th of February to the 2nd of March. I consider myself very lucky to be able to do so and I’m grateful that my employer is helping me make use of this opportunity. I appreciate that enormously.

So my hotel is booked, my flights are scheduled. It’s a long flight with a lengthy stopover in Heathrow. I actually spend a day travelling to and from the event. But I’m told it’s very much worth the effort and I got some great tips from some veteran MVPs in the community. For newbies who can use some information on the MVP summit take a look at What to expect at your first MVP Summit by Pat Richard.

I also look forward to meeting so many peers in person and attending all the briefings where we’ll learn a lot of valuable new things about Hyper-V. I cannot talk about them as they are under NDA, but there will be many opportunities to provide feedback to the product teams in my expertise, “Virtual Machine” (i.e. Hyper-V). I have a bunch of questions & feedback for the product teams.

From my colleagues I have learned it’s also a good way to help pass feedback from others on to Microsoft. So this is your chance. Take it! What do you like or need in the virtualization products? What should be enhanced and what is hurting you? What works and what doesn’t? What is missing?

With Windows 8 going into beta by next month don’t expect immediate actions and changes based on your feedback. But if you want to have your opinions taken into consideration you have to let them be heard. So don’t be shy now! Let me know. Sincere & real concerns, along with problems, challenges and feedback on your experiences with the product are very much appreciated.

So, if you have any remarks, feedback or feature requests you’d like to share with the virtualization product teams, let me know. Just post them in the comments, send me an e-mail via the contact form or message me via @workinghardinit on twitter. My colleagues tell me the program managers in the virtualization area are a very communicative and responsive bunch. I think that’s true from my experiences with them in the past.