Full Steam Ahead With Windows 8 & Hyper-V in 2012

Some History

There have been a good number of people who’ve always used, some a lot more and some others a lot less, a bit of Microsoft bashing to gain some extra credibility or try to position other products as superior. Sometimes this addressed, at least, some real challenges and issues with Microsoft products. A lot of the time it doesn’t. I have always found this ridiculous. In the early years of this century I was told to get out of the Microsoft stack and into the LAMP stack to make sure I still had a job in a few years’ time. My reaction was to buy Inside SQL Server 2000 among other technology books Smile. The paradox is that in some cases, like some storage integrators, is that the ones doing the bashing are forgetting that their customers are often heavily invested in the Microsoft stack.

I Still Have A Job

As you might have realized already, I still have a job today. I’m very busy, building more and better environments based on Microsoft technologies. Microsoft does not get everything right. Who does? Sometimes it takes more than a few tries, sometimes they fail. But they also succeed in a lot of their endeavors.They are capable to learn, adapt and provide outstanding results with a very good support system to boot (I would dare say that you get out of that what you put into it). Given the size and nature of the company, combined with IT evolving at the speed of light, that’s not an easy task.

Today that ability translates into the upcoming release of Windows 8. Things like Hyper-V 3.0, the new storage and networking features, the improvements to clustering and the file system are the current state an evolution. A path along Windows 2000 over Windows 2003(R2), to  the milestone Windows 2008 which was improved with Windows 2008 R2. Now, Windows 8 being the next generation improves vastly on that very good and solid foundation. With Windows 8 we’ll take the next step forward in building highly scalable, highly available, feature rich a very functional solutions in a very cost effective manner. On top of that we can do more now than ever before, with less complexity and with affordable  standard hardware. If you have a bigger budget, great, Windows 8 will deliver even more and better bang for the buck if and when your hardware vendors get on the band wagon.

Windows 8 & Storage

One of the things the Windows BUILD Conference achieved is that it wanted me to buy hardware that I couldn’t get yet. Just try asking DELL or HP for RDMA support on 10Gbps and you get a bit of a vacant blank stare.

Another thing is that it made me look at our storage roadmap again. One of the few sectors in IT that are still very expensive is storage. Some of the storage vendors might start to feel a bit like a major network gear vendor. You know the one that has also seen the effects of serious competition by high quality but lower cost kit. Just think about what Storage Pools/Spaces will do for affordable, easy to use and rich storage solutions. Both with standard over the shelf available (read affordable) hardware and with modern SANs that leverage the Windows 8 features there is value. Heath my warning storage vendors. You’re struggling in the SMB market due to complexity, cost and way to much overhead and expensive services. Well it’s only going to get worse. You’ll have to come with better proposals or you’ll end up being high end / niche market players in the future. Let’s face it, if I can buy a super micro chassis with the disks of my choosing I can build my own storage solution for cheap and use Windows 8 to achieve my storage needs. Perhaps is 80/20 but hey, that’s great. It’s not that much better with more expensive solutions (vendor disks are ridiculously over priced) and the support process is sometimes a drain on your workforce’s time and motivation. And yes you paid for that. Compare this with being able to buy some spare parts on the cheap and having it all available of the shelf with the vendors. No more calls, no more bureaucratic mess for return parts, nor more IT illiterate operators to work through before you reach support that can be sub standard as well. Once you reach a certain level of hardware quality there is not that much difference any more except for price and service. Granted, some vendors are better at this then others. The really big ones often struggle getting this right.

I’ve been in this business long enough to know that all stuff breaks. SLAs are fine for lawyers and for management. CYA is part of doing business. But for the IT Pro in the field you need reliable people, gear and services.  On top of that you have to design for failure. You know things will break. So it should be a cheap, easy and fast as possible to fix while your design and architecture should cope with the effects of a failure. That’s what IT Pros need and that what’s keeps things running (not that SLA paper in the mailbox of your manager).

Show the Windows customers a bit more love than you have done in the past. Some in the storage industry tend to like to look down on the Windows OS. But guess what, it is your largest customer base. Unless you want to end up in the same niche as a very expensive personal trainer for Hollywood stars (tip: there’s not a huge job market there) you’d better adjust to new realities. A lot of them are doing that already , some of them aren’t. To those: get over it and leverage the features in Windows 8. You’ll be able to sell to a more varied public and at the high end you’ll have even better solutions to offer. Today I notice way to many storage integrators who haven’t even looked at Windows 8. It’s about time they started … really, like today. I mean how do you want to sell me storage today if you can’t answer my queries on Windows 8 & System Center 2012 support and integration? To me this is huge! I want to know about ODX, RDMA, SMI-S and yes I want you to be able to answer me how your storage deals with CSVs. You should know about the consumption of persistent ISCSI-3 reservations and a rock solid hardware VSS provider. If you can do that it creates the warm fuzzy feeling a customers need to make that leap of faith.

When I look at the network improvements in Windows 8. Things like RDMA, SMB 2.2; File Transfer Offload and what that means for file sharing and data intensive environments I’m pretty impressed. Then there is Hyper-V 3.0 and it many improvements. Only a fool would deny that it is a very good, affordable & rich hypervisor with a bright future as far as hypervisors go (they are not the goal, just a means to an end). Live Storage Migration, an extensible virtual switch, monitoring of the virtual switch, Network Virtualization, Hyper-V Replica, … it’s just too much to mention here. But hop on over to Windows 8 Hyper-V Feature Glossary by Aidan Finn. He’s got a nice list up of the new features relevant to the Hyper-V crowd. Again, we see improvements for all business sizes, from SMB to enterprise, including the ISPs and Cloud providers. Windows 8 is breaking down barriers that would interdict it’s use in various environments and scenarios. Objections based on missing features, scalability, performance or security in multi tenancy environments are being wiped of the map. If you want to see some musing on this subject just look at Group Video Interview: What is your favorite Hyper-V feature in Windows 8?.

2012 & Beyond

Hyper-V is growing. It’s already won a lot of hearts and minds of many smaller Microsoft shops but it’s also growing in the enterprise. The hybrid world is here when you look at the numbers, even if it’s not yet the case in your neck of the woods. Why? Cost versus features. Good enough is good enough. Especially when that good is rather great. On top of that the integration is top notch and it won’t cost you a fortune and save you a lot of plumbing hassle.

Basically everyone can benefit from all this. You’ll get more and better at a lesser or at least a more affordable cost. Even if you don’t use any Microsoft technologies you’ll benefit from the increased competition. So everyone can be happy.

I’m an MVP–What a Great Start Of 2012

Microsoft presented me with the 2012 Microsoft® MVP Award under the Virtual Machine expertise. If you’d like to know a bit more about the MVP Program and the Award you can take a look here http://mvp.support.microsoft.com/gp/aboutmvp

This is special to me, and I’m honored by it. It’s very nice to get such recognition both from your peers in the community and from Microsoft for sharing your experiences and knowledge for the better good. This doesn’t mean I’m an "know all, end all" guru, far from it. No one knows everything or never makes mistakes. To me it does mean my peers think highly enough of me so that they are willing to nominate me and serve as a reference for my skill set and contributions. That by itself is a huge compliment but I’m grateful to have the opportunities to learn a lot and for that I owe some thanks. I learn a lot from participating in a world wide community that shares experiences & knowledge. The amount of skills that these people bring to the table and the wealth of information that is shared by all is enormous. ”The community” is a varied group of experts in their own areas of excellence.

  • Some are (sometimes long time) MVPs like Aidan Finn, Hans Vredevoort, Jaap Wesselius, Jetze Mellema, Kurt Roggen, Mike Resseler, Kristian Nese, Carsten Rachfahl.
  • Naturally there are the Microsoft employees, both locally and abroad, with whom I’ve had the pleasure of working on support & business cases and who’ve probably vouched for me when asked to do so.
  • Then there is the interaction with community members like Ronnie Isherwood, Jeff Wouters, Dave Stork, Peter Noorderijk, Maarten Wijsman, Rick Slager and my blog readers , and a lot of the  people who follow me on twitter (Ronny Pot, J. Wolfgang Goerlich, Kevin Ball, Kenneth, …) and so many other I’m probably forgetting to mention Embarrassed smile. Some of these I’ve had the privilege of meeting in real life and those occasions have always been both educational & fun. Sometimes these meetings turned into an international distributed testing/troubleshooting effort where we all learn something like at TEC 2011.
  • On top of that I have the luck to work with some really nice people both colleagues (Tom, Peter, Karel, Ivan, Sabrina, Jeff – you rock – and thanks for sticking with us through all the sometimes challenging projects). Some are consultants and people I know at other companies that work for or with us.

Together we learn a lot through the need to answer sometimes complex questions and find solutions for the problems at hand. This makes for a great learning school and ongoing education until that day arrives you’re recognized as an expert while you realize more and more how much there is to learn.

Experts2Experts Conference London (UK) 2011

I’m at the Experts2Experts Conference in London and I’m having a great time talking shop, tech & business with my fellow IT Pro colleagues from around Europe. Aidan Finn, Jeff Wouters, Carsten Rachfahl, Ronnie Isherwood.

It might be fun for Microsoft to join us for some of these lunch & dinner time dicussions. It would provide them with great feedback, ideas, concerns. Very educational. While we’re discussing Citrix, VMware, Microsoft & ISV solutions (RES, Appsense) this is not a vendor centric conference. Sure we all work with these products but we’re discussing it from our point of view. The challenges, the issues, the successes & failures are discussed and mentioned.

There’s a high density of virtualization, private cloud, desktop virtualization (VDI, Terminal Servers, Application Virtualization, Client hosted virtual desktops etc.) expertise at the conference to make it interesting.

Tomorrow I’ll be sharing some musings on “High Performance & High availability Networks for Hyper-V Clusters” during my session.

System Center Virtual Machine Manager 2008 R2 Error 12711 & The cluster group could not be found (0×1395)

The Issues

I recently had to go and fix some issues with a couple of virtual machines in SCVMM 2008 R2. There was one that failed to live migrate with following error:

Error (12711)
VMM cannot complete the WMI operation on server HopelessVm.test.lab because of
error: [MSCluster_ResourceGroup.Name=" df43bf60-7216-47ed-9560-7561d24c7dc8"] The cluster group could not be found.

(The cluster group could not be found (0×1395))
 
Recommended Action
Resolve the issue and then try the operation again

Other than that it looked fine and could be managed with SCVMM 2008 R2. Another one was totally wrecked it seemed. It was in a failed state after an attempted live migration. You couldn’t do anything with it anymore. Repair was “available” but every option there failed so basically that was the end of the game with that VM. Both issues can be resolved with the approach I’ll describe below.

The Cause

After some investigation the cause of this was the fact that this virtual machine had been removed from the failover cluster as a resource was exported & imported using Hyper-V manager on one of the cluster nodes. It was then added back to the failover cluster again to make them high available. All this was done without removing it from SCVMM 2008 R2. By the way, as mentioned above in “The Issues” this can get even worse than just failing live migrations. The same scenario can lead to virtual machines going into a failed state that you can’t repair (retry or undo fail) or ignore and basically you’re stuck at that point. You can’t even stop, start, shutdown the virtual machine anymore, not one single operation works in SCVMM while in the failover cluster GUI and in hyper-v manager everything is fully operational. This is important to note, as the services are fully on line and functional. It’s just in SCVMM that you’re in trouble.

Why did they do it this way? They did it to move the VM to a new CSV. The fact that you delete the VM files when deleting a VM with SCVmm2008R2 made them use Hyper-V manager instead. Now this approach (whatever you think of it) can work but then you need to delete the VM in SCVMM2008R2 after exporting the virtual machine AND before proceeding with the import and making the virtual machine highly available.

People get creative in how to achieve things due to inconsistencies, differences in functionality between Hyper-V Manger and SCVMM 2008R2 (in the latter especially the lack of complete control over naming, files & folders, export/migration behavior) as well as the needs of the failover cluster can lead to some confusing scenarios.

The Supported Fix

Now the easy way to fix this is to export the virtual machine again and delete it in SCVMM 2008 R2. That will remove the virtual machine object from SCVMM, the failover cluster en Virtual Machine Manager. However this virtual machine was so large (50Gb + 750 GB data disk) that there was no room for an export to be made. Secondly an export of such a large VM takes a considerable time and it has to be off line for this operation. This is annoying as SCVMM might be uncooperative at this point, the virtual machine is online en performing it’s duties for the business. So this presented us with a bit of a problem. Stopping the virtual machine, Exporting it using Hyper-V Manager will cause it to go missing in SCVMM 2012 and then you can delete it, importing the virtual machine again and adding it to the failover cluster causes down time.

The Root Cause

Why does this happen? Well when you import a virtual machine into a failover cluster is creates a new unique ID for the virtual machine Resource Group . This happens always. Choosing to reuse an existing ID during import in Hyper-V Manager has nothing to do with this. But VMM uses ID/names to identify a VM, independent of the cluster. So when you did not remove the VM from SCVMM before adding the VM back to the cluster you get a different cluster group ID in the cluster than you have in SCVMM. They both have the same name but there is a disconnect leading to the issues described above.

By the way exporting & importing a VM without first removing the virtual machine from the failover cluster leads to some issues in the Failover cluster so don’t do that either Smile

The “No Down Time” Fix

This is not the first time we need to dive in to the SCVMM database to fix issues. One of my main beef about SCVMM other than inconsistency with the other tools and its lack of control & options in some scenarios is the fact that it doesn’t have enough self-maintenance intelligence & functionality. This leads to the workaround above which are slow and rather annoying or consist of messing around in the SCVMM database, which isn’t exactly supported. Mind you Microsoft has published some T-SQL to clean up such issues themselves. See You cannot delete a missing VM in SCVMM 2008 or in SCVMM 2008 R2 and RemoveMissingVMs. See also my blog SCVMM 2008 R2 Phantom VM guests after Blue Screen post on this subject.

The usual tricks of the trade like refreshing the virtual machine configuration in the failover cluster GUI don’t work here. Neither does the solution to this error described Migrating a System Center Virtual Machine Manager 2008 VM from one cluster to another fails with error 12711. The error is the same but not the cause.

# Add the VMM cmdlets
Add-PSSnapin microsoft.systemcenter.virtualmachinemanager

# Connect to the VMM server
Get-VMMServer –ComputerName MySCVMMServer.test.lab

# Grab the problematic VM and put it into the object $vm
$vm = Get-VM –name “HopelessVM”

#Force a refresh
refresh-vm -force  $vm

In the end we have to fix the mismatch between the VMResourceGroupID in failover cluster and SCVMM by editing the database.

First you navigate to the registry key HKEY_LOCAL_MACHINEClusterGroups on one the cluster nodes, do a find for the problematic VM’s name and grab the name of its key, this is the VMResourceGroupID the cluster knows and works with? So now we have the correct VMResourceGroupID: 0f8cabe4-f773-4ae4-b431-ada5a3c9926c

clip_image002

Now you connect to the SCVMM database and run following query to find the VMResourceGroupID that SCVMM thinks that VM has and that it uses causing the issues

SELECT  VMResourceGroupID  FROM tbl_WLC_VMInstance WHERE ComputerName = 'hopelessVM.test.lab'
GO 

The results:

VMResourceGroupID

————————————————–

df43bf60-7216-47ed-9560-7561d24c7dc8

(1 row(s) affected)

The trick than is to simply update that value to the one you just got from the registry by running:

UPDATE tbl_WLC_VMInstance SET VMResourceGroupID = '0f8cabe4-f773-4ae4-b431-ada5a3c9926c' WHERE VMResourceGroupID = 'df43bf60-7216-47ed-9560-7561d24c7dc8'
GO 

Than you need some patience & refresh the GUI a few times. Things will turn back to normal, but in between you might seem some “missing” statuses appear for your problematic VM. These go away fast however. If not you can always use the Microsoft provided script to remove missing VM’s as mentioned above in RemoveMissingVMs.

Warning

What I described above is something you can do to fix these issues fast and effectively when needed. But I’m not telling you this is the way to go, let alone that this is supported. Make sure you have backups of your VMs, Hosts, SCVMM database etc. It only takes one mistake or misinterpretation to royally shoot yourself in your foot Winking smile. It hurts like hell; recovery is long and seldom complete. On top of that it might generate a vacancy in your company whilst you’re escorted out of the building. Be careful out there.