Anti Virus & Hyper-V Reloaded

The anti virus industry is both a blessing and a curse.  They protect us from a whole lot of security threats and at the same time they make us pay dearly for their mistakes or failures. Apart from those issues themselves this is aggravated that management does not see the protection it provides on a daily basis. Management only notices anti virus when things go wrong, when they lose productivity and money. And frankly when you consider scenarios like this one …

Hi boss, yes, I know we spent a 1.5 million Euros on our virtualization projects and it’s fully redundant to protect our livelihood. Unfortunately the anti virus product crashed the clusters so we’re out of business for the next 24 hours, at least.

… I can’t blame them for being a bit grumpy about it.

Recently some colleagues & partners in IT got bitten once again by McAfee with one of there patches (8.8 Patch 1 and 8.7 Patch 5). These have caused a lot of BSOD reports and they put the CSVs on Hyper-V clusters into redirected mode (https://kc.mcafee.com/corporate/index?page=content&id=KB73596). Sigh. As you can read here for the redirected mode issue they are telling us Microsoft will have to provide a hotfix. Now all anti virus vendors have their issue but McAfee has had too many issues for to long now.  I had hoped that Intel buying them would have helped with quality assurance but it clearly did not. This only makes me hope that whatever protection against malware is going to built into the hardware will be of a lot better quality as we don’t need our hardware destroying our servers and client devices. We’re also no very happy with the prospect or rolling out firmware & BIOS updates at the rate and with the risk of current anti virus products.

Aidan Finn has written before about the balance between risk & high availability when it comes to putting anti virus on Hyper-V cluster hosts and I concur with his approach:

  • When you do it pay attention to the exclusion & configuration requirements
  • Manage those host very carefully, don’t slap on just any update/patches and this includes anti virus products of cause

I’m have a Masters in biology from they days before I went head over heals into the IT business. From that background I’ve taken my approach to defending against malware. You have to make a judgment call, weighing all the options with their pros and cons. Compare this to vaccines/inoculations to protect the majority of your population. You don’t have to get a 100% complete coverage to be successful in containing an outbreak. Just a sufficiently large enough part including your most vulnerable and most at risk population. Excluding the Hyper-V hosts from mandatory anti virus fits this bill. Will you have 100% success, always? Forget it. There is no such thing.

Hotfixes For Hyper-V & Failover Clustering Can Be Confusing KB2496089 & KB2521348

As I’m building or extending a number of Hyper-V Clusters in the next 4 months I’m gathering/updating my list with the Windows 2008 R2 SP1 hotfixes relating to Hyper-V and Failover Clustering. Microsoft has once published KB2545685: Recommended hotfixes and updates for Windows Server 2008 R2 SP1 Failover Clusters but that list is not kept up to date, the two hotfixes mentioned are in the list below. I also intend to update my list for Windows Server 2008 SP2 and Windows 2008 R2 RTM. As I will run into to these and it’s nice to have a quick reference list.

I’ll include my current list below. Some of these fixes are purely related to Hyper-V, some to a combination of hyper-V and clusters, some only to clustering and some to Windows in general. But they are all ones that will bite you when running Hyper-V (in a failover cluster or stand alone). Now for the fun part with some hotfixes I’ll address in this blog post. Confusion Smile Take a look at the purple text and the green text hotfixes and the discussion below. Are there any others like this I don’t know about?

* KB2496089 is included in SP1 according to “Updates in Win7 and WS08R2 SP1.xls” that can be downloaded here (http://www.microsoft.com/download/en/details.aspx?displaylang=en&id=269) but the Dutch language KB article states it applies to W2K8R2SP1 http://support.microsoft.com/kb/2496089/nl

Artikel ID: 2498472 – Laatste beoordeling: dinsdag 10 februari 2011 – Wijziging: 1.0

Vereisten

Deze hotfix moet worden uitgevoerd een van de volgende besturings systemen:

  • Windows Server 2008 R2
  • Servicepack 1 (SP1) voor Windows Server 2008 R2
Voor alle ondersteunde x64 versies van Windows Server 2008 R2

6.1.7600.20881
4,507,648
15-Jan-2011
04: 10
x64

Vmms.exe
6.1.7601.21642
4,626,944
15-Jan-2011
04: 05
x64

When you try to install the hotfix it will. So is it really in there? Compare file versions! Well the version after installing the hotfix on a W2K8R2 SP1 Hyper-V server the version of vmms.exe was 6.1.7601.21642 and on a Hyper-V server with SP1 its was 6.1.7061.17514. Buy the way these are English versions of the OS, no language packs installed.

With hotfix installed on SP1

Withhotfix_thumb[1]

Without hotfix installed on SP1

Withoutpatch_thumb[1]

To make matters even more confusing while the Dutch KB article states it applies to both W2K8R2 RTM and W2K8R2SP1 but the English version of the article has been modified and only mentions W2K8R2 RTM anymore.

http://support.microsoft.com/kb/2496089/en-us

Article ID: 2496089 – Last Review: February 23, 2011 – Revision: 2.0

For all supported x64-based versions of Windows Server 2008 R2

Vmms.exe
6.1.7600.20881
4,507,648
15-Jan-2011
04:10
x64

So what gives? Has SP1 for W2K8R2 been updated with the fix included and did the SP1 version I installed (official one right after it went RTM) in the lab not yet include it? Do the service packs differ with language, i.e. only the English one got updated?. Sigh :-/ Now for the good news: ** It’s all very academic because of this KB 2521348 A virtual machine online backup fails in Windows Server 2008 R2 when the SAN policy is set to “Offline All” which brings the vmms.exe version to 6.1.7601.21686 and this hot fix supersedes KB2496089 Smile. See http://blogs.technet.com/b/yongrhee/archive/2011/05/22/list-of-hyper-v-windows-server-2008-r2-sp1-hotfixes.aspx where this is explicitly mentioned.

Ramazan Can mentions hotfix 2496089 and whether it is included in SP1 in the comments on his blog post http://ramazancan.wordpress.com/2011/06/14/post-sp1-hotfixes-for-windows-2008-r2-sp1-with-failover-clustering-and-hyper-v/ but I’m not very convinced it is indeed included. The machine I tested on are W2K8R2 English RTM updated to SP1, not installations for the media including SP1 so perhaps there could also be a difference. It also should not matter that if you install SP1 before adding the Hyper-V role, so that can’t be the cause.

Anyway, keep your systems up to date and running smoothly, but treat your Hyper-V clusters with all due care and attention.

  1. KB2277904: You cannot access an MPIO-controlled storage device in Windows Server 2008 R2 (SP1) after you send the “IOCTL_MPIO_PASS_THROUGH_PATH_DIRECT” control code that has an invalid MPIO path ID
  2. KB2519736: Stop error message in Windows Server 2008 R2 SP1 or in Windows 7 SP1: “STOP: 0x0000007F”
  3. KB2496089: The Hyper-V Virtual Machine Management service stops responding intermittently when the service is stopped in Windows Server 2008 R2
  4. KB2485986: An update is available for Hyper-V Best Practices Analyzer for Windows Server 2008 R2 (SP1)
  5. KB2494162: The Cluster service stops unexpectedly on a Windows Server 2008 R2 (SP1) failover cluster node when you perform multiple backup operations in parallel on a cluster shared volume
  6. KB2496089: The Hyper-V Virtual Machine Management service stops responding intermittently when the service is stopped in Windows Server 2008 R2 (SP1)*
  7. KB2521348: A virtual machine online backup fails in Windows Server 2008 R2 (SP1) when the SAN policy is set to “Offline All”**
  8. KB2531907: Validate SCSI Device Vital Product Data (VPD) test fails after you install Windows Server 2008 R2 SP1
  9. KB2462576: The NFS share cannot be brought online in Windows Server 2008 R2 when you try to create the NFS share as a cluster resource on a third-party storage disk
  10. KB2501763: Read-only pass-through disk after you add the disk to a highly available VM in a Windows Server 2008 R2 SP1 failover cluster
  11. KB2520235: “0x0000009E” Stop error when you add an extra storage disk to a failover cluster in Windows Server 2008 R2 (SP1)
  12. KB2460971: MPIO failover fails on a computer that is running Windows Server 2008 R2 (SP1)
  13. KB2511962: “0x000000D1” Stop error occurs in the Mpio.sys driver in Windows Server 2008 R2 (SP1)
  14. KB2494036: A hotfix is available to let you configure a cluster node that does not have quorum votes in Windows Server 2008 and in Windows Server 2008 R2 (SP1)
  15. KB2519946: Timeout Detection and Recovery (TDR) randomly occurs in a virtual machine that uses the RemoteFX feature in Windows Server 2008 R2 (SP1)
  16. KB2512715: Validate Operating System Installation Option test may identify Windows Server 2008 R2 Server Core installation type incorrectly in Windows Server 2008 R2 (SP1)
  17. KB2523676: GPU is not accessed leads to some VMs that use the RemoteFX feature to not start in Windows Server 2008 R2 SP1
  18. KB2533362: Hyper-V settings hang after installing RemoteFX on Windows 2008 R2 SP1
  19. KB2529956: Windows Server 2008 R2 (SP1) installation may hang if more than 64 logical processors are active
  20. KB2545227: Event ID 10 is logged in the Application log after you install Service Pack 1 for Windows 7 or Windows Server 2008 R2
  21. KB2517329: Performance decreases in Windows Server 2008 R2 (SP1) when the Hyper-V role is installed on a computer that uses Intel Westmere or Sandy Bridge processors
  22. KB2532917: Hyper-V Virtual Machines Exhibit Slow Startup and Shutdown
  23. KB2494016: Stop error 0x0000007a occurs on a virtual machine that is running on a Windows Server 2008 R2-based failover cluster with a cluster shared volume, and the state of the CSV is switched to redirected access
  24. KB2263829: The network connection of a running Hyper-V virtual machine may be lost under heavy outgoing network traffic on a computer that is running Windows Server 2008 R2 SP1
  25. KB2406705: Some I/O requests to a storage device fail on a fault-tolerant system that is running Windows Server 2008 or Windows Server 2008 R2 (SP1) when you perform a surprise removal of one path to the storage device
  26. KB2522766: The MPIO driver fails over all paths incorrectly when a transient single failure occurs in Windows Server 2008 or in Windows Server 2008 R2

KB Article 2522766 & KB Article 2135160 Published Today

At this moment in time I don’t have any more Hyper-V clusters to support that are below Windows Server 2008 R2 SP1. That’s good as I only have one list of patches to keep up to date for my own use. As for you guys still taking care of Windows 2008 R2 RTM Hyper-V cluster you might want to take a look at KN article 2135160 FIX: "0x0000009E" Stop error when you host Hyper-V virtual machines in a Windows Server 2008 R2-based failover cluster that was released today. The issue however is (yet again) an underlying C-State issue that already has been fixed in relation to another issue published as KB article 983460 Startup takes a long time on a Windows 7 or Windows Server 2008 R2-based computer that has an Intel Nehalem-EX CPU installed.

And for both Windows Server 2008 R2 RTM and SP1 you might take a look at an MPIO issue that was also published today (you are running Hyper-V on a cluster and your are using MPIO for redundant storage access I bet) KB article 2522766 The MPIO driver fails over all paths incorrectly when a transient single failure occurs in Windows Server 2008 or in Windows Server 2008 R2

It’s time I add a page to this blog for all the fixes related to Hyper-V and Failover Clustering with Windows Server 2008 R2 SP1 for my own reference Smile