Anyone who’s been doing virtualization with Hyper-V on Windows 2008 R2 has a good chance of having seen the issue described in http://support.microsoft.com/kb/974909/en-us:
- You install the Hyper-V role on a computer that is running Windows Server 2008 R2.
- You run a virtual machine on the computer.
- You use a network adapter on the virtual machine to access a network.
- You establish many concurrent network connections, or there is heavy outgoing network traffic.
In this scenario, the network connection on the virtual machine may be lost. Additionally, the network adapter is disabled.
Note: You have to restart the virtual machine to recover from this issue.
We’ve seen this one on VMs that do indeed have a lot of outgoing traffic. In our environment the situation looks like this:
- You can access the VM with Hyper-V Manager or SCVMM but not via RDP, as all network connectivity is lost. The status of the guest NIC is always “Enabled” but there is no traffic or connectivity.
- You can try to disable the NIC but this takes a very long time, and when you try to enable it again this never succeeds. Disconnecting the NIC from the virtual network and connecting it again doesn’t help either.
- You need to shut down the VM but this takes an extremely long time, so long that you really can’t afford to wait and see if it ever succeeds. It seems to hang at shutting down with a “non whirling whirly”. So finally you’ll power off the VM and start it up again. Apart from entries related to having no connectivity the event logs are “clean” and there is no indication as to what happened.
Well this exact same issue is back with Windows 2008 R2 SP1. That’s the bad news. The good news is there is a hotfix for it already so you can fix it. You can read up on this issue in Knowledge Base article 2263829 and request the hotfix here. Instructions to get the hotfix are in there as well as a reference to the previous fixes for Windows 2008 R2 RTM.
Consider the following scenario:
- You install the Hyper-V role on a computer that is running Windows Server 2008 R2 Service Pack 1 (SP1).
- You run a virtual machine on the computer.
- You use a network adapter on the virtual machine to access a network.
- You establish many concurrent network connections. Or, there is heavy outgoing network traffic.
In this scenario, the network connection on the virtual machine may be lost. Additionally, the network adapter may be disabled.
Notes
- You must restart the virtual machine to recover from this issue.
- This issue can also occur on versions of Windows Server 2008 R2 that do not have SP1 installed. To resolve the issue, apply the hotfix that is described in one of the following Microsoft Knowledge Base articles:
974909 (http://support.microsoft.com/kb/974909/ ) The network connection of a running Hyper-V virtual machine is lost under heavy outgoing network traffic on a Windows Server 2008 R2-based computer
2264080 (http://support.microsoft.com/kb/2264080/ ) An update rollup package for the Hyper-V role in Windows Server 2008 R2: August 24, 2010
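To check whether a Hyper-V host already has one of these hotfixes, something along the following lines could work. This is a minimal sketch, not a supported tool: it assumes Python is available on the host, that `wmic qfe get HotFixID` lists installed updates there, and that the hotfixes register under the KB numbers of the articles mentioned in this post. The function names are made up for illustration.

```python
import re
import subprocess

# KB numbers of the articles mentioned in this post. Assumption: an installed
# hotfix shows up under the same KB number as its article.
HOTFIXES = {"KB974909", "KB2264080", "KB2263829"}


def installed_hotfixes(qfe_output):
    """Return the set of KB identifiers found in `wmic qfe` style output."""
    matches = re.findall(r"\bKB\d+\b", qfe_output, flags=re.IGNORECASE)
    return {m.upper() for m in matches}


def relevant_installed(qfe_output, wanted=HOTFIXES):
    """Return the wanted hotfixes that appear in the output, sorted."""
    return sorted(installed_hotfixes(qfe_output) & set(wanted))


def read_hotfix_list_from_host():
    """Run wmic locally and return its raw output (Windows hosts only)."""
    result = subprocess.run(
        ["wmic", "qfe", "get", "HotFixID"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

On a host you would call `relevant_installed(read_hotfix_list_from_host())` and look for KB2263829 on SP1, or KB974909 / KB2264080 on RTM. An empty list suggests none of them is installed.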
Oh yeah, people often seem confused as to where to install the hotfix. Does it go on the Hyper-V hosts and/or on the guests? It’s a hypervisor bug in Hyper-V so it goes on the hosts. Have a nice weekend.
The NIC reset issue has been the bane of my existence for the last six months. At least 3 hotfixes have been installed (one via a batch file written by Microsoft after a support call) and none of them has worked 100%.
At best we’ve minimized the reset issues on 2003 32bit VMs, but now 64bit R2 boxes are getting NIC resets.
One thing I haven’t seen addressed: is everyone using synthetic NICs? We do because it’s recommended and the guests stay online better through a live migration. But we’re seriously thinking of using emulated NICs.
Easter Weekend is our big SP1 + hotfix upgrade so hopefully things will be better next week.
Hi, sorry to hear that. We’ve been doing great with both hotfixes. The only other issues we had were with early NIC drivers on the hosts (we were early adopters of Hyper-V), but that’s long been solved.
Good luck. When doing SP1 on a W2K8R2 cluster please take note of this current issue https://blog.workinghardinit.work/2011/04/08/cluster-validation-bug-in-windows-2008-r2-sp1-disk-has-a-persistent-reservation-on-it/ , a fix is on the way …
Hi there again!
Thanks for the feedback.
We’re really excited here. The cluster validation hotfix has been released and the feedback on the NIC hotfix has been good. Have you had any crashes of virtual NICs in the last week?
Thanks for your great blog. I’ve got my entire team reading it now.
You’re welcome. The guest NIC issue has not come back after the hotfix. That looks good. I’ve installed the cluster validation hotfix today on the first cluster and that went well. Validation is scheduled for after hours tomorrow. I’ll let you know the outcome by Friday night.
Thanks for your positive feedback. It’s great to read 🙂
Hi, the validations come out fine with the hotfix installed. Do make sure you put it on every node where you have the Failover Cluster Manager console installed. Sorry for not getting back to you Friday night. Things are busy.
Our upgrade went about as well as we could hope. I shut down each VM to make migration easier, then upgraded Node1 with SP1 + NIC Hotfix and Cluster Hotfix. Three restarts later and Node 1 was back up. Repeated the process for Nodes 2-4.
Then I deployed new Integration Components via SCCM. That took the bulk of the weekend work.
The bad news is we haven’t yet passed Cluster Validation. It errors out on storage with “status code 87”, which may be related to system partitions (BitLocker). We’re investigating. But at least it didn’t corrupt our VHDs.
I’m hopeful on the crashing NICs. That problem has single-handedly made my boss regret going to Hyper-V. How do you explain to a business that a bug is causing a virtual NIC on an important server to crash?
It’s Monday morning here so we’ll find out soon enough. Thanks for the blog again!
I think you can be very sure about the hotfix for the outgoing traffic bug. The cluster validation error you’re seeing now is not related to the validation bug, I think. Windows Server 2008 R2 supports enabling BitLocker on a non-shared disk on a failover cluster node, but I haven’t done that yet. I’ve read about status code 87 here http://mrhodes.net/2010/06/09/status-code-87-when-trying-to-validate-windows-server-2008-r2-failover-cluster-8/ and that was when you have partitions with no drive letter / mount point assigned on the primary disk. I have not seen that myself.
Good luck.
I’ve worked with some competing virtualization solutions as well and I’ve seen serious issues with “Service Packs” & updates there too. It comes with the job & technology I’m afraid, and you can regret any technology on the day you have issues with it 🙂
I have a 2003 R2 VM running under a non-R2 2008 Hyper-V host. The NIC reset “bug” is affecting it, and the Integration Services updates therefore don’t (and won’t!) apply to my system. Clues gratefully received.
I know of several hotfixes for Windows Server 2008 R2 with Windows 2003 virtual machines but not for Windows 2008. I would suggest trying the Hyper-V newsgroups on TechNet or contacting Microsoft support.