Cluster Validation Bug In Windows 2008 R2 SP1 – Disk has a Persistent Reservation on it

Pretty soon after Windows 2008 R2 SP1 went RTM we were discussing a bug on the TechNet forum (Hyper-V Cluster issues after applying Win2008 R2 SP1 on a 3 node Cluster!) here. If you have a Windows 2008 R2 SP1 cluster with more than 2 nodes, cluster validation gives you the following warning:

List Potential Cluster Disks

Disk with identifier 2sef8cdf has a Persistent Reservation on it. The disk might be part of some other cluster. Removing the disk from validation set

“Normally” you would expect such a warning if the LUN ever belonged to another cluster and still needs the old reservation cleared. To do that you would use the following command on the node that throws the warning (in this example the disk is disk 2 in Disk Management/diskpart), after making sure the disk is not in use anywhere else on the SAN:

"cluster node clusternode1 /clearpr:2"

However, this is not the cause here, nor was it for most others in this discussion. And I’m pretty sure no SAN software or MPIO software is putting a reservation on there either, so what is this? A bug? Well yes, it has been confirmed by Microsoft support that it is indeed a bug and that a fix will be made available by April 18th, 2011.

This is not a showstopper bug, but it could become one if you need to add a host to a cluster and confirm that all is well and supported. However, if you’re certain you’ve done everything right you can choose not to run cluster validation.

I will update this blog with more information when the fix becomes available.

UPDATE:  The hotfix has become available today, April 26th 2011 as announced on the TechNet forum here:

A hotfix is now available that addresses the Win2008 R2 service pack 1 issue with Validate on a 3+ node cluster. This is KB 2531907. The KB article and download link will be published shortly; in the meantime you can obtain this hotfix immediately, free of charge, by calling Microsoft support and referencing KB 2531907.

Update 27/05/2011: Here is the link: http://support.microsoft.com/kb/2531907/en-us?sd=rss&spid=14134

New Spatial & High Availability Features in SQL Server Code-Named “Denali”

The SQL Server team is hard at work on SQL Server vNext, code name “Denali”. They have a whitepaper out on their web site, “New Features in SQL Server Code-Named “Denali” Community Technology Preview 1” which you can download here.

As I do a lot of infrastructure work for people who really dig all this spatial and GIS related “stuff”, I always keep an eye out for related information that can make their lives easier and enhance the use of the technology stack they own. Another part of the new features coming in “Denali” is Availability Groups. More information will be available later this year, but for now I’ll leave you with the knowledge that it will provide Multi-Database Failover, Multiple Secondaries, Active Secondaries and Fast Client Connection Redirection, can run on Windows Server Core and supports Multisite (Geo) Clustering, as shown in the Microsoft (Tech Ed Europe, Justin Erickson) illustration below.

Availability Groups can provide redundancy for databases on both standalone instances and failover cluster instances using Direct Attached Storage (DAS), Network Attached Storage (NAS) and Storage Area Networks (SAN), which is useful for physical servers in a high availability cluster and for virtualization. The latter is significant as they will support it with Hyper-V Live Migration, whereas Exchange 2010 Database Availability Groups do not. I confirmed this with a Microsoft PM at Tech Ed Europe 2010.

Download the CTP here and play all you want. Please pay attention to the fact that in CTP 1 a lot of stuff isn’t quite ready for show time. Take a look at the Tech Ed Europe 2010 session on the high availability features here. You can also download the video and the PowerPoint presentation via that link.

At first I thought MS might be going the same way with SQL Server as they did with Exchange: less choice in high availability, but easier and covering all needs. But then I don’t think they can. SQL Server applications are beyond the realm of control of Redmond, whereas they do control Outlook and OWA. So I think the SQL Server team needs to provide backward compatibility and functionality way more than the Exchange team has.

Brent Ozar (Twitter: @BrentO) wrote a blog post on “Denali”/Hadron which you can read here: http://www.brentozar.com/archive/2010/11/sql-server-denali-database-mirroring-rocks/. What he says about clustering is true. I used to cluster Windows 2000/2003 and suffered some kind of mental trauma. That was completely cured with Windows 2008 (R2) and I’m now clustering with Hyper-V, Exchange 2010, File Servers, etc. with a big smile on my face. I just love it!

Windows 2008 R2: The system image restore failed. Error details: The parameter is incorrect. 0x80070057

Note to self: read your own blogs on Windows 2008 R2 Native Backup :-). Yes people, Windows 2008 R2 Bare Metal restore to dissimilar hardware does work as long as you follow the rules and guidelines. Those are not very clearly documented but still, if I can find ’em you can too! But today we lost some time because we didn’t heed one of the rules that trips people up frequently. That rule is that the disk layout on the restore server can’t differ from the original one. I literally wrote “Pay close attention to the disk layout/ boot order as well, the restore doesn’t allow for variation from the original layout” in https://blog.workinghardinit.work/2010/01/27/using-windows-2008-r2-backups-to-go-virtual-2/. That means you need to simulate the same disk layout on the new hardware. If the new server has an extra disk, disable that one for the restore; if it has one less, add one. Another situation where the disk layout comes into play is when you boot from a USB stick with W2K8R2. If you leave it plugged in during the restore, the recovery will fail, because if that extra attached disk isn’t the one containing the backup image you’ll get a very harsh error:

“The system image restore failed. Error details: The parameter is incorrect. 0x80070057”


That is not a very helpful explanation, but it generally means you’ve got a disk layout issue, in this case because you have the bootable USB stick attached. Once you’ve booted to the “Repair your computer” functionality, selected “Select a system image backup” and found your image to restore, you should remove the bootable USB stick from the server if you’re not going to be doing an install. Beware of this! Typically when you boot from DVD or PXE you wouldn’t even notice, but when using a bootable USB device with W2K8R2 you might forget that this changes the disk layout. So again, always pull the bootable USB stick from the server before you restore and you’ll be fine. Yes, the recovery will still work: as soon as you’ve booted you don’t need the media anymore, so you can unplug it safely. You can even attach another USB disk in its place containing the backups if you only have one USB port available. That will work because the disk with the backup itself is never taken into consideration and won’t cause any issues with the restore.
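To make the rule above concrete, here is a small Python sketch of the check you could do in your head (or on paper) before hitting restore. It is only an illustration: the `Disk` class and `layout_problems` function are names I made up for this example, it only compares disk counts, and the real restore logic in Windows is certainly more involved than this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Disk:
    """Hypothetical description of a disk attached to the restore target."""
    number: int                     # disk number as shown in diskpart
    is_backup_source: bool = False  # True if this disk only holds the backup image


def layout_problems(original_disk_count, attached):
    """Return reasons why the attached disks do not match the original layout.

    Assumption taken from the post: the disk containing the backup image
    is not taken into consideration, every other attached disk is.
    """
    counted = [d for d in attached if not d.is_backup_source]
    diff = len(counted) - original_disk_count
    problems = []
    if diff > 0:
        problems.append(
            f"{diff} extra disk(s) attached: disable or unplug them "
            "(is the bootable USB stick still plugged in?)"
        )
    elif diff < 0:
        problems.append(f"{-diff} disk(s) missing: add disks to match the original")
    return problems


if __name__ == "__main__":
    # Original server had 2 disks; the bootable USB stick shows up as a third one.
    for problem in layout_problems(2, [Disk(0), Disk(1), Disk(2)]):
        print(problem)
```

Swap the bootable stick for a USB disk that holds the backups (`Disk(2, is_backup_source=True)`) and the list of problems comes back empty, which matches what we saw during the restore.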

So we’ll never forget to heed our own warnings again (I hope). The good thing is we had some refresher training on restoring today and it’s all fresh in our minds 🙂