Software-Defined Data Infrastructure Essentials

Over the last few months I spent some of my downtime and commute time reading a book. A paper one, actually. It’s Greg Schulz’s “Software-Defined Data Infrastructure Essentials”. As the subtitle states, it is about cloud, converged and virtual fundamental server storage I/O tradecraft.

It is not a book you’ll read to learn about a particular technology, product or vendor. It takes a more holistic approach to educating people in today’s IT landscape: that vast area of expertise in which all the considerations around storage in a modern IT environment come together, where old and new, established and emerging ways of handling storage I/O for a variety of use cases meet and mix.


Reading the book helps you become more well versed in the subject and takes us out of our product- or problem-specific cocoons. That’s the main reason I’d recommend anyone to read it. I’m impressed by how well Greg managed to write a book on such a diverse subject that is accessible to all levels of expertise. The depth and the breadth of this subject make that quite a feat. On top of that, the book is usable and valuable to both novice and experienced professionals.

I have said it before (on Twitter), but if I were teaching IT classes and needed to bring students up to date on the data considerations of the software-defined cloud data center, this would be the textbook. It acknowledges the diversity of solutions and architectures in the real world and doesn’t make bold marketing statements. Instead it focuses on what you need to know and consider when discussing and designing solutions. I wish many IT managers, consultants and analysts would attend my fictional class, but I’d settle for them reading this book and learning about a big part of what they need to manage. It would serve them well and help them understand concerns other involved parties might want to see addressed.

For me an extra benefit was that, while I enjoy talking shop with Greg, I only get those opportunities on rare occasions during conferences. As such, this book gave me some more time to read his views and insights. That’s the next best thing.

Webinar: 3 Emerging Technologies that will change the way you use Hyper-V

I have the pleasure of doing a webinar on October 24th 2017 with two fellow MVPs: Andy Syrewicze from Altaro, who organizes the webinar, and Thomas Maurer, a well-known expert in the tech community and beyond on cloud and data center technologies.

The subject of the webinar is 3 Emerging Technologies that will change the way you use Hyper-V. It’s a panel-style discussion amongst the three of us on technologies and trends that affect everyone in this business:

  • Public cloud computing platforms such as Azure and AWS
  • Azure Stack and the complete abstraction of Hyper-V
  • Containers and microservices: why they are game changers


There will be time for questions and discussion, as we expect the subject to be of great interest to all. For my part I’ll try to look at the bigger picture of these technologies, both from a product and service perspective as well as from a strategic point of view, as part of a doctrine to achieve an organization’s goals.

Windows Data Deduplication and Cluster Operating System Rolling Upgrades

Introduction

When Windows data deduplication and cluster operating system rolling upgrades from Windows Server 2012 R2 to Windows Server 2016 are discussed, we often hear people talk about Hyper-V or Scale-Out File Server clusters, sometimes SQL Server, but not very often about a General-Purpose File Server with continuous availability. Which is the kind I’ve done quite a number of, actually.


Being active in an industry that produces and consumes file data in large quantities and sizes, we have been early implementers of Windows clusters for General-Purpose File Shares with continuous availability. This gives us the benefits of SMB 3 and ODX, for both the clients and IT operations, for workloads that are not suited to Scale-Out File Server deployments.

As such we have dealt with a number of Windows Server 2012 R2 GPFS clusters with continuous availability that we wanted to move to Windows Server 2016. Partially to keep the environment up to date and partially because we want to leverage the new Windows data deduplication capabilities that this OS version offers. The SOFS and Hyper-V clusters that I upgraded didn’t have data deduplication enabled.

The process to perform this upgrade is straightforward and has been well documented by others as well as by me with regard to issues we saw in the field. We even dove behind the scenes a bit in Cluster Operating System Rolling Upgrade Leaves Traces. I have also presented on this topic in public at conferences around Europe (Ireland, Germany and Belgium) as part of our community contributions. No surprises there.

Test your assumptions

This is a scenario you can perform without any downtime for your clients when all things go well. And normally they should. I have upgraded a couple of Scale-Out File Server (SOFS) and General-Purpose File Server (GPFS) clusters with continuous availability now and those went very well. Just make sure your cluster is perfectly healthy at the start.
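As a quick sanity check, something along the lines of the sketch below, run from one of the cluster nodes, confirms node and resource health and runs a validation report before you start. This is a minimal sketch, not a full validation procedure.

```powershell
# A minimal health check sketch, run from one of the cluster nodes.
Import-Module FailoverClusters

# All nodes should be up and no cluster resources should be in a failed or offline state.
Get-ClusterNode | Format-Table Name, State
Get-ClusterResource | Where-Object State -ne 'Online'

# Run cluster validation, skipping the disruptive storage tests on a live cluster.
Test-Cluster -Include 'Inventory', 'Network', 'System Configuration'
```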

Naturally there are some checks you need to make that are outside of Microsoft’s scope:

I’m pretty sure you have good backups for your file data, but you should check that your backup solution works with Windows Server 2016 and how it behaves during the upgrade while the cluster is in mixed mode. Perhaps you will or won’t be able to run backups or restore data. Check and know this.

Verify that your storage solution supports and works with Windows Server 2016. It sounds obvious, but I have seen people forget such details.

Another point of attention is any anti-virus you might have running on the file server cluster nodes. Verify that it is fully supported on Windows Server 2016. On top of that, validate that the anti-virus still works well with ODX so you don’t run into surprises there. Don’t assume anything.
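One quick way to verify that ODX has not been turned off on a node, for example by a filter driver that came with the anti-virus, is to check the FilterSupportedFeaturesMode registry value that Microsoft documents for offloaded data transfers. A small sketch:

```powershell
# FilterSupportedFeaturesMode = 0 means ODX (offloaded data transfer) is enabled on this node.
Get-ItemProperty -Path 'HKLM:\SYSTEM\CurrentControlSet\Control\FileSystem' -Name 'FilterSupportedFeaturesMode'
```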

Check whether the server and its components (HBA, NICs, BIOS, …) and their firmware and drivers support Windows Server 2016. Sure, the rolling upgrade allows for some testing before committing, but that doesn’t mean you should go ahead blindly into the unknown.

Make sure your nodes are fully patched before and after the upgrade of a cluster node.
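A simple way to eyeball the patch level per node is a sketch like the one below; the node names are placeholders for your own environment.

```powershell
# List the most recently installed updates on each node; node names are placeholders.
$nodes = 'FS-NODE01', 'FS-NODE02'
foreach ($node in $nodes) {
    Get-HotFix -ComputerName $node |
        Sort-Object InstalledOn -Descending |
        Select-Object -First 5 Source, HotFixID, Description, InstalledOn
}
```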

As the file server cluster is already leveraging SMB 3 with continuous availability, all the prerequisites to make that work are already taken care of. If you are upgrading a file server cluster without continuous availability and are planning to start using it, that’s another matter and you’ll need to address any issues, either before or after moving to Windows Server 2016.
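If you want to confirm where you stand, a quick look at the shares and client sessions on a cluster node tells you whether continuous availability is in place and which SMB dialects your clients actually negotiate. A minimal sketch:

```powershell
# Check which shares have continuous availability enabled and which SMB dialect clients negotiate.
Get-SmbShare | Select-Object Name, Path, ContinuouslyAvailable
Get-SmbSession | Select-Object ClientComputerName, Dialect
```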

You can take a look at my blogs on this subject from the Windows Server 2012 R2 time frame, such as More Tips On Dealing With Removing Short File Names When Migrating To a SMB3 Transparent Failover File Server Cluster, Migrate an old file server to a transparent failover file server with continuous availability, and SMB 3, ODX, Windows Server 2012 R2 & Windows 8.1 perform magic in file sharing for both corporate & branch offices.

Data deduplication takes some extra consideration

I have blogged before on how Windows Server 2016 data deduplication performs and scales better than it did in Windows Server 2012 R2. This also means that it works at least partially differently than it did in Windows Server 2012 R2. You can see this in some of the updates that came out regarding a data corruption bug with data deduplication which only affected Windows Server 2016.
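Before touching such a cluster I take note of which volumes have deduplication enabled and what their current state is, so I know exactly which LUNs must stay on upgraded nodes later on. A minimal sketch of that inventory:

```powershell
# Record which volumes have deduplication enabled and their current state before starting.
Get-DedupVolume | Select-Object Volume, Enabled, SavedSpace, SavingsRate
Get-DedupStatus | Select-Object Volume, OptimizedFilesCount, InPolicyFilesCount, LastOptimizationTime
```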


Given this difference, what would happen if you fail over a LUN with deduplication enabled from Windows Server 2012 R2 to Windows Server 2016 and vice versa? That’s the question I had to consider when combining Windows data deduplication and cluster operating system rolling upgrades for the first time.

Windows Server 2016 is backward compatible and will work just fine with a LUN from a Windows Server 2012 R2 server that has Windows data deduplication enabled. The reverse is not the case: Windows Server 2012 R2 is not forward compatible. When dealing with data deduplication in a cluster operating system rolling upgrade scenario I’m extra careful, as I cannot guarantee that every LUN movement scenario will go well.

Once I have failed over a LUN to a Windows Server 2016 node in a mixed cluster, I avoid moving it back to a Windows Server 2012 R2 node in that cluster. I only move it between Windows Server 2016 nodes when needed.
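In practice that means targeting only upgraded nodes when moving the file server group that owns a deduplicated LUN. A rough sketch, with the group name as a placeholder:

```powershell
# Identify the Windows Server 2016 nodes in the mixed cluster (major version 10) that are up.
$ws2016Nodes = Get-ClusterNode | Where-Object { $_.MajorVersion -eq 10 -and $_.State -eq 'Up' }

# Move the file server group owning the deduplicated LUN to one of those nodes only.
# 'FileServerGroup01' is a placeholder for your own cluster group name.
Move-ClusterGroup -Name 'FileServerGroup01' -Node $ws2016Nodes[0].Name
```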

I move through the rolling upgrade as fast as I can to minimize the time frame in which a LUN with data deduplication could end up moving from a Windows Server 2016 to a Windows Server 2012 R2 cluster node.
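Once the last node has been upgraded, I commit the upgrade right away by raising the cluster functional level, which also removes the risk of a LUN ever landing on a down-level node again. Keep in mind that this step cannot be undone:

```powershell
# When all nodes run Windows Server 2016, commit the upgrade. This cannot be reversed afterwards.
Get-ClusterNode | Format-Table Name, MajorVersion, MinorVersion, State
Update-ClusterFunctionalLevel -WhatIf   # preview first
Update-ClusterFunctionalLevel
(Get-Cluster).ClusterFunctionalLevel    # 9 = Windows Server 2016, 8 = Windows Server 2012 R2
```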

Should I need to reverse the cluster operating system rolling upgrade to end up with a Windows Server 2012 R2 cluster again, I’ll make absolutely sure I can restore the data from LUNs with data deduplication from backup and/or a snapshot from a SAN or such. You cannot guarantee that this will work out fine, so be prepared.

For “standard”, non-deduplicated NTFS LUNs you can fail back if needed. When data deduplication is enabled you should try to avoid that and be prepared to restore data if needed.

Final advice is always the same

Even when you have tested your upgrade scenario and made sure your assumptions are correct you must have a way out. And as always, “One is none, two is one”.

As always during such endeavors, you need to make sure that you have a rollback scenario in case things do not work out. You must also have a fail-back plan for when things turn really bad. The rolling upgrade has the ability to return to the original situation built in for most scenarios, but things can go badly wrong and Murphy’s Law does apply. So also have your backups and restores verified, just in case.

The last thing you need after a failed upgrade is to tell your customer or employer that “it almost worked” but that, unfortunately, they’ve lost that 200 TB of continuously available data. “Better luck next time” doesn’t really cut it.

Microsoft Pulled KB4036479 for Windows Server 2012 R2

Nothing like coming back from a holiday to find out that the quality assurance of Windows updates has caused some issues once again. What saved the day here is a great colleague who identified the problem, declined the update in WSUS and removed it from the affected machines. Meanwhile, Microsoft pulled KB4036479 for Windows Server 2012 R2.

KB4036479 was meant to eliminate the restart that occurs during initial machine configuration (IMC) with Windows Server 2012 R2. But after the “successful” update it does the post-install reboot, rolls it back, and that process starts all over again. This happened to Windows Server 2012 R2 VMs both on premises and in Azure IaaS. For now it has been pulled from the Microsoft Update Catalog (https://www.catalog.update.microsoft.com/Search.aspx?q=KB4036479). The issue has been discussed on the forums here.
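For those who run WSUS, declining the pulled update and removing it from affected servers can be scripted along these lines. This is just a sketch: run the WSUS part on the WSUS server and the uninstall on each affected machine.

```powershell
# On the WSUS server: decline the pulled update so it is no longer offered.
Get-WsusUpdate -Approval AnyExceptDeclined |
    Where-Object { $_.Update.Title -match 'KB4036479' } |
    Deny-WsusUpdate

# On each affected Windows Server 2012 R2 machine: remove the update (reboot afterwards).
wusa.exe /uninstall /kb:4036479 /quiet /norestart
```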

Again, it pays to deploy and test Windows updates in a lab or proving-ground environment that mimics your production environment before you let them loose on production. Be practical here and don’t let the desire for a perfect but non-existent lab be the enemy of a good, existing and usable one!

PS: Some people reported issues with KB4038774 as well, but that turned out not to be the case. In any case, these preview updates have no business being installed on production servers and I wish Microsoft would put them in a separate category so they are not detected/downloaded/approved together with other production updates but still allow for easy deployment/use in proving-ground environments.