In-place upgrade of an Azure virtual machine

Introduction

In the cloud it’s all about economies of scale, automation, and wipe-and-redeploy. Servers are cattle, to be destroyed and rebuilt when needed. And “needed” here is not what it was in the past; it has become the default way of working. But not everyone, and not every workload, can achieve that operational level.

There are use cases where I do use in-place upgrades: my own infrastructure, or very well-known environments where I know the history and health of the services. An example of this is my blog’s virtual machine. It currently runs on Azure IAAS: Windows Server 2016, WordPress 4.9.1, MySQL 5.7.20, PHP. It has never been reinstalled.

I have upgraded my WordPress installation many times, for both small incremental and major releases. I did the same for the MySQL instance behind my blog, for PHP, and so on. The principle is always the same: avoid tech debt, security risks and major maintenance outages by maintaining a modern platform that is patched and up to date. That’s the basis of a well-running and secure environment.

In-place upgrade of an Azure virtual machine

In this approach the operating system needs to be updated as well, so my blog went from Windows Server 2012 R2 to Windows Server 2016. That was also an in-place upgrade. The normal way of doing in-place upgrades, from inside the virtual machine, is actually not supported by Azure, and you can shoot yourself in the foot by doing so.

The reason for this is the risk. You do not have console access to an Azure IAAS virtual machine, which means that when things go wrong you cannot fix them. You will have to resort to restoring a backup or other means of disaster recovery. There is also no quick way of applying a checkpoint to the VM to return to a well-known state. Even when all goes well you might lose RDP access (it hasn’t happened to me yet). But even if all goes well, and that normally is the case, you’ll be stuck at the OOBE screen where you need to accept the license terms after an upgrade to Windows Server 2016.

clip_image002

The default upgrade will boot to that screen and you cannot confirm it, as you have no console access. If you have boot diagnostics enabled for the VM you can see the screen, but you still cannot interact with it. Bummer. So what can you do?

Supported way of doing an in-place upgrade of an Azure virtual machine

Microsoft gives you two supported options to upgrade an Azure IAAS virtual machine, described in the article “An in-place system upgrade is not supported on Windows-based Azure VMs”. Both approaches mitigate the risk. The first is actually a migration to a new virtual machine. The second is doing the upgrade locally, on the VHD you download from Azure and then upload again to create a new IAAS virtual machine. Both avoid messing up the original virtual machine and VHD.

Unsupported way of doing an in-place upgrade of an Azure virtual machine

There is one way to do it, but if it goes wrong you’ll have to consider the VM lost. I have tested this approach on a restored backup of the real virtual machine to confirm it works. But it’s not supported, and you assume all risks when you try it.

Mount your Windows Server 2016 ISO in the Windows Server 2012 R2 IAAS virtual machine. Open an administrative command prompt and navigate to the drive letter of the mounted ISO (mine was E:). From there, launch the upgrade as follows:

E:\setup.exe /auto upgrade /DynamicUpdate enable /pkey CB7KF-BWN84-R7R2Y-793K2-8XDDG /showoobe none

The key is the KMS client setup key, so the system can activate, and the /showoobe none parameter is where the “magic” is. The upgrade process will let you navigate through the wizard manually and will look very familiar. But the big thing here is that you told setup not to show the OOBE screen where you accept the license terms, so you won’t get stuck there. So far I have done this about five times and I have never lost RDP access due to the in-place upgrade. So this worked for me. But whatever you do, make sure you have a backup and a way out, ideally multiple ways out!

Note that you can use /quiet to automate things completely. See Windows Setup Command-Line Options.

Nested virtualization can give us console access

Since we now have nested virtualization, you have an option to fix a broken in-place upgrade by getting console access to a nested VM that uses the VHD of the VM whose upgrade failed. See:
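As a sketch of that rescue path (all names and paths below are hypothetical): on a nested-virtualization-capable Azure VM with the Hyper-V role installed, download the failed VM’s OS disk as a VHD, attach it to a local Hyper-V VM, and you do get a console:

```powershell
# Sketch, assuming the Hyper-V role and PowerShell module are installed and
# 'C:\Rescue\brokenvm.vhd' is where you placed the downloaded OS disk.
New-VM -Name 'RescueVM' -MemoryStartupBytes 4GB -Generation 1 -VHDPath 'C:\Rescue\brokenvm.vhd'
Start-VM -Name 'RescueVM'
# Open an interactive console to the nested VM
vmconnect.exe localhost 'RescueVM'
```

Generation 1 matches the VHD format Azure IAAS uses; once fixed, you can upload the VHD again to recreate the VM.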

Conclusion

If Microsoft would give us virtual machine console access, or DRAC/iLO-like capabilities, that would take care of this issue. Having said all that, I know that in-place upgrades of applications, services or operating systems aren’t the cloud way. I also realize that dogmatic purism doesn’t help in a lot of scenarios, so if I can help people leverage Azure even when they have “pre-cloud” needs, I will, as long as it doesn’t expose them to unmanaged risk. So while I don’t recommend this, you can try it if it’s the only option you have available for your situation. Make sure you have a way out.

IAAS has progressed a tremendous amount over the last couple of years. It still has to get on par with capabilities we have not only become accustomed to but learned to appreciate over the years. But it’s moving in the right direction, making it a valid choice for more use cases. As always when doing cloud, don’t just copy-paste; seek the best way to handle your needs.

CBT driver with Veeam Agent for Microsoft Windows 2.1

Change Block Tracking comes to physical & IAAS Veeam Backups

With the big improvements and new capabilities delivered in Veeam Backup & Replication 9.5 Update 3 come some interesting features related specifically to the Veeam Agent for Windows 2.1 Server Edition. We now get the ability to manage the Veeam Agent centrally from within the VBR 9.5 Update 3 console or via PowerShell. This includes deploying the new Change Block Tracking (CBT) driver for Windows Server (not Linux).

This CBT driver is optional and works like you have come to expect from Veeam VBR when backing up Hyper-V virtual machines pre-Windows Server 2016. Windows Server 2016 has its own CBT capability, which Veeam VBR 9.5 leverages. The big thing here is that you now get CBT capabilities for physical or virtual in-guest workloads (that includes IAAS, people!) with Veeam Agent for Windows 2.1 Server Edition.

Deploying the Veeam Agent for Microsoft Windows 2.1 CBT Driver

The Veeam Agent for Windows 2.1 ships with an optional, signed change block tracking filter driver for Windows servers. That agent is included in your VBR 9.5 Update 3 download, or you can choose to download a version that does not include the CBT driver. That’s up to you. I upgraded my lab and production environments with the agent included, as I might have a use for it. If not now, then later; at least my environment is ready.

clip_image001

When you have installed VAW 2.1 you can navigate to C:\Program Files\Veeam\Endpoint Backup\CBTDriver and find the driver files there for the supported Windows Server OS versions, each under its own folder.

clip_image003

As you can see in the screenshot above, we have CBT drivers for every version of Windows Server back to Windows Server 2008 R2. If you are running anything older, we really need to talk about your environment in 2018. I mean it.

Note that right-clicking the .inf file for your version of Windows Server and selecting Install is the most manual way of installing the CBT driver. You’ll need to reboot the host afterwards.
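If you prefer the command line over right-clicking, pnputil (Windows 10 / Server 2016 syntax) can install an .inf as well. The folder and file names below are hypothetical; check the CBTDriver folder for the exact ones for your OS version:

```powershell
# Hypothetical example; substitute the actual subfolder and .inf name for your OS
pnputil.exe /add-driver "C:\Program Files\Veeam\Endpoint Backup\CBTDriver\WS2016\VeeamCBT.inf" /install
# A reboot is still required before the driver is actually used
Restart-Computer -Confirm
```

On older Windows Server versions the equivalent is the legacy `pnputil -i -a <path>` syntax.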

clip_image005

Normally you’ll either integrate the deployment and updating of the CBT driver into the VBR 9.5 Update 3 console, or you’ll deploy and update the CBT driver manually.

Install / uninstall the CBT driver via Veeam Backup & Replication Console

You can add servers individually or as part of a protection group (Active Directory based). Whichever option you choose, you can manage them via the agent manually or via the VBR server. Once you have done that, you can deploy and update the optional CBT driver for supported Windows Server versions on individual servers or on protection groups.

Individual Server

clip_image007

Once the agent is installed you can optionally install the CBT driver. When that’s done, you can also uninstall the CBT driver and the agent from the VBR console.

clip_image009

Protection Group

You can add servers to protect via VAW 2.1 individually, via Active Directory (a domain, organizational unit, container, computer, cluster or group) or via a CSV file with server names or IP addresses. That’s another subject actually, but you get the gist of what a protection group is.
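A minimal example of such a CSV file, with hypothetical names, would simply list one server name or IP address per line:

```
fileserver01.domain.local
sqlserver02.domain.local
10.10.100.15
```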

clip_image011

Checking the CBT driver version

You can always check the CBT driver version via the details of a server added to the physical or cloud infrastructure.

clip_image012

Install / uninstall the CBT driver via the standalone Veeam Agent for Windows

My workstation at home isn’t managed by a Veeam Backup & Replication 9.5 Update 3 server; it’s a standalone system. But it does run Windows Server 2016. Now, even though such a standalone system can send its backups to a Veeam repository, I don’t do that at home. The target is a local disk in a disk bay that I can easily swap out every week. I just rotate through a couple of recuperated larger HDDs for this purpose, which also allows me to take a backup copy off site. The Veeam Agent for Windows configuration for my home office workstation is done locally, including the installation of the CBT driver. Doing so is easy. Under settings in VAW 2.1 we now have a third entry that’s there to install the CBT driver if we want to.

clip_image014

When you click install it will be done before you can even blink. It will prompt you to restart the computer to finish installing the driver. Do so. If not, the next backup will complain about failing over to MFT-analysis-based incremental backups, as the installed CBT driver can’t be used yet.

clip_image016

When using the new VAW 2.1 CBT driver for Windows, changes get tracked in VCT files. These can be seen under C:\ProgramData\Veeam\EndpointData\CtStore\VctStore.
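You can quickly list those change tracking files with PowerShell (run elevated, as the ProgramData ACLs may block reading):

```powershell
# List the change tracking store with file sizes and timestamps
Get-ChildItem 'C:\ProgramData\Veeam\EndpointData\CtStore\VctStore' -Recurse |
    Select-Object Name, Length, LastWriteTime
```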

clip_image018

Ready to Go

I’ll compare the results of backing up my main workhorse with and without the CBT driver installed. Veeam indicates the use case is servers with a lot of data churn, and that’s where you should use the driver. The idea is that you don’t need to deal with updating drivers when the benefits are not there. That’s fair enough, I’d say, but I’m going to experiment a little with them anyway to see what difference I can notice without resorting to a microscope.

If we conclude that having the CBT driver installed is not worthwhile for our workstation, we can easily uninstall it again via the control panel, under settings, where we now see the option to uninstall it. Easy enough.

clip_image020

However, as it can track changes on NTFS as well as ReFS and FAT partitions, it might be wise to use it for servers that have one or more of such volumes, even when for NTFS volumes the speed difference isn’t that significant. Normally, the bigger the data churn delta, the bigger the benefit of the CBT driver.

Spectre and Meltdown

Introduction

While working on an Active Directory upgrade, a couple of backup repository replacements, a cluster hardware upgrade, a few hybrid cloud projects and a couple of all-flash array migrations, we got word of what we already suspected due to the massive maintenance announced in December in AWS and Azure around our IAAS: Spectre and Meltdown are upon us. I had actually done all the maintenance proactively in Azure. So naturally I wondered about the hardware we own: clients, servers, load balancers, storage, etc. All driven by CPUs. Darn, that will wake you up in the morning. Now, to put the doom and gloom in perspective, you can read Meltdown and Spectre exploits: Cutting through the FUD, which will help you breathe normally again if you were panicking before.

Getting to work on the problem

Whilst prepping our plan of action for deploying the patches, checking anti-virus compatibility, and lining up firmware and BIOS updates, I noticed that the CISO / DPO were missing in action. They must have learned we are normally on top of such things and have gotten better at not going along with too much FUD. Bar all the FUD and the incredible number of roadside scholars in CPU design that appeared online to discuss CPU designs, we’ve worked out a plan.

The info out there is a bit messy and distributed, but I have a clear picture now of what to do. I’m not going to rehash all the information out there, but here are a couple of things I noticed that can cause issues or be confusing.

Microsoft mentioned the patch would not be installed on systems that don’t have the registry key set to indicate there is no compatibility issue with the anti-virus software (see Important information regarding the Windows security updates released on January 3, 2018 and anti-virus software):

Note: Customers will not receive these security updates and will not be protected from security vulnerabilities unless their anti-virus software vendor sets the following registry key:

Key="HKEY_LOCAL_MACHINE"
Subkey="SOFTWARE\Microsoft\Windows\CurrentVersion\QualityCompat"
Value="cadca5fe-87d3-4b96-b7fb-a231484277cc"
Type="REG_DWORD"

This seems not to be working that well. We saw the CU being downloaded and installs being attempted on systems with anti-virus that did not have the registry key set. That shouldn’t happen, right? We killed the update service to prevent the install, but luckily it seems our anti-virus solution is compatible (see https://kc.mcafee.com/corporate/index?page=content&id=KB90167 – we have McAfee Enterprise 8.8 patch 10). Now, the above article only mentions this registry setting for AV on Windows Server 2012 R2 / Windows 8.1 or lower, but the OS-specific patches also mention it for Windows Server 2016 / Windows 10. See https://support.microsoft.com/en-us/help/4056890 (Windows Server 2016) for an example. Right… One of my MVP colleagues also had to add this registry entry manually (or scripted) and needed to download the latest Defender definitions for the update to be offered to him. So it’s not just 3rd-party anti-virus. Your mileage may vary a bit here, it seems.
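If you need to set that key yourself, a sketch in PowerShell (normally your anti-virus vendor sets this; only set it manually when you have verified compatibility yourself):

```powershell
$path = 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\QualityCompat'
# Create the key if it does not exist yet
if (-not (Test-Path $path)) { New-Item -Path $path -Force | Out-Null }
# The value name is the documented GUID; the data must be a REG_DWORD of 0
New-ItemProperty -Path $path -Name 'cadca5fe-87d3-4b96-b7fb-a231484277cc' `
    -Value 0 -PropertyType DWORD -Force | Out-Null
```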

There is a great community effort on anti-virus compatibility info here https://docs.google.com/spreadsheets/d/184wcDt9I9TUNFFbsAVLpzAtckQxYiuirADzf3cL42FQ/htmlview?sle=true#gid=0

Anyway, after a reboot you can verify that the mitigations are active via a PowerShell module you can download:

Install-Module SpeculationControl

Set-ExecutionPolicy -Scope Process -ExecutionPolicy unrestricted

Import-Module SpeculationControl

Get-SpeculationControlSettings

On Windows 10, installing the patches is enough. You only need the extra registry keys mentioned in the article Windows Server guidance to protect against speculative execution side-channel vulnerabilities if you want to toggle the mitigations on or off. You do need to set them on Windows Server! These registry changes require a reboot to become effective.

#To disable the mitigations

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride /t REG_DWORD /d 3 /f

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverrideMask /t REG_DWORD /d 3 /f

After a reboot you can see the mitigations are disabled. That allows you to work around issues they might cause; it’s better than uninstalling the update.

clip_image002

If you want to enable them for the first time on Windows Server, or again after disabling them on Windows Server or Windows 10, do as follows:

#To enable the mitigations. Note that the FeatureSettingsOverrideMask value remains 3; it’s not changed, just left in place.

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride /t REG_DWORD /d 0 /f

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverrideMask /t REG_DWORD /d 3 /f
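To check what is currently configured without a reboot, you can read the values back in PowerShell (they only exist after you have set them at least once):

```powershell
# Read the current speculative execution mitigation override values
Get-ItemProperty 'HKLM:\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management' |
    Select-Object FeatureSettingsOverride, FeatureSettingsOverrideMask
```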

clip_image004

Do find the BIOS/firmware updates you need from your OEMs and install them as well. If you don’t, the output won’t be as nice and green as the above. I’m running this on a Dell XPS 13 and that BIOS was patched a few days ago. Dell has been prepping for this since before we ever knew about Spectre and Meltdown. Here’s an example: http://www.dell.com/support/home/be/nl/bedhs1/drivers/driversdetails?driverId=MXXTN

Here is more guidance from Microsoft for Hyper-V hosts, virtual machines and SQL Server installations.

This is not the end

This is not the end of the world, but it’s also not the end of the issue. CERT was very clear, probably due to Spectre, that the real fix is to replace the CPUs. So the heat is on, and this might very well lead to an accelerated replacement of systems at the bigger cloud players and large hosters. They can’t afford to lose 20 to 30% of CPU performance. I have no insight into the impact they’re seeing (they might share that in the future), but it could ruin their economies of scale. There’s always money to be made from a crisis, right? In my own testing I have seen anything from zero-to-marginal up to a 5-20% loss, the latter with SQL workloads. It all depends on what the workload is asking the CPU to do, I guess, and whether that leverages speculative execution. It’s not going to be that bad for most people. But for the cloud players, who really max out their systems, this might have a bigger impact, and their risk profile (multi-tenant, shared resources) plus the potential rewards for hacking them make it a big concern and a top priority for them. But we have done our homework and can mitigate the issue.

Replacing consumer devices and all the IoT devices, sensors, domotics, cars, etc. out there is another matter. I’d be happy to see them all get patched in a reasonable time, but that’s a pipe dream judging from the current state of things. Security is going to get a whole lot worse before it gets any better.

And finally, hats off to the Google engineers who found these issues. Technically it’s quite fascinating, even when explained at my level (I am not a CPU designer, so I need the Barney Bear version).

Goodbye 2017, welcome 2018!

Professional

We’ve been back at work for a couple of days now in 2018. Another year flew by. In 2017 we were busy letting Windows Server 2016 and Azure-based solutions shine across the entire stack of compute, storage and networking. This allows various services and applications to deliver the best possible results on-premises and in the (hybrid) cloud / IAAS. In other words, pretty much what you’d expect a Microsoft Cloud & Datacenter MVP to be doing. The amount of time and money saved by having context, a goal and a plan to evolve your IT environment incrementally and keep it current cannot be ignored. It is significant.

Cloud-wise, that word seems to have become whatever vendors, consultants and businesses want it to mean. Very convenient. There are (too) many ways in which cloud solutions are defined, with the various forms of cloud appearing in an ever-bigger set of permutations. In the end it’s about the best possible solution for the needs at hand at the time you make the decision. Things change fast and we just try to make reasonably future-proofed decisions. Sometimes that will be pure “serverless” such as Azure Functions. Sometimes it is still very much some form of on-premises virtualization, or anything in between. It doesn’t matter. All you need to do is support the organization to deliver what’s needed cost-effectively and efficiently. In reality today, that means a lot of options are available. As long as you have a plan, don’t paint your business into a corner in any way, and can manage the governance of what you’re doing, it’s not too shabby. Especially not when compared to so many that are just being sold to and are just “doing” stuff in a copy/paste effort to avoid looking like they’re not on board with the buzz of “digital transformation”.

When I wear my CTO hat, but perhaps even more the CEO/CIO/CFO one, I really do like “serverless”. The big risk to manage today is the utter and gigantic lock-in this is, and will remain, for a while. That needs to be managed, and can be managed, as long as the benefits are there. Interestingly enough, many CxOs still haven’t woken up to the biggest potential benefit, one I almost never see mentioned. Too often people have blinders on. Whether this is due to corporate inertia, political pressure or the tyranny of action driving them to seek out point solutions, it is a widely spread issue, a challenge we all face and one that will be with us for quite a while.

Anyway, the world is getting more mixed and diverse than ever, and guiding organizations through that ever-faster-changing landscape is part of the job. Telling management what they need to hear over what they want to hear is part of that. Sometimes that works out well, sometimes not. The advice is always solid, but while I can lead the horse to water, I cannot make it drink. One thing is for sure: you cannot build standards, frameworks or platforms to success. Success cannot be bought either. Those are tools to manage past success more efficiently and hopefully more effectively.

Personal

My blog is still growing and, at well over 1 million views in 2017 according to my WordPress statistics, it’s doing better than ever at helping out the community at large. I’m quite happy about that. That’s what being a Microsoft MVP, a Veeam Vanguard, a Dell Community Rockstar and a Microsoft Extended Experts Team member is all about: sharing and learning via blogging and community activities. I was able to present at and attend various conferences to share experiences and knowledge. In return I got the opportunity to learn a lot and gain some insight into technical and market trends and developments. As part of those community activities I had more “practical” and “informal” strategy talks with various CIO/CTO role holders than before, and that gave me a lot of insight into how businesses are dealing with IT anno 2017. It fluctuates between struggling and thriving, quite often within the same organization. My strength in those talks is that I don’t sell anything. No products, no company, not my time. It’s just a conversation based on insights, ideas, interests and community activity.

I’ve helped out a lot of colleagues and community buddies this year. That’s good. It helps you look beyond the blinders of your own environment, and that is always a great learning experience. It keeps the work interesting and fun while being productive. It’s a win-win situation.

Real Personal

Once in a while I raise a toast to absent friends and fallen comrades. That list got a bit longer again this year, as I lost my father. He had been through many personal health-related challenges over many decades. That means I took care of my father for more, and longer, than he did of me. That’s OK; one does what’s needed when and where needed, even when one often feels it can never be enough. Life is not just about personal choices that are 100% under our own control, no matter how much people like to pretend it is. He doesn’t have to be afraid and in pain anymore. In that knowledge I have found solace. Keep in mind life is short and you don’t choose how you’ll end up. So enjoy all your vacation days and learn to unwind once in a while. The world will keep turning despite you being out of office now and then.

Work

One thing was also a trend in 2017: more and more, the recognition of my skills, value and abilities comes from ever more diverse and globally dispersed directions. I can speculate on why that is, but I won’t. For now, I’m happy to know I’ve managed to stay employable. 2017 was also the year in which recruiters were more active than ever, as the war for talent is on. The job market for skilled IT personnel is on fire, it seems. A few recruiters did a good job by at least presenting themselves as having a clue, but many were so out of touch with reality that it seemed they could be replaced in a heartbeat by AI at the current state of that technology.

A few tips to all recruiters:

  • Read my LinkedIn profile, blogs, etc. It’s all out there. No, I’m not a Java programmer or a helpdesk jockey. The canned intros of “I read your profile and found you to be a great fit…” for completely mismatched functions are way too often laughable and give me a very poor impression of the recruiter right from the start.
  • Give a good impression of the job role so I can quickly evaluate whether it’s something I’d even consider. When that info is on the internet and you won’t share it, it does not make you look good.
  • Include a realistic minimum / maximum level of pay, secondary benefits, the ability to telecommute… If that’s not even possible, how can the role ever hold the status of an “opportunity of a lifetime”, and why would I waste my time figuring out whether I’d be earning less or losing primary and/or secondary benefits? Be real, people. Really, no one is looking to be worse off.
  • Don’t get mad at me if you contact me unsolicited and don’t get an answer when you don’t even have the courtesy to provide the above. I never asked you anything, remember that please. And if I do, I hope I’ll be more to the point and direct in what I expect and need. I really dislike wasting my time and yours.

Really, seriously, I mean this: the silly games that are way too often played are not worth my time. In a war for talent this is a complete waste, and it indicates that the job market is (still) very much broken. Long commutes and fixed office hours that require long public transport rides or being stuck in traffic, even in a salary or company car, are a waste of time and money. I don’t consider that a benefit. That being said, we can all hit bad times, and I might need to take a pay cut, commute a long way, lose benefits, etc. at some point in my life. But no one does this voluntarily. I have seen many of my peers grow into better-paid, interesting roles in 2017 on their own. Word of mouth and a network are key.

Looking forward to 2018

I have storage-related projects in 2018, both on-premises and in the cloud, or a mix of both. These combine high availability with data protection, scalability and accessibility in the best possible way I can find given the circumstances. I’m busy with strategies and mapping. By doing so I keep learning, to help transition IT between the world of yesterday, today and tomorrow. Step by step. I also hope to attend and speak at two or three international conferences and learn even more from my peers.

So, goodbye 2017, welcome 2018! The new year is at our doorstep and we’ll let it in with a smile. I wish you all a great, fun, healthy and prosperous 2018.