Spectre and Meltdown

Introduction

While working on an Active Directory upgrade, a couple of backup repository replacements, a cluster hardware upgrade, a few hybrid cloud projects and a couple of all-flash array migrations, we got word of what we already suspected, given the massive maintenance announced in December in AWS and Azure around our IaaS: Spectre and Meltdown are upon us. I actually did all the maintenance proactively in Azure. So naturally I wondered about the hardware we own: clients, servers, load balancers, storage, etc. All driven by CPUs. Darn, that will wake you up in the morning. Now, to put the doom and gloom in perspective, you can read “Meltdown and Spectre exploits: Cutting through the FUD”, which will help you breathe normally again if you were panicking before.

Getting to work on the problem

Whilst prepping our plan of action for deploying the patches and checking anti-virus compatibility, firmware and BIOS updates, I noticed that the CISO / DPO are missing in action. They must have learned we are normally on top of such things and have gotten better at not going along with too much FUD. Bar all the FUD and the incredible number of roadside scholars in CPU design that appear online to discuss CPU designs, we’ve worked out a plan.

The info out there is a bit messy and scattered, but I now have a clear picture of what to do. I’m not going to rehash all the information out there, but here are a couple of things I noticed that can cause issues or be confusing.

Microsoft mentioned the patch would not be installed on systems that don’t have the registry key set to indicate there is no compatibility issue with the anti-virus software (see “Important information regarding the Windows security updates released on January 3, 2018 and anti-virus software”).

Note: Customers will not receive these security updates and will not be protected from security vulnerabilities unless their anti-virus software vendor sets the following registry key:

Key="HKEY_LOCAL_MACHINE" Subkey="SOFTWARE\Microsoft\Windows\CurrentVersion\QualityCompat" Value="cadca5fe-87d3-4b96-b7fb-a231484277cc" Type="REG_DWORD"

This does not seem to be working that well. We saw the CU being downloaded and installs attempted on systems with anti-virus that did not have the registry key set. Shouldn’t happen, right? We killed the update service to prevent the install, but luckily it seems our anti-virus solution is compatible (see https://kc.mcafee.com/corporate/index?page=content&id=KB90167; we have McAfee Enterprise 8.8 patch 10). Now, the above article only mentions this registry setting for AV on Windows Server 2012 R2 / Windows 8.1 or lower, but the OS-specific patches also mention it for Windows Server 2016 / Windows 10. See https://support.microsoft.com/en-us/help/4056890 (Windows Server 2016) for an example. Right … One of my MVP colleagues also had to add this registry edit manually (or scripted) and needed to download the latest Defender definitions for the update to be offered to him. So it seems this is not just about 3rd party anti-virus. Your mileage may vary a bit here, it seems.
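
If you want to see whether that compatibility value is actually present on a machine before wondering why the update is (or isn’t) being offered, something like the following minimal sketch can help. It only reads the key and value name quoted in the Microsoft note above; the variable names and the output wording are my own, and it makes no attempt to set anything.

```powershell
# Registry location and value name as quoted in the Microsoft note above.
$path = 'HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\QualityCompat'
$name = 'cadca5fe-87d3-4b96-b7fb-a231484277cc'

# With the error suppressed, nothing is returned when the key or the value is missing.
$entry = Get-ItemProperty -Path $path -Name $name -ErrorAction SilentlyContinue

if ($null -ne $entry) {
    Write-Output "Compatibility value present on $env:COMPUTERNAME (data: $($entry.$name))"
} else {
    Write-Output "Compatibility value missing on $env:COMPUTERNAME"
}
```

This is read-only on purpose. Whether you should add the value yourself depends on your anti-virus vendor confirming compatibility, so treat it as a diagnostic aid, not a fix.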

There is a great community effort on anti-virus compatibility info here https://docs.google.com/spreadsheets/d/184wcDt9I9TUNFFbsAVLpzAtckQxYiuirADzf3cL42FQ/htmlview?sle=true#gid=0

Anyway after a reboot you’ll see that the mitigations are active via a PowerShell module you can download.

Install-Module SpeculationControl

$SaveExecutionPolicy = Get-ExecutionPolicy

Set-ExecutionPolicy RemoteSigned -Scope Currentuser

Import-Module SpeculationControl

Get-SpeculationControlSettings

#When you’re done reset the execution policy …

Set-ExecutionPolicy $SaveExecutionPolicy -Scope Currentuser

On Windows 10, installing the patches is enough. There you only need the extra registry keys mentioned in the article “Windows Server guidance to protect against speculative execution side-channel vulnerabilities” if you want to toggle the mitigations on or off. You do need to set them on Windows Server! These registry changes require a reboot to become effective.

#To disable the mitigations

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride /t REG_DWORD /d 3 /f

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverrideMask /t REG_DWORD /d 3 /f

After a reboot you can see the mitigations are disabled. That allows you to work around any issues they might cause. It’s better than uninstalling the update.

[Screenshot: Get-SpeculationControlSettings output with the mitigations disabled]

If you want to enable them for the first time on Windows Server, or again after disabling them on Windows Server or Windows 10, do as follows:

#To enable the mitigations. Note that the FeatureSettingsOverrideMask value does remain 3, it’s not changed just left in place.

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride /t REG_DWORD /d 0 /f

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverrideMask /t REG_DWORD /d 3 /f
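
To check which of the two states a machine is currently configured for, you can read both values back. The sketch below is just that, a sketch: the function name and the labels it returns are made up for illustration, and it only recognizes the two combinations used in this post (3/3 for disabled, 0/3 for enabled).

```powershell
# Hypothetical helper: maps the two registry values to a label, using only the
# combinations described in this post (3/3 = disabled, 0/3 = enabled).
function Get-MitigationOverrideState {
    param($Override, $Mask)

    if ($null -eq $Override -or $null -eq $Mask) { return 'NotConfigured' }
    if ($Mask -eq 3 -and $Override -eq 3) { return 'Disabled' }
    if ($Mask -eq 3 -and $Override -eq 0) { return 'Enabled' }
    return 'Other'
}

# Read the current values; properties that do not exist simply come back as $null.
$path = 'HKLM:\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management'
$mm = Get-ItemProperty -Path $path -ErrorAction SilentlyContinue

Get-MitigationOverrideState -Override $mm.FeatureSettingsOverride -Mask $mm.FeatureSettingsOverrideMask
```

Keep in mind this only tells you what has been requested in the registry. The effective state after the reboot is what Get-SpeculationControlSettings reports.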

[Screenshot: Get-SpeculationControlSettings output with the mitigations enabled]

Do find the BIOS/firmware updates you need from your OEMs and install them as well. If you don’t, the output won’t be as nice and green as the above. I’m running this on a Dell XPS 13 and that BIOS was patched a few days ago. Dell has been prepping for this since before we ever knew about Spectre and Meltdown. Here’s an example: http://www.dell.com/support/home/be/nl/bedhs1/drivers/driversdetails?driverId=MXXTN

While you are waiting for the BIOS update or testing the impact, there is another way to help protect your Hyper-V hosts against VM to VM or VM to host attacks: “Alternative protections for Windows Server 2016 Hyper-V Hosts against the speculative execution side-channel vulnerabilities”.

Microsoft has published more guidance for Hyper-V hosts, virtual machines and SQL Server installations.

This is not the end

This is not the end of the world, but it’s also not the end of the issue. CERT was very clear, probably due to Spectre, that the real fix was to replace the CPUs. So the heat is on and this might very well lead to an accelerated replacement of systems at the bigger cloud players and large hosters. They can’t afford to lose 20 to 30% of CPU performance. I have no insight into the impact they’re seeing, they might share that in the future, but it could ruin their economies of scale. There’s always money to be made from a crisis, right? But I have seen anything from zero or marginal up to a 5-20% loss in testing, the latter with SQL workloads. It all depends on what the workload is asking the CPU to do, I guess, and whether that leverages speculative execution. It’s not going to be that bad for most people. But for the cloud players who really max out their systems this might have a bigger impact, and their risk profile (multi-tenant, shared resources) plus the potential rewards for hacking them make it a big concern and top priority for them. But we have done our homework and can mitigate the issue.

Replacing consumer devices and all the IoT, sensors, domotics, cars, etc. out there is another matter. I’d be happy to see them all get patched in a reasonable time, but that is also a pipe dream and isn’t likely, judging from the current state of things. Security is going to get a whole lot worse before it gets any better.

And finally, hats off to the Google engineers who found the issues. Technically it’s quite fascinating, even when explained at my level (I am not a CPU designer, so I need the Barney Bear version).

Goodbye 2017, welcome 2018!

Professional

We’ve been back at work for a couple of days now in 2018. Another year flew by. In 2017 we were busy letting Windows Server 2016 and Azure based solutions shine across the entire stack of compute, storage and networking. This allows various services and applications to deliver the best possible results on-premises and in the (hybrid) cloud / IaaS. In other words, pretty much what you’d expect a Microsoft Cloud & Datacenter MVP to be doing. The amount of time and money saved by having context, a goal and a plan to evolve your IT environment incrementally and keep it current cannot be ignored. It is significant.

Cloud wise, that word seems to have become whatever vendors / consultants / businesses want it to mean. Very convenient. There are (too) many ways in which cloud solutions are defined, with the various forms of cloud coming in an ever-bigger set of permutations. In the end it’s about the best possible solution for the needs at hand at the time you make the decision. Things change fast and we just try to make reasonably future-proof decisions. Sometimes that will be pure “serverless” such as Azure Functions. Sometimes it is still very much some form of on-premises virtualization, or anything in between. It doesn’t matter. All you need to do is support the organization to deliver what’s needed cost effectively and efficiently. In reality today, that means a lot of options are available. As long as you have a plan, don’t paint your business into a corner in any way and can manage the governance of what you’re doing, it’s not too shabby. Especially not when compared to so many that are just being sold to and are just “doing” stuff in a copy / paste effort to avoid looking like they’re not on board with the buzz of “digital transformation”.

When I wear my CTO hat, but perhaps even more the CEO/CIO/CFO one, I really do like “serverless”. The big risk to manage today is the utter and gigantic lock-in this is and will remain for a while. That needs to be managed and can be managed as long as the benefits are there. Interestingly enough, many CxOs still haven’t woken up to the biggest potential benefit, one I almost never see mentioned. Too often people have blinders on. Whether this is due to corporate inertia, political pressure or the tyranny of action driving them to seek out point solutions, it is a widespread issue, a challenge we all face and one that will be with us for quite a while.

Anyway, the world is getting more mixed and diverse than ever, and guiding organizations through that ever faster changing landscape is part of the job. Telling management what they need to hear over what they want to hear is part of that. Sometimes that works out well, sometimes not. The advice is always solid. But while I can lead the horse to water, I cannot make it drink. One thing is for sure: you cannot build standards, frameworks or platforms to success. Success cannot be bought either. Those are tools to manage past success more efficiently and hopefully more effectively.

Personal

My blog is still growing and, at well over 1 million views in 2017 according to my various WordPress statistics, it’s doing better than ever in helping out the community at large. I’m quite happy about that. That’s what being a Microsoft MVP, a Veeam Vanguard, a Dell Community Rockstar and a Microsoft Extended Experts Team member is all about: sharing and learning via blogging and community activities. I was able to present at and attend various conferences to share experiences and knowledge. In return I got the opportunity to learn a lot and gain some insights into technical and market trends and developments. As part of those community activities I had more “practical” and “informal” strategy talks with various CIO/CTO role holders than before, and that gave me a lot of insight into how businesses are dealing with IT anno 2017. It fluctuates between struggling and thriving, quite often in the same organization. My strength in those talks is that I don’t sell anything. No products, no company, not my time. It’s just a conversation based on insights, ideas, interests and community activity.

I’ve helped out a lot of colleagues & community buddies this year. That’s good. It helps to look beyond the blinders of one’s own environment and that is always a great learning experience. This helps to keep work interesting and fun while being productive. It’s a win-win situation.

Real Personal

Once in a while I raise a toast to absent friends and fallen comrades. That list got a bit longer again this year as I lost my father. He had been through many personal health related challenges for many decades. That means I have taken more and longer care of my father than he did of me. That’s OK, one does what’s needed when and where needed, even when one often feels that can never be enough. Life is not just about personal choices that are 100% under our own control, no matter how much people like to pretend it is. He doesn’t have to be afraid and in pain anymore. In that knowledge I have found solace. Keep in mind life is short and you don’t choose how you’ll end up. So, enjoy all your vacation days and learn to unwind once in a while. The world will keep turning despite you being out of office now and then.

Work

One more trend stood out in 2017. More and more, the recognition of my skills, value and abilities comes from ever more diverse and globally dispersed directions. I can speculate on why that is, but I won’t. For now, I’m happy to know I’ve managed to stay employable. 2017 was also the year in which recruiters were more active than ever as the war for talent is on. The job market for skilled IT personnel is on fire, it seems. A few recruiters did a good job by at least presenting themselves as having a clue, but many were so out of touch with reality that it seemed they could be replaced in a heartbeat by AI at the current state of that technology.

A few tips to all recruiters:

  • Read my LinkedIn profile & blogs etc. It’s all out there. No, I’m not a Java programmer or a helpdesk jockey. The canned intros of “I read your profile and found you to be a great fit …” for completely mismatched functions are way too often laughable and give me a very poor impression of the recruiter right from the start.
  • Give a good impression of the job role so I can evaluate quickly if it’s something I’d even consider. When that info is on the internet and you won’t share it, that does not make you look good.
  • Include a realistic minimum / maximum level of pay, secondary benefits, ability to telecommute … If that’s not even possible, how can that role ever hold the status of an “opportunity of a lifetime”, and why would I waste my time on figuring out if I’d be earning less or losing primary and/or secondary benefits? Be real, people. Really, no one is looking to be worse off.
  • Don’t get mad at me if you contact me unsolicited and don’t get an answer when you don’t even have the courtesy to provide the above. I never asked you anything, remember that please. And if I do, I hope I’ll be more to the point and direct in what I expect and need. I really dislike wasting my time and yours.

Really, seriously, I mean this: the silly games that are way too often played are not worth my time. In a war for talent this is a complete waste and indicates that the job market is (still) very much broken. Fixed office hours that require long public transport commutes or being stuck in traffic, even in a salary or company car, are a waste of time and money. I don’t consider that a benefit. That being said, we all can hit bad times and I might need to take a pay cut, commute a long way, lose benefits etc. in my lifetime. But no one does this voluntarily. I have seen many of my peers grow into better paid, interesting roles in 2017 on their own. Word of mouth & a network are key.

Looking forward to 2018

I have storage related projects in 2018, both on-premises and in the cloud, or a mix of both. They combine high availability with data protection, scalability and accessibility in the best possible way I can find given the circumstances. I’m busy with strategies and mapping. By doing so I keep learning to help transition IT between the world of yesterday, today and tomorrow. Step by step. I also hope to attend and speak at 2 or 3 international conferences and learn even more from my peers.

So, goodbye 2017, welcome 2018! The new year is at our doorstep and we’ll let it in with a smile. I wish you all a great, fun, healthy and prosperous 2018.

Veeam Availability Suite 9.5 Update 3 Released

Veeam Availability Suite 9.5 Update 3 was released just in time to put under the X-Mas tree of your IT staff. It’s a major release and they’ve gone above and beyond what you’d expect from an update. Also read this post to see all the goodness it offers. It’s clear Veeam has the intent to become the backup provider for even more environments, whether these are physical, virtual or containers and whether these are on-premises or in private, hybrid or public clouds. Read up on some details in this blog post by Rick Vanover on the Veeam web site.

As a Veeam Vanguard I did not waste time and downloaded the updates to test the deployment in the lab and that went fine. From there it went to the proving grounds (real life hardware & labs) before deploying it in production. All of these deployments went super smooth.

I’m happy to report Veeam Backup & Replication 9.5 and Veeam ONE 9.5 Update 3 are running very well in production and I have not seen any issues yet. In case you’re wondering or nervous about it, the SAN deployments with Off Host Proxies didn’t miss a beat. That’s the quality and smooth experience of Veeam that we have come to rely on. I’m still impressed, even after many updates, with how smoothly it goes and how streamlined the update process is.

Two more things. You only have a few days left to enter for a chance to win a fully paid trip to a VeeamON 2018 Event in your region. So act now if you’d like to attend for free. You have 2 days left!

It’s also still possible to nominate yourself or someone else for the Veeam Vanguard program. Don’t delay! You have until December 29th!

Veeam Vanguard Program

 

That’s it. Time for some down time for the end of year & new year festivities.

Hyper V Amigos Showcast 15 – VMFleet explained

After a short break in broadcasts, due to extremely busy times, the Hyper-V Amigos ride again! Yes, fellow Microsoft MVP Carsten Rachfahl (@hypervserver) from Rachfahl IT Solutions and yours truly are back in the saddle!

In episode 15 of the Hyper-V Amigos Showcast we look at VMFleet. VMFleet is a free Microsoft test suite to show, test and help analyze the performance of a hyper-converged Storage Spaces Direct setup.

In this episode, “Hyper V Amigos Showcast 15 – VMFleet explained”, we show you how to set it up, how it works and how to measure and evaluate your S2D deployment with it. We also share some real life tips with you. All this in just around 2 hours. So stick around for the long haul!