Spectre and Meltdown

Introduction

While working on an Active Directory upgrade, a couple of backup repository replacements, a cluster hardware upgrade, a few hybrid cloud projects and a couple of all-flash array migrations, we got word of what we already suspected from the massive maintenance that AWS and Azure announced in December for our IaaS: Spectre and Meltdown are upon us. I actually did all the maintenance proactively in Azure. So naturally I wondered about the hardware we own: clients, servers, load balancers, storage etc. All driven by CPUs. Darn, that will wake you up in the morning. Now, to put the doom and gloom in perspective, you can read “Meltdown and Spectre exploits: Cutting through the FUD”, which will help you breathe normally again if you were panicking before.

Getting to work on the problem

Whilst prepping our plan of action for deploying the patches and checking anti-virus compatibility, firmware and BIOS updates, I did notice that the CISO / DPO are missing in action. They must have learned we are normally on top of such things and have gotten better at not going along with too much FUD. Bar all the FUD and the incredible number of roadside scholars in CPU design that appear online to discuss CPU designs, we’ve worked out a plan.

The info out there is a bit messy and distributed. But I have a clear picture now on what to do. I’m not going to rehash all the information out there but here are a couple of things I noticed that can cause issues or be confusing.

Microsoft mentioned the patch would not be installed on systems that don’t have the registry key set to indicate there is no compatibility issue with the anti-virus software (see “Important information regarding the Windows security updates released on January 3, 2018 and anti-virus software”).

Note: Customers will not receive these security updates and will not be protected from security vulnerabilities unless their anti-virus software vendor sets the following registry key:

Key="HKEY_LOCAL_MACHINE" Subkey="SOFTWARE\Microsoft\Windows\CurrentVersion\QualityCompat" Value="cadca5fe-87d3-4b96-b7fb-a231484277cc" Type="REG_DWORD"

This does not seem to be working that well. We saw the CU being downloaded and installs attempted on systems with anti-virus that did not have the registry key set. Shouldn’t happen, right? We killed the update service to prevent the install, but luckily it seems our anti-virus solution is compatible (see https://kc.mcafee.com/corporate/index?page=content&id=KB90167 – we have McAfee Enterprise 8.8 patch 10). Now the above article only mentions this registry setting for AV on Windows Server 2012 R2 / Windows 8.1 or lower, but the OS-specific patches also mention this for Windows Server 2016 / Windows 10. See https://support.microsoft.com/en-us/help/4056890 (Windows Server 2016) for an example. Right … One of my MVP colleagues also had to add this registry edit manually (or scripted) and needed to download the latest Defender definitions for the update to be offered to him. So it seems this is not just about 3rd party anti-virus. Your mileage may vary a bit here, it seems.
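If you end up having to check for or script that registry value yourself, as my colleague did, a small helper can save some typing. The following is only a sketch in Python that wraps the standard `reg` command line tool. The function names are made up for this example, and the data value of 0 is what I recall from Microsoft's article (the note quoted above does not show it), so verify it against the article before relying on it.

```python
import subprocess

# Key and value name as quoted in Microsoft's note above.
QUALITY_COMPAT_KEY = r"HKLM\SOFTWARE\Microsoft\Windows\CurrentVersion\QualityCompat"
QUALITY_COMPAT_VALUE = "cadca5fe-87d3-4b96-b7fb-a231484277cc"


def query_command():
    """Build the 'reg query' call that looks for the compatibility value."""
    return ["reg", "query", QUALITY_COMPAT_KEY, "/v", QUALITY_COMPAT_VALUE]


def add_command():
    """Build the 'reg add' call that sets the compatibility value.

    Assumption: the DWORD data is 0, as recalled from Microsoft's article.
    """
    return ["reg", "add", QUALITY_COMPAT_KEY, "/v", QUALITY_COMPAT_VALUE,
            "/t", "REG_DWORD", "/d", "0", "/f"]


def compat_flag_present():
    """Return True when the value exists (reg.exe exits with 0 on success)."""
    result = subprocess.run(query_command(), capture_output=True)
    return result.returncode == 0


if __name__ == "__main__":
    # Only meaningful on Windows, in an elevated prompt.
    if not compat_flag_present():
        print("Compatibility value missing, would run:")
        print(" ".join(add_command()))
```

Only set the value yourself once you have confirmed that your anti-virus version really is compatible. The whole point of the key is to stop the update from landing on machines where it is not.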

There is a great community effort on anti-virus compatibility info here: https://docs.google.com/spreadsheets/d/184wcDt9I9TUNFFbsAVLpzAtckQxYiuirADzf3cL42FQ/htmlview?sle=true#gid=0

Anyway after a reboot you’ll see that the mitigations are active via a PowerShell module you can download.

# Install the module from the PowerShell Gallery
Install-Module SpeculationControl

# Save the current execution policy so it can be reset afterwards
$SaveExecutionPolicy = Get-ExecutionPolicy

Set-ExecutionPolicy RemoteSigned -Scope CurrentUser

Import-Module SpeculationControl

Get-SpeculationControlSettings

# When you're done, reset the execution policy
Set-ExecutionPolicy $SaveExecutionPolicy -Scope CurrentUser
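If you have more than a handful of machines, reading that output by eye gets old quickly. Below is a rough Python sketch that turns the flags into a short verdict. It assumes you exported the result with something like `Get-SpeculationControlSettings | ConvertTo-Json`, and that the object carries the same property names the module prints on screen (BTIHardwarePresent, KVAShadowRequired and so on). The function names and the wording of the findings are my own, not part of the module.

```python
import json


def summarize(settings):
    """Translate the module's boolean flags into short findings."""
    findings = []

    # Spectre variant 2 (branch target injection): needs OS patch + microcode.
    if not settings.get("BTIWindowsSupportPresent"):
        findings.append("Spectre v2: OS patch missing")
    elif not settings.get("BTIHardwarePresent"):
        findings.append("Spectre v2: waiting for BIOS/firmware update")
    elif settings.get("BTIWindowsSupportEnabled"):
        findings.append("Spectre v2: mitigation enabled")
    else:
        findings.append("Spectre v2: mitigation disabled")

    # Meltdown (kernel VA shadowing): an OS-only mitigation.
    if not settings.get("KVAShadowRequired"):
        findings.append("Meltdown: mitigation not required on this CPU")
    elif not settings.get("KVAShadowWindowsSupportPresent"):
        findings.append("Meltdown: OS patch missing")
    elif settings.get("KVAShadowWindowsSupportEnabled"):
        findings.append("Meltdown: mitigation enabled")
    else:
        findings.append("Meltdown: mitigation disabled")

    return findings


def summarize_json(text):
    """Same as summarize(), but starting from the exported JSON text."""
    return summarize(json.loads(text))
```

A patched OS on a machine that is still waiting for its BIOS update would typically come back as "waiting for BIOS/firmware update" for Spectre and "mitigation enabled" for Meltdown.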

On Windows 10, installing the patches is enough. You only need the extra registry keys mentioned in the article “Windows Server guidance to protect against speculative execution side-channel vulnerabilities” if you want to toggle the mitigations on or off. You do need to set them for Windows Server! These registry changes require a reboot to become effective.

#To disable the mitigations

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride /t REG_DWORD /d 3 /f

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverrideMask /t REG_DWORD /d 3 /f

After a reboot you can see the mitigations are disabled. That allows you to work around any issues the mitigations might cause. It’s better than uninstalling the update.

[Screenshot: Get-SpeculationControlSettings output with the mitigations disabled]

If you want to enable them for the first time on Windows Server, or again after disabling them on Windows Server or Windows 10, do as follows:

#To enable the mitigations. Note that the FeatureSettingsOverrideMask value remains 3; it is not changed, just left in place.

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverride /t REG_DWORD /d 0 /f

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\Memory Management" /v FeatureSettingsOverrideMask /t REG_DWORD /d 3 /f

[Screenshot: Get-SpeculationControlSettings output with the mitigations enabled]
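The two value pairs used in the commands above are easy to mix up, so here is a small Python sketch that keeps them in one place. It only knows the two combinations from this post (override 0 with mask 3 to enable, override 3 with mask 3 to disable) and deliberately refuses to guess at anything else. The helper names are invented for the example.

```python
MEMORY_MANAGEMENT_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control"
    r"\Session Manager\Memory Management"
)

# (FeatureSettingsOverride, FeatureSettingsOverrideMask) pairs used in this post.
KNOWN_STATES = {
    (0, 3): "enabled",
    (3, 3): "disabled",
}


def describe(override, mask):
    """Name the state for a value pair, or admit we do not know it."""
    return KNOWN_STATES.get((override, mask), "not covered in this post")


def reg_commands(enable):
    """Build the two 'reg add' lines that enable or disable the mitigations."""
    override = 0 if enable else 3
    return [
        f'reg add "{MEMORY_MANAGEMENT_KEY}" /v FeatureSettingsOverride '
        f"/t REG_DWORD /d {override} /f",
        f'reg add "{MEMORY_MANAGEMENT_KEY}" /v FeatureSettingsOverrideMask '
        f"/t REG_DWORD /d 3 /f",
    ]


if __name__ == "__main__":
    for line in reg_commands(enable=True):
        print(line)
```

Remember that whichever pair you write, it only takes effect after a reboot.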

Do find the BIOS/firmware updates you need from your OEMs and install them as well. If you don’t, the output won’t be as nice and green as the above. I’m running this on a DELL XPS 13 and that BIOS was patched a few days ago. Dell has been prepping for this since before we ever knew about Spectre and Meltdown. Here’s an example: http://www.dell.com/support/home/be/nl/bedhs1/drivers/driversdetails?driverId=MXXTN

While waiting for the BIOS update, or while you are testing the impact, there is another way to help protect your Hyper-V hosts against VM-to-VM or VM-to-host attacks: “Alternative protections for Windows Server 2016 Hyper-V Hosts against the speculative execution side-channel vulnerabilities”.

Microsoft has also published more guidance for Hyper-V hosts, virtual machines and SQL Server installations.

This is not the end

This is not the end of the world, but it’s also not the end of the issue. CERT was very clear, probably due to Spectre, that the real fix was to replace the CPUs. So the heat is on and this might very well lead to an accelerated replacement of systems at the bigger cloud players and large hosters. They can’t afford to lose 20 to 30% of CPU performance. I have no insight into the impact they’re seeing (they might share that in the future), but it could ruin their economies of scale. There’s always money to be made from a crisis, right? In my own testing I have seen anything from zero or marginal impact up to a 5-20% loss, the latter with SQL workloads. It all depends on what the workload is asking the CPU to do, I guess, and whether that leverages speculative execution. It’s not going to be that bad for most people. But for the cloud players who really max out their systems this might have a bigger impact, and their risk profile (multi-tenant, shared resources) plus the potential rewards for hacking them make it a big concern and a top priority for them. But we have done our homework and can mitigate the issue.

Replacing consumer devices and all the IoT, sensors, domotics, cars, etc. out there is another matter. I’d be happy to see them all get patched in a reasonable time, but that is a pipe dream as well, judging from the current state of things. Security is going to get a whole lot worse before it gets any better.

And finally, hats off to the Google engineers who found the issues. Technically it’s quite fascinating even when explained at my level (I am not a CPU designer, so I need the Barney Bear version).

ReFS Is Going Places At Microsoft Ignite 2015

I just loved the strengths of ReFS when I started looking into it after it was first announced. However, it has been a bit quiet around ReFS due to some limitations and support issues.

But we’re seeing progress again and it seems that ReFS will be taking on a bigger role, even as the preferred file system for certain use cases. This is awesome. We’ll get the benefits ReFS brings for less expensive types of disks in combination with Storage Spaces, and those benefits are quite good already:

  • Integrity: ReFS stores data so that it is protected from many of the common errors that can cause data loss. File system metadata is always protected. Optionally, user data can be protected on a per-volume, per-directory, or per-file basis. If corruption occurs, ReFS can detect and, when configured with Storage Spaces, automatically correct the corruption. In the event of a system error, ReFS is designed to recover from that error rapidly, with no loss of user data.
  • Availability: ReFS is designed to prioritize the availability of data. With ReFS, if corruption occurs, and it cannot be repaired automatically, the online salvage process is localized to the area of corruption, requiring no volume down-time. In short, if corruption occurs, ReFS will stay online.
  • Scalability: ReFS is designed for the data set sizes of today and the data set sizes of tomorrow; it’s optimized for high scalability.
  • App Compatibility: To maximize AppCompat, ReFS supports a subset of NTFS features plus Win32 APIs that are widely adopted.
  • Proactive Error Identification: The integrity capabilities of ReFS are leveraged by a data integrity scanner (a “scrubber”) that periodically scans the volume, attempts to identify latent corruption, and then proactively triggers a repair of that corrupt data.

But there’s more, as ReFS has been improved and those improvements qualify it as the best default choice for a file system on Storage Spaces Direct (SSDi or S2D). What they have done to make it fast for certain data operations (VM creation, resizing, merges, snapshots) can only be described as “ODX”-like. We’re getting speed, scalability, auto repair, high availability and data protection in budget-friendly storage. What’s not to like? There are many use cases now where the need and benefits are clear but the economics worked against us. Well, that’s about to be solved with the new and improved storage offerings in Windows Server 2016. I’m looking forward to more information on the evolution of ReFS as it matures as a file system and takes its place on the front stage over the years. If you haven’t yet, I suggest you start looking at ReFS (again). I’ll be watching to see how far they take this.

SSL Certs And Achieving “A” Level Security With Older Windows Versions

So a mate of mine pings me. Says they have a problem with their web mail SSL security (Exchange 2010) running virtualized on Hyper-V. The security guy states they need to move to a more secure platform that supports “modern SSL standards” and proposes to migrate from Exchange 2010 to Exchange 2013 in an emergency upgrade. Preferably to VMware as “MickeySoft” is insecure. Oh boy! Another prophet of disaster who says the ship is lost unless …

You immediately know that the “security guy” is an incompetent fraud who only reads the IT press tabloids, runs some freely available vulnerability toys (some are quite good) to determine what to check off on his list and shouts out some “the sky is falling” rubbish to justify his daily rate and guarantee his paycheck. I’ve said it before: your mother told you not to trust strangers just like that, so why do so many companies do this with “consultants”? Choose your advisers wisely and remember Machiavelli’s notes on the use of mercenaries!

  • VMware is not more secure than Hyper-V. That’s so wrong and so loaded with prejudice it immediately invalidates the person’s credibility & reputation. If you need proof, do your research, but as a recent example the “HeartBleed” issue left VMware scrambling, not Hyper-V. And for what it’s worth, IT security is like crime: statistically we’ll all be victims a couple of times in our lifetime.
  • Exchange 2010 running on a fully patched Windows Server 2008 R2 is just fine. So what was all the drama about? The issue was that the Qualys SSL Labs tool gave their Outlook Web Access an F grade. Why? Well, they still allowed SSL 2.0, they didn’t run TLS 1.2 and they didn’t have Forward Secrecy support.

My advice to my buddy? First, he needs to get better security advice. Secondly, to get an “A” for secure SSL configuration all you need is some easy tweaking. You don’t want to support any clients that can’t handle the better SSL configurations anyway. No one should be allowed to use these. But what do I use? SSL 3.0? TLS 1.0/1.1/1.2? What to use & do? Here’s some documentation on how to enable/disable protocols: “How to restrict the use of certain cryptographic algorithms and protocols in Schannel.dll”. This will tell you how to do it. But which SSL versions can you dump today without suffering too many support calls? Server side, drop SSL 2.0 & SSL 3.0, keep TLS 1.0/1.1/1.2. On the client side you’ll need to do the same. That will keep most things working. Not ideal, but the trick is to allow / enable the better protocols server side so all clients that can use them do use them, while you block the really bad ones that just don’t have any use any more. We’ll play a bit with this.
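To make that server side advice concrete, here is a Python sketch that generates the `reg add` lines for the Schannel protocol keys. It assumes the usual layout under SCHANNEL\Protocols with one subkey per protocol and role, each holding an Enabled and a DisabledByDefault DWORD. Check that against the Microsoft article mentioned above for your Windows version. The policy table simply mirrors what I suggested (drop SSL 2.0 & 3.0, keep TLS 1.0/1.1/1.2) and the function names are made up.

```python
SCHANNEL_PROTOCOLS = (
    r"HKLM\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols"
)

# Server side policy from this post: True = keep, False = drop.
SERVER_POLICY = {
    "SSL 2.0": False,
    "SSL 3.0": False,
    "TLS 1.0": True,
    "TLS 1.1": True,
    "TLS 1.2": True,
}


def protocol_commands(protocol, enabled, role="Server"):
    """Build the two 'reg add' lines for one protocol and role."""
    key = "\\".join([SCHANNEL_PROTOCOLS, protocol, role])
    enabled_data = 1 if enabled else 0
    disabled_by_default = 0 if enabled else 1
    return [
        f'reg add "{key}" /v Enabled /t REG_DWORD /d {enabled_data} /f',
        f'reg add "{key}" /v DisabledByDefault /t REG_DWORD '
        f"/d {disabled_by_default} /f",
    ]


def all_commands(policy=SERVER_POLICY, role="Server"):
    """Build the commands for every protocol in the policy table."""
    commands = []
    for protocol, enabled in policy.items():
        commands.extend(protocol_commands(protocol, enabled, role))
    return commands


if __name__ == "__main__":
    for line in all_commands():
        print(line)
```

Passing role="Client" gives you the matching client side keys. Review the generated lines before you run them, test on a non-production box first, and plan for a reboot before the Schannel changes take effect.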

Test 1: Disable SSL 2.0 and Enable SSL 3.0

image

As you can see this gave them a B grade. We need to enable the current best protocol, TLS 1.2, to improve on that, and we might want to get rid of SSL 3.0 as XP & IE 6.0 have had their time and that’s over.

Test 2: Enable TLS 1.2

There you go. I hope this helps you out if you need to make sure your environment supports only more modern, stronger protocols.

image

There it is. A-. Compliance achieved! Now it would be best to disable SSL 2.0/3.0 and TLS 1.0/1.1 on the server and forget about any browsers, operating systems and software that can’t handle it. But that’s not that easily done: you’ll need Outlook 2013 for RPC over HTTP if you want to enforce TLS 1.2. But as far as the auditors go, they are all so happy now, and effectively you’re now supporting the more modern clients. My buddy can get to an A or A+ rating when they make sure to get Forward Secrecy support in the future. I really advise the latter, as HeartBleed made it obvious that the wide use of this is long overdue.

Some Testing Fun

Grab a laptop, Wireshark and a number of Twitter clients and cloud storage products, and take a peek at what version of SSL/TLS those apps use. Some tests you can do:

MetroTwit uses SSLv3, OneDrive uses TLSv1, Yammer seems to be at TLSv1 as well. Try disabling TLS 1.0 on a client and see how it breaks Outlook 2010 RPC over HTTPS and even OneDrive, by the way.

image

What you can get away with depends on the roles of the servers and the level of security the clients for that role can handle.
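You can also turn the test around and ask a server which protocol versions it will accept. Below is a minimal sketch using Python’s standard ssl module: it pins a client to exactly one TLS version and reports what gets negotiated. The host name in the example is a placeholder, and the function names are mine. Keep in mind that a modern Python/OpenSSL build may itself refuse SSL 3.0 or TLS 1.0, so a failure for those old versions can come from the client side as well as from the server.

```python
import socket
import ssl


def pinned_context(version):
    """Create a client context that offers exactly one TLS version."""
    ctx = ssl.create_default_context()
    # Set the minimum first, then the maximum, on a fresh context.
    ctx.minimum_version = version
    ctx.maximum_version = version
    return ctx


def negotiated_version(host, version, port=443, timeout=5.0):
    """Return the negotiated protocol name, or None if the handshake fails.

    Note: None also covers certificate errors and unreachable hosts,
    not only a server that refuses the protocol version.
    """
    ctx = pinned_context(version)
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                return tls.version()
    except (ssl.SSLError, OSError):
        return None


if __name__ == "__main__":
    host = "mail.example.com"  # placeholder, use your own server
    for version in (ssl.TLSVersion.TLSv1_2, ssl.TLSVersion.TLSv1_3):
        print(version.name, "->", negotiated_version(host, version))
```

It is no replacement for the SSL Labs test, but it is a quick sanity check after you have changed the protocol settings on a server.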

Won’t this break functionality?

As you’ve seen above it can, but for what matters on the e-mail server, probably not. If it does, you’re in need of some major work on your client infrastructure. But in most cases you’ll be fine, especially with web browsers. “But I have an underpaid employee who needs food stamp support so she cannot afford to upgrade her PC from Windows XP!” Dude, pay a decent living wage, please. That aside, yes, you can turn on better protocol support and block the oldest, most insecure ones on your servers. You call the shots on the use of your business’s infrastructure and you are under no obligation to allow your employees to access your services with obsolete clients. You want to be in the green zone, in the right column with TLS 1.2 if possible, but that’s going to be a challenge for a lot of services.

image

Do as I say, don’t do as I do

The funny thing is that I ran the same test against the web (mainly e-mail) servers of 4 government levels that are enforcing/promoting the (mandatory) use of security officers in an attempt to get to a more secure web for the benefit of all mankind. Not only does this fail because of such fine examples of security officers, but 2/3 don’t seem to take their own medicine. The intentions are good, I’m sure, but the road to hell is paved with those, and while compliance is not the same as being secure, even compliance is hard to get to, it seems.

Federal Government Department

image

Undisclosed State Government

image

Undisclosed Local Government

image

Medium Sized City (they did well compared to the above branches with more resources)

image

Don’t panic

That’s what it says on the cover of “The Official Hitchhiker’s Guide to the Galaxy Companion”. Get some good advice, and if you want to read more about how the rating is done (as of 2014), then please read “SSL Labs: Stricter Security Requirements for 2014”, which also provides a link to their SSL Server Rating Guide.