Quick Assist: CredSSP encryption oracle remediation Error

In the past 12 hours I’ve seen the first mentions of people no longer being able to connect over RDP via a RD Gateway to their clients or servers. I also got a call to ask for help with such an issue. The moment I saw the error message it rang home that this was a known and documented issue with CredSSP encryption oracle remediation, which is both preventable and fixable.

The person trying to connect over RD Gateway get the following message:
[Window Title]
Remote Desktop Connection
[Content]
An authentication error has occurred.
The function requested is not supported
Remote computer: target.domain.com
This could be due to CredSSP encryption oracle remediation.
For more information, see
https://go.microsoft.com/fwlink/?linkid=866660
[OK]

image

Follow that link and it will tell you all you need to know to fix it and how to avoid it.
A remote code execution vulnerability (CVE-2018-0886) exists in unpatched versions of CredSSP. This issue was addressed by correcting how CredSSP validates requests during the authentication process.

The initial March 13, 2018, release updates the CredSSP authentication protocol and the Remote Desktop clients for all affected platforms.
Mitigation consists of installing the update on all eligible client and server operating systems and then using included Group Policy settings or registry-based equivalents to manage the setting options on the client and server computers. We recommend that administrators apply the policy and set it to  “Force updated clients” or “Mitigated” on client and server computers as soon as possible.  These changes will require a reboot of the affected systems. Pay close attention to Group Policy or registry settings pairs that result in “Blocked” interactions between clients and servers in the compatibility table later in this article.

April 17, 2018:
The Remote Desktop Client (RDP) update update in KB 4093120 will enhance the error message that is presented when an updated client fails to connect to a server that has not been updated.

May 8, 2018:
An update to change the default setting from Vulnerable to Mitigated (KB4103723 for W2K16 servers) and KB4103727 for Windows 10 clients. Don’t forget the vulnerability also exists for W2K12(R2) and lower as well as equivalent clients.

The key here is that with the May updates change the default for the new policy setting changes the default setting from to mitigated.

Microsoft is releasing new Windows security updates to address this CVE on May 8, 2018. The updates released in March did not enforce the new version of the Credential Security Support Provider protocol. These security updates do make the new version mandatory. For more information see “CredSSP updates for CVE-2018-0886” located at https://support.microsoft.com/en-us/help/4093492.

This can result in mismatches between systems at different patch levels. Which is why it’s now more of a wide spread issue. Looking at the table in the article and the documented errors it’s clear enough there was a mismatch. It was also clear how to fix it. Patch all systems and make sure the settings are consistent. Use GPO or edit the registry settings to do so. Automation is key here. Uninstalling the patch works but is not a good idea. This vulnerability is serious.

image

Now Microsoft did warn about this change. You can even read about it on the PFE blog https://blogs.technet.microsoft.com/askpfeplat/tag/encryption-oracle-remediation/. Nevertheless, many people seem to have been bitten by this one. I know it’s hard to keep up with everything that is moving at the speed of light in IT but this is one I was on top of. This is due to the fact that the fix is for a remote vulnerability in RDS. That’s a big deal and not one I was willing let slide. You need to roll out the updates and you need to configure your policy and make sure you’re secured. The alternative (rolling back the updates, allowing vulnerable connections) is not acceptable, be vulnerable to a known and fixable exploit. TAKE YOUR MEDICIN!  Read the links above for detailed guidance on how to do this. Set your policy on both sides to mitigated. You don’t need to force updated clients to fix the issue this way and you can patch your servers 1st followed by your clients. Do note the tips given on doing this in the PFE blog:

Note: Ensure that you update the Group Policy Central Store (Or if not using a Central Store, use a device with the patch applied when editing Group Policy) with the latest CredSSP.admx and CredSSP.adml. These files will contain the latest copy of the edit configuration settings for these settings, as seen below.

Registry
Path: HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System\CredSSP\Parameters
Value: AllowEncryptionOracle
Date type: DWORD
Reboot required: Yes

Here’s are the registry settings you need to make sure connectivity is restored

Everything patched: 0 => when all is patched including 3rd party CredSSP clients you can use “Force updated clients”
server patched but not all clients: 1 =>use “mitigated”, you’ll be as secure as possible without blocking people. Alternatively you can use 2 (“vulnerable”) but avoid that if possible  as it is more risky, so I would avoid that.

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System\CredSSP][HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System\CredSSP\Parameters]
“AllowEncryptionOracle”=dword:00000001

So, check your clients and servers, both on-premises and in the cloud to make sure you’re protected and have as little RDS connectivity issues as possible. Don’t forget about 3rd party clients that need updates to if you have those! Don’t panic and carry on.

PowerShell Script to Load Balance DNS Server Search Order

Load Balance DNS Server Search Order

DNS servers need to be configured correctly, operate perfectly and respond as fast as possible to their clients. For some applications this is critical, but many have a more relaxed attitude. Hence a DNS Server has a full second to respond to a query. That means that even when you have 2 DNS servers configured on the clients the second will only be used when the first is not available or doesn’t respond quickly enough. This has a side effect which is that moving traffic away from an overloaded DNS servers isn’t that easy or optimal. We’ll look at when to use a PowerShell script to Load balance DNS server search order.

DHCP now and then

The trick here is to balance the possible DNS servers search order amongst the clients. We used to do this via split scopes and use different DNS servers search orders in each scope. When we got Windows server 2012(R2) we not only gained policies to take care of this but also DHCP failover with replica. That’s awesome as it relieves us of much of the tedious work of keeping track of maintaining split scopes and different options on all DCHP servers involved. For more information in using the MAC addresses and DCHP policies to load balance the use of your DNS servers read this TechNet article Load balancing DNS servers using DHCP Server Policies.

Fixed IP configurations

But what about servers with fixed IP addresses? Indeed, the dream world where we’ll see dynamically assigned IP configuration everywhere is a good one but perfection is not of this world. Fixed IP configurations are still very common and often for good reasons. Some turn to DCHP reservations to achieve this but many go for static IP configuration on the servers.

image

When that’s the case, our sys admins are told the DNS servers to use. Most of the time they’ll enter those in the same order over and over again, whether they do this manual or automated. So that means that the first and second DNS server in the search order are the same everywhere. No load balancing to be found. So potentially one DNS server is doing all the work and getting slower at it while the second or third DNS servers in the search order only help out when the first one is down or doesn’t respond quickly enough anymore. Not good. When you consider many (most?) used AD integrated DNS for their MSFT environments that’s even less good.

PowerShell Script to Load Balance DNS Server Search Order

That’s why when replacing DNS Servers or seeing response time issues on AD/DNS servers I balance the DNS server search order list. I do this based on their IP address its last octet. If that’s even, DNS Server A is the first in the search order and if not it’s DNS Server B that goes in first. That mixes them up pseudo random enough.

I use a PowerShell script for that nowadays instead of my age-old VBScript one. But recently I wanted to update it to no longer use WMI calls to get the job done. That’s the script I’m sharing here, or at least the core cons pet part of it, you’ll need to turn it into a module and parameterize if further to suit your needs. The main idea is here offering an alternative to WMI calls. Do note you’ll need PowerShell remoting enabled and configured and have the more recent Windows OS versions (Windows Server 2012 and up).

 

Microsoft Security Bulletin MS16-045

Just a quick post to make sure you all know there’s an important security update for Hyper-V in the April 2016 batch of updates.

Please review Microsoft Knowledge Base Article 3143118  and Microsoft Security Bulletin MS16-045 – Important for details. Realize thatthis ios one you’d better test en deploy asap. In my deployments I have not seen or heard o any issues with the update so far.

Why this little shout out? Well it’s a remote code execution vulnerability that can leverage the guest to run code on the host.

This security update resolves vulnerabilities in Microsoft Windows. The most severe of the vulnerabilities could allow remote code execution if an authenticated attacker on a guest operating system runs a specially crafted application that causes the Hyper-V host operating system to execute arbitrary code. Customers who have not enabled the Hyper-V role are not affected.

It affect Windows 8.1 (x64), Windows Server 2012, Windows Server 2012 R2, and Windows 10 (x64). Test and patch a.s.a.p. When you’re a hosting provider, I hope you’re already on top of this one.

 

Recover From Expanding VHD or VDHX Files On VMs With Checkpoints

So you’ve expanded the virtual disk (VHD/VHDX) of a virtual machine that has checkpoints (or snapshots as they used to be called) on it. Did you forget about them?  Did you really leave them lingering around for that long?  Bad practice and not supported (we don’t have production snapshots yet, that’s for Windows Server 2016). Anyway your virtual machine won’t boot. Depending on the importance of that VM you might be chewed out big time or ridiculed. But what if you don’t have a restore that works? Suddenly it’s might have become a resume generating event.

All does not have to be lost. Their might be hope if you didn’t panic and made even more bad decisions. Please, if you’re unsure what to do, call an expert, a real one, or at least some one who knows real experts. It also helps if you have spare disk space, the fast sort if possible and a Hyper-V node where you can work without risk. We’ll walk you through the scenarios for both a VHDX and a VHD.

How did you get into this pickle?

If you go to the Edit Virtual Hard Disk Wizard via the VM settings it won’t allow for that if the VM has checkpoints, whether the VM is online or not.

image

VHDs cannot be expanded on line. If the VM had checkpoints it must have been shut down when you expanded the VHD. If you went to the Edit Disk tool in Hyper-V Manager directly to open up the disk you don’t get a warning. It’s treated as a virtual disk that’s not in use. Same deal if you do it in PowerShell

Resize-VHD -Path “C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks\RuinFixedVHD.vhd” -SizeBytes 15GB

That just works.

VHDXs can be expanded on online if they’re attached to a vSCSI controller. But if the VM has checkpoints it will not allow for expanding.

image

So yes, you deliberately shut it down to be able to do it with the the Edit Disk tool in Hyper-V Manager. I know, the warning message was not specific enough but consider this. The Edit disk tool when launched directly has no idea of what the disk you’re opening is used for, only if it’s online / locked.

Anyway the result is the same for the VM whether it was a VHD or a VHDX. An error when you start it up.

[Window Title]
Hyper-V Manager

[Main Instruction]
An error occurred while attempting to start the selected virtual machine(s).

[Content]
‘DidierTest06’ failed to start.

Synthetic SCSI Controller (Instance ID 92ABA591-75A7-47B3-A078-050E757B769A): Failed to Power on with Error ‘The chain of virtual hard disks is corrupted. There is a mismatch in the virtual sizes of the parent virtual hard disk and differencing disk.’.

Virtual disk ‘C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks\RuinFixedVHD_8DFF476F-7A41-4E4D-B41F-C639478E3537.avhd’ failed to open because a problem occurred when attempting to open a virtual disk in the differencing chain, ‘C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks\RuinFixedVHD.vhd’: ‘The size of the virtual hard disk is not valid.’.

image

You might want to delete the checkpoint but the merge will only succeed for the virtual disk that have not been expanded.  You actually don’t need to do this now, it’s better if you don’t, it saves you some stress and extra work. You could remove the expanded virtual disks from the VM. It will boot but in many cased the missing data on those disks are very bad news. But al least you’ve proven the root cause of your problems.

If you inspect the AVVHD/AVHDX file you’ll get an error that states

The differencing virtual disk chain is broken. Please reconnect the child to the correct parent virtual hard disk.

image

However attempting to do so will fail in this case.

Failed to set new parent for the virtual disk.

The Hyper-V Virtual Machine Management service encountered an unexpected error: The chain of virtual hard disks is corrupted. There is a mismatch in the virtual sizes of the parent virtual hard disk and differencing disk. (0xC03A0017).

image

Is there a fix?

Let’s say you don’t have a backup (shame on you). So now what? Make copies of the VHDX/AVHDX or VHD/AVHD and save guard those. You can also work on copies or on the original files.I’ll just the originals as this blog post is already way too long. If you. Note that some extra disk space and speed come in very handy now. You might even copy them of to a lab server. Takes more time but at least you’re not working on a production host than.

Working on the original virtual disk files (VHD/AVHD and / or VHDX/AVHDX)

If you know the original size of the VHDX before you expanded it you can shrink it to exactly that. If you don’t there’s PowerShell to the rescue if you want to find out the minimum size.

image

But even better you can shrink it to it’s minimum size, it’s a parameter!

Resize-VHD -Path “C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks\RuinFixedVHD.vhd” -ToMinimumSize

Now you not home yet. If you restart the VM right now it will fail … with the following error:

‘DidierTest06’ failed to start. (Virtual machine ID 7A54E4DB-7CCB-42A6-8917-50A05354634F)

‘DidierTest06’ Synthetic SCSI Controller (Instance ID 92ABA591-75A7-47B3-A078-050E757B769A): Failed to Power on with Error ‘The chain of virtual hard disks is corrupted. There is a mismatch in the identifiers of the parent virtual hard disk and differencing disk.’ (0xC03A000E). (Virtual machine ID 7A54E4DB-7CCB-42A6-8917-50A05354634F)

image

What you need to do is reconnect the AVHDX to it’s parent and choose to ignore the ID mismatch. You can do this via Edit Disk in Hyper-V Manager of in PowerShell. For more information on manually merging & repairing checkpoints see my blogs on this subject here. In this post I’ll just show the screenshots as walk through.

image

image

image

image

image

Once that’s done you’re VHDX is good to go.

For a VHD you can’t shrink that with the inbox tools. There is however a free command line tool that can do that names VHDTool.exe. The original is hard to find on the web so here is the installer if you need it. You only need the executable, which is portable actually, don’t install this on a production server. It has a repair switch to deal with just this occurrence!

Here’s an example of my lab …

D:\SysAdmin>VhdTool.exe /repair “C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks\RuinFixedVHD.vhd” “C:\ClusterStorage\Volume2\DidierTest06\Virtual Hard Disks\RuinFixedVHD_8DFF476F-7A41-4E4D-B41F-C639478E3537.avhd”

image

That’s it for the VHD …

You’re back in business!  All that’s left to do is get rid of the checkpoints. So you delete them. If you wanted to apply them an get rid of the delta, you could have just removed the disks, re-added the VHD/VHDX and be done with it actually. But in most of these scenarios you want to keep the delta as you most probably didn’t even realize you still had checkpoints around. Zero data loss Winking smile.

Conclusion

Save your self the stress, hassle and possibly expense of hiring an expert.  How? Please do not expand a VHD or VHDX of a virtual machine that has checkpoints. It will cause boot issues with the expanded virtual disk or disks! You will be in a stressful, painful pickle where you might not get out of if you make the wrong decisions and choices!

As a closing note, you must have have backups and restores that you have tested. Do not rely on your smarts and creativity or that others, let alone luck. Luck runs out. Otions run out. Even for the best and luckiest of us. VEEAM has save my proverbial behind a few times already.