MFA for a highly available RD Gateway

MFA for a highly available RD Gateway

Recently I decided to write up a couple of articles on how to set up MFA for a highly available RD Gateway. Why? Because so much information on the internet is fragmented and as such incomplete. So I wanted a reference document for myself. As I was making that document I realized I needed to explain the why and not just the how. The “why” is what helps people support and troubleshoot the solution during its life cycle.

The above, in combination with me being a verbose son of * led to 44 pages of information. So, I decided to publish it as a two-part article series.

MFA for a highly available RD Gateway
Figure 1: MFA for a highly available RD Gateway

You can find the articles here Transition a Highly Available RD Gateway to Use the NPS Extension for Azure MFA – Phase I and Transition a highly available RD Gateway to use the NPS Extension for Azure MFA – Phase II

Why and when should you read them?

If you have RD Gateway running and you have no MFA solution set up for it, I highly recommend you head over to read these two articles. That is especially true when your RD Gateways solution is a high availability (HA) deployment with an RD Gateway farm behind a load balancer. In that case, you want your MFA components to be HA as well! For some reason, so many guides on the internet ignore or brush over HA very cavalierly. That is one thing I hope these two articles remediate.

Next to that, it has many details on every aspect of the deployment to make sure you get it up and running successfully and correctly.

Finally, I present you with a collection of troubleshooting information and tools to help you figure out where the problem is so you can find a way to fix it.

That’s it. I really think it can help many of you out there. I hope it does.

Quick Assist: CredSSP encryption oracle remediation Error

In the past 12 hours I’ve seen the first mentions of people no longer being able to connect over RDP via a RD Gateway to their clients or servers. I also got a call to ask for help with such an issue. The moment I saw the error message it rang home that this was a known and documented issue with CredSSP encryption oracle remediation, which is both preventable and fixable.

The person trying to connect over RD Gateway get the following message:
[Window Title]
Remote Desktop Connection
[Content]
An authentication error has occurred.
The function requested is not supported
Remote computer: target.domain.com
This could be due to CredSSP encryption oracle remediation.
For more information, see
https://go.microsoft.com/fwlink/?linkid=866660
[OK]

image

Follow that link and it will tell you all you need to know to fix it and how to avoid it.
A remote code execution vulnerability (CVE-2018-0886) exists in unpatched versions of CredSSP. This issue was addressed by correcting how CredSSP validates requests during the authentication process.

The initial March 13, 2018, release updates the CredSSP authentication protocol and the Remote Desktop clients for all affected platforms.
Mitigation consists of installing the update on all eligible client and server operating systems and then using included Group Policy settings or registry-based equivalents to manage the setting options on the client and server computers. We recommend that administrators apply the policy and set it to  “Force updated clients” or “Mitigated” on client and server computers as soon as possible.  These changes will require a reboot of the affected systems. Pay close attention to Group Policy or registry settings pairs that result in “Blocked” interactions between clients and servers in the compatibility table later in this article.

April 17, 2018:
The Remote Desktop Client (RDP) update update in KB 4093120 will enhance the error message that is presented when an updated client fails to connect to a server that has not been updated.

May 8, 2018:
An update to change the default setting from Vulnerable to Mitigated (KB4103723 for W2K16 servers) and KB4103727 for Windows 10 clients. Don’t forget the vulnerability also exists for W2K12(R2) and lower as well as equivalent clients.

The key here is that with the May updates change the default for the new policy setting changes the default setting from to mitigated.

Microsoft is releasing new Windows security updates to address this CVE on May 8, 2018. The updates released in March did not enforce the new version of the Credential Security Support Provider protocol. These security updates do make the new version mandatory. For more information see “CredSSP updates for CVE-2018-0886” located at https://support.microsoft.com/en-us/help/4093492.

This can result in mismatches between systems at different patch levels. Which is why it’s now more of a wide spread issue. Looking at the table in the article and the documented errors it’s clear enough there was a mismatch. It was also clear how to fix it. Patch all systems and make sure the settings are consistent. Use GPO or edit the registry settings to do so. Automation is key here. Uninstalling the patch works but is not a good idea. This vulnerability is serious.

image

Now Microsoft did warn about this change. You can even read about it on the PFE blog https://blogs.technet.microsoft.com/askpfeplat/tag/encryption-oracle-remediation/. Nevertheless, many people seem to have been bitten by this one. I know it’s hard to keep up with everything that is moving at the speed of light in IT but this is one I was on top of. This is due to the fact that the fix is for a remote vulnerability in RDS. That’s a big deal and not one I was willing let slide. You need to roll out the updates and you need to configure your policy and make sure you’re secured. The alternative (rolling back the updates, allowing vulnerable connections) is not acceptable, be vulnerable to a known and fixable exploit. TAKE YOUR MEDICIN!  Read the links above for detailed guidance on how to do this. Set your policy on both sides to mitigated. You don’t need to force updated clients to fix the issue this way and you can patch your servers 1st followed by your clients. Do note the tips given on doing this in the PFE blog:

Note: Ensure that you update the Group Policy Central Store (Or if not using a Central Store, use a device with the patch applied when editing Group Policy) with the latest CredSSP.admx and CredSSP.adml. These files will contain the latest copy of the edit configuration settings for these settings, as seen below.

Registry
Path: HKLM\Software\Microsoft\Windows\CurrentVersion\Policies\System\CredSSP\Parameters
Value: AllowEncryptionOracle
Date type: DWORD
Reboot required: Yes

Here’s are the registry settings you need to make sure connectivity is restored

Everything patched: 0 => when all is patched including 3rd party CredSSP clients you can use “Force updated clients”
server patched but not all clients: 1 =>use “mitigated”, you’ll be as secure as possible without blocking people. Alternatively you can use 2 (“vulnerable”) but avoid that if possible  as it is more risky, so I would avoid that.

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System\CredSSP][HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System\CredSSP\Parameters]
“AllowEncryptionOracle”=dword:00000001

So, check your clients and servers, both on-premises and in the cloud to make sure you’re protected and have as little RDS connectivity issues as possible. Don’t forget about 3rd party clients that need updates to if you have those! Don’t panic and carry on.

Import of RD Gateway configuration file with policies referencing local resources wipes all policies clean!

Introduction

When you have Windows Server 2016 RD Gateway server and you expect to be able to import a configuration XML file you’ll might find yourself in a pickle when you are also using local resources. Because the import of RD Gateway configuration file with policies referencing local resources wipes all policies clean! With local resources I mean local user accounts and groups. These are leveraged more than I imagined at first.

When does it happen?

In the past I have blogged about migrating RD Gateway servers that contain policies referencing local resources here: Fixing Event ID 2002 “The policy and configuration settings could not be imported to the RD Gateway server “%1” because they are associated with local computer groups on another RD Gateway server”.

We used to be able to use the trick of making sure the local resources exist on the new server (either by recreating them there via the server migration wizard or manually) and changing the server name in the exported configuration XML file  to successfully import the configuration. That no longer works. You get an error.Import of RD Gateway configuration file with policies referencing local resources wipes all policies clean!

As far as migrations go from older versions, they work fins as long as you don’t have policies with local resources. Otherwise you’d better do an in place upgrade or recreate the resources & policies on the new servers. The method described in my blog is not working any more. That’s to bad. But it gets worse.

Import of RD Gateway configuration file with policies referencing local resources wipes all policies clean!

As said,it doesn’t end there. The issue is there even when you try to import the configuration on to the same server you exported it from.That’s really bad as it a quick way to protect against any mistakes you might make, and allows to get back to the original configuration.

What’s even worse, when the import fails it wipes ALL the policies in the RD Gateway Server => dangerous! So yes, the import of RD Gateway configuration file with policies referencing local resources wipes all policies clean!

Precautions

Only a backup or a checkpoint can save your then (or recreate the all manually)! Again this is only when the exported configuration file references local resources! The fasted way to clean out an RD Gateway configuration on Windows Server 2016 is actually importing a configuration export which contains a policy referring to local resource. Ouch! I’m not aware of a fix up to this date.

For now you only protection is a checkpoint or a backup. Depending on where and how you source your virtual machines you might not have access to a checkpoint.

You have been warned, be careful.

Changes in RDP over UDP behavior in Windows 10 and Windows 2016

Introduction

With Windows Server 2012 and Windows 8 (and Windows 7 RDP client 8.0) with some updates we got support for RDP to use UDP for data transport. This gave us a great experience over less reliable to even rather bad networks.

Anecdote: I was in an area of the world where there was no internet access available bar a very bad and lousy Wi-Fi connection at the shop/cafeteria. That was just fine, I wasn’t there for the great Wi-Fi access at all. But I needed to check e-mail and that wasn’t succeeding in any way, the network reliability was just too bad. I got the job done by using RDP to connect to a workstation back home (across the ocean on another continent) and check my e-mail there. Not a super great experience but UDP made it possible where nothing else worked. I was impressed.

Changes in RDP over UDP behavior in Windows 10 and Windows 2016

When connecting to Windows Server 2016 or a Windows 10 over a RD Gateway we see 1 HTTP and only one UDP connection being established for a session. We used to see 1 HTTP and 2 UDP connections per session with Windows 8/8.1 and Windows Server 2012(R2)

It doesn’t matter if your client is running RDP 8.0 or RDP 10.0 or whether the RD Gateway itself is running Windows Server 2012 R2 or Windows Server 2016. The only thing that does matter is the target that you are connecting to.

Also, this has nothing to do with a Firewall or so acting up, we’re testing with and without with the same IP etc. Let’s take a quick look at some examples and compare.

When connecting to Windows 10 or Windows Server 2016 we see that 1 UDP connection is established.

In total, there are 8 events logged for a successful connection over the RDG Gateway.

clip_image002

You’ll find 2 event ID 302 events (1 for a HTTP connection and 1 for a UDP connection) as well as 1 Event ID 205 events for the UDP proxy usage.

clip_image003

clip_image004

In the RD Gateway manager, monitoring we can see 1 HTTP and the 1 UDP connections for one RDP Session to a Windows 2016 Server.

clip_image006

When connecting to Windows 8/8.1 or Windows Server 2012 (R2) we see that 2 UDP connections are established.

In total, there are 10 events logged for a successful connection over the RDG Gateway:

clip_image008

You’ll find 3 event ID 302 events (1 for a HTTP connection and 2 for a UDP connection) as well as 2 Event ID 205 events for the UDP proxy usage.

In the RD Gateway manager, monitoring we can see 1 HTTP and the 2 UDP connections for one RDP Session to a Windows 2012 R2 Server.

clip_image010

So, RDP wise something seems to have changed. But I do not know the story and why.