Microsoft Azure AD Sync service fails to start – event id 528

IMPORTANT UPDATE: Microsoft released Azure AD Connect 2.1.1.0 on March 24th 2022 which fixes the issue described in this blog post). You can read about it here Azure AD Connect: Version release history | Microsoft Docs The fun thing is the wrote a doc about how to fix it on March 25th 2022. The best option is top upgrade to AD Connect 2.1.1.0 or higher.
Introduction

On Windows Server 2019 and Windows Server 2022 running AD Connect v2, I have been seeing an issue since October/November 2021 where Microsoft Azure AD Sync service fails to start – event id 528. It does not happen in every environment, but it does not seem to go away when it does. It manifests clearly by the Microsoft Azure AD Sync service failing to start after a reboot. If you do application-consistent backups or snapshots, you will notice errors related to the SQL Server VSS writer even before the reboot leaves the Microsoft Azure AD Sync service in a bad state. All this made backups a candidate for the cause. But that does not seem to be the case.

Microsoft Azure AD Sync service fails to start - event id 528
Microsoft Azure AD Sync service fails to start – event id 528

In the application event log, you’ll find Event ID 528 from SQLLocalDB 15.0 with the below content.

Windows API call WaitForMultipleObjects returned error code: 575. Windows system error message is: {Application Error}
The application was unable to start correctly (0x%lx). Click OK to close the application.
Reported at line: 3714.

Getting the AD Connect Server operational again

So, what does one do? Well, a Veeam Vanguard turns to Veeam and restores the VM from a restore point that a recent known good AD Connect installation.

But then the issue comes back

But then it comes back. Even worse, the AD Connect staging server suffers the same fate. So, again, we restore from backups. And guess what, a couple of weeks later, it happens again. So, you rebuild clean AD Connect VMs, and it happens again. We upgraded to every new version of AD Connect but no joy. You could think it was caused by failed updates or such, but no.

The most dangerous time is when the AD Connect service restarts. Usually that is during a reboot, often after monthly patching.

Our backup reports a failure with the application consistent backup of the AD Connect Server, often before Azure does so. The backup notices the issues with LocalDB before the AD Sync Service fails to start due to the problems.

The failing backups indicate that there is an issue with the LoclaDB database …

However, if you reboot enough, you can sometimes trigger the error. No backups are involved, it seems. That means it is not related to Veeam or any other application consistent backup. The backup process just stumbles over the LocalDB issue. It does not cause it. The error returns if we turn off application-consistent backups in Veeam any way. We also have SAN snapshots running, but these do not seem to cause the issue.

We did try all the tricks from an issue a few years back with backing up AD Connect servers. See https://www.veeam.com/kb2911 but even with the trick to prevent the unloading of the user profile
COM+ application stops working when users logs off – Windows Server | Microsoft Docs we could not get rid of the issue.

So backups, VSS, it seems there is a correlation but not causation.

What goes wrong with LocalDB

After a while, and by digging through the event and error logs of a server with the issue, we find that somehow, the model.mdf and model.ldf are toast for some inexplicable reason on a pseudo regular basis. Below you see a screenshot from the C:\Windows\ServiceProfiles\ADSync\AppData\Local\Microsoft\Microsoft SQL Server Local DB\Instances\ADSync2019\Error.log. Remember your path might differ.

That’s it, the model db seems corrupt for some reason.

You’ll find entries like “The log scan number (37:218:29) passed to log scan in database ‘model’ is not valid. This error may indicate data corruption or that the log file (.ldf) does not match the data file (.mdf).”

Bar restoring from backup, the fastest way to recover is to replace the corrupt model DB files with good ones. I will explain the process here because I am sure some of you don’t have a recent, good know backup.

Sure, you can always deploy new AD Connect servers, but that is a bit more involved, and as things are going, they might get corrupted as well. Again, this is not due to cosmic radiation on a one-off server. Now we see it happen sometime three weeks to a month apart, sometimes only a few days apart.

Manual fix by replacing the corrupt model dd files

Once you see the  SQLLocalDB event ID 528 entries in the application logs when your Microsoft Azure AD Sync service fails to start, you can do the following. First, check the logs for corruption issues with model DB. You’ll find them. To fix the problem, do the following.

Disable the Microsoft Azure AD Sync service. To stop the service that will hang in “starting” you will need to reboot the host. You can also try and force kill ADSync.exe via its PID

Depending on what user account the AD Sync Service runs under, you need to navigate to a different path. If you run under NT SERVICE\ADSync you need to navigate to

Microsoft Azure AD Sync service fails to start - event id 528
The account the Microsoft Azure AD Sync service runs under

C:\Windows\ServiceProfiles\ADSync\AppData\Local\Microsoft\Microsoft SQL Server Local DB\Instances\ADSync2019

Welcome to the home of the AD Connect LocalDB model database

If you don’t use the default account but another one, you need to go to C:\Users\ YOURADSyncUSER\AppData\Local\Microsoft\Microsoft SQL Server Local DB\Instances\ADSync2019

Open a second explorer Windows and navigate to C:\Program Files\Microsoft SQL Server\150\LocalDB\Binn\Templates. From there, you copy the model.mdf and modellog.ldf files and paste those in the folder you opened above, overwriting the existing, corrupt model.mdf and model.ldf files.

You can now change the Microsoft Azure AD Sync service back to start automatically and start the service.

If all goes well, the Microsoft Azure AD Sync service is running, and you can synchronize to your heart’s content.

Conclusion

If this doesn’t get resolved soon, I will automate the process. Just shut down or kill the ADSync process and replace the model.mdf and model.ldf files from a known good copy.

Here is an example script, which needs more error handling but wich you can run manually or trigger by monitoring for event id 528 or levering Task Scheduler. As always run this script in the lab first. Test it, make sure you understand what it does. You are the only one responsible for what you run on your server! Once you are done testing replace Write-Host with write-output or turn it into a function and use cmdletbinding and param to gain write-verbose if you don’t want all the output/feedback. Bothe those options are more automation friendly.

cls
$SQLServerTemplates = "C:\Program Files\Microsoft SQL Server\150\LocalDB\Binn\Templates"
$ADConnectLocalDB = "C:\Windows\ServiceProfiles\ADSync\AppData\Local\Microsoft\Microsoft SQL Server Local DB\Instances\ADSync2019"

Write-Host -ForegroundColor Yellow "Setting ADSync startup type to disabled ..."
Set-Service ADSync -StartupType Disabled

Write-Host -ForegroundColor Yellow "Stopping ADSync service  ..."
Stop-Service ADSync -force

$ADSyncStatus = Get-Service ADSync

if ($ADSyncStatus.Status -eq 'Stopped'){
    Write-Host -ForegroundColor Cyan "The ADSync service has been stopped  ..."
}
else {
    if ($ADSyncStatus.Status -eq 'Stopping' -or $ADSyncStatus.Status -eq 'Starting'){
        
        Write-Host -ForegroundColor Yellow "Setting ADSync startup type to disabled ..."
        Set-Service ADSync -StartupType Disabled

        Write-Host -ForegroundColor Red "ADSync service was not stopped but stuck in stoping or starting ..."
        $ADSyncService = Get-CimInstance -class win32_service | Where-Object name -eq 'ADSync'
        $ADSyncProcess = Get-Process | Where-Object ID -eq $ADSyncService.processid

        #Kill the ADSync process if need be ...
        Write-Host -ForegroundColor red "Killing ADSync service processs forcfully ..."
        Stop-Process $ADSyncProcess -Force

        #Kill the sqlserver process if need be ... (in order to be able to overwrite the corrupt model db files)
        Write-Host -ForegroundColor red "Killing sqlserver process forcfully ..."
         $SqlServerProcess = Get-Process -name "sqlservr" -ErrorAction SilentlyContinue
         if($SqlServerProcess){
        Stop-Process $SqlServerProcess -Force}

        }
    }

$ADSyncStatus = Get-Service ADSync
if ($ADSyncStatus.Status -eq 'Stopped'){

    Write-Host -ForegroundColor magenta "Copy known good copies of model DB database to AD Connect LocaclDB path file ..."
    Copy-Item "$SQLServerTemplates\model.mdf" $ADConnectLocalDB

    Write-Host -ForegroundColor magenta "Copy known good copy of model DB log file to AD Connect LocaclDB path ..."
    Copy-Item "$SQLServerTemplates\modellog.ldf" $ADConnectLocalDB


    Write-Host -ForegroundColor magenta "Setting ADSync startup type to automatic ..."
    Set-Service ADSync -StartupType Automatic

    Write-Host -ForegroundColor magenta "Starting ADSync service ..."
    Start-Service ADSync
}

$ADSyncStatus = Get-Service ADSync
if ($ADSyncStatus.Status -eq 'Running' -and $ADSyncStatus.StartType -eq 'Automatic'){
    Write-Host -ForegroundColor green "The ADSync service is running ..."
}
else {
    Write-Host -ForegroundColor Red "ADSync service is not running, something went wrong! You must trouble shoot this"
}


That fixes this cause for when Microsoft Azure AD Sync service fails to start – event id 528. For now, we keep an eye on it and get alerts from the AD Connect health service in Azure when things break or when event id occurs on the AD Connect servers. Let’s see if Microsoft comes up with anything.

IMPORTANT UPDATE: Microsoft released Azure AD Connect 2.1.1.0 on March 24th 2022 which fixes the issue described in this blog post). You can read about it here Azure AD Connect: Version release history | Microsoft Docs The fun thing is the wrote a doc about how to fix it on March 25th 2022. The best option is top upgrade to AD Connect 2.1.1.0 or higher.

PS: I am not the only one seeing this issue Azure AD Sync Connect keeps getting corrupted – Spiceworks

IIS 6.0 SMTP Service in-place upgrade to Windows Server 2022

IIS 6.0 SMTP Service in-place upgrade to Windows Server 2022

This will be a “notes from the field” type of blog post where I will guide you to successfully execute an IIS 6.0 SMTP Service in-place upgrade to Windows Server 2022. In this case, the original operating system version is Windows Server 2019. However, these notes can be used for upgrades between other Windows Server versions as well.

Yes, there are still valid reasons to run an SMTP relay service today. I use SendGrid as a smart host with these and I actually have these setup behind a KEMP LoadMaster for High Availability.

What could go wrong?

What could go wrong? Well, nothing unless you didn’t plan certain things in advance. Below are the issues you will face. and need to prepare for and fix in order to perform an IIS 6.0 SMTP Service in-place upgrade to Windows Server 2022

  • The IIS 6 Management Console will be missing

For some reason that gets dropped during the in-place upgrade. The fix is to reinstall it. Easy enough.

  • Your SMTP Virtual services configuration will be wiped out during an in-place upgrade.

Yes, it will be a very empty console. Which is a scary experience if you did not prepare for it.

All your SMTP virtual servers will be gone

The trick is to create a backup and restore it. That way you get your configuration back. So, first of all, create a backup of your IIS configuration. We will go over this later. Secondly, before you can restore your backup you need to reinstall the IIS 6 Management Console as stated above. When you have restored the backup reboot the server, but before you do reconfigure the Simple Mail Transport Protocol service to start automatically.

  • Simple Mail Transport Protocol Service

The Simple Mail Transport Protocol Service will be set to reset to its default, which is to start manually start instead of automatically. This one is easily fixed but you need to remember to do so as your SMTP Virtual Servers will not be running after a restart. And as you keep your servers patched that will be at least once a month probably.

Step-by-step

  • Backup the current configuration

The easiest way to do this is via appcmd. Open an elevated command prompt and navigate to C:\Windows\System32\inetsrv. Run the following command.

appcmd add backup MYBACKUPNAME

The backup is stored under C:\Windows\System32\inetsrv\Backups\MYBACKUPNAME. Verify it is there, it should contain the following files:

  1. administration.config
  2. applicationHost.config
  3. MBSchema.xml
  4. MetaBase.xml
  5. redirection.config
IIS 6.0 SMTP Service in-place upgrade to Windows Server 2022
Verify your backup files are there

This folder is preserved during the upgrade but you can always grab a copy to be on the safe side.

  • Perform the in-place upgrade

This is the normal process, nothing special about it unless you run into trouble, which is not very likely in well-maintained environments.

  • Reinstall the IIS 6 Management console

This is easily done via the Add Roles and Features Wizard and does not require a reboot.

IIS 6.0 SMTP Service in-place upgrade to Windows Server 2022
Reinstall the IIS 6 Management Console
  • Set the Simple Mail Transport Protocol service to start automatically
IIS 6.0 SMTP Service in-place upgrade to Windows Server 2022
Set the Simple Mail Transport Protocol service startup type to Automatic
  • Restore your IIS backup

Open an elevated command prompt and navigate to C:\Windows\System32\inetsrv. Run the following command.

appcmd restore backup MYBACKUPNAME

  • Restart the server

When you have restarted the server open the IIS 6 Management console. Your SMTP virtual Services should be backup up and running.

IIS 6.0 SMTP Service in-place upgrade to Windows Server 2022
You have your SMTP Virtual Servers back!

Test your SMTP functionality via a PowerShell script for example to verify all is well.

Conclusion

In-place upgrades work quite well but certain roles and configurations have their quirks and issues to solve. Some lab work to test scenarios and their outcome is helpful when preparing an in-place upgrade.

This is the case for IIS 6.0 based SMTP Service role. We have shown you how to work around this and successfully perform an IIS 6.0 SMTP Service in-place upgrade to Windows Server 2022. The thing is, this is not related to Windows Server 2022, it is an IIS 6.0 issue.

With virtual machines leverage the luxury of checkpoints for fast and easy recovery before you begin. Also, make sure you have a tested backup to restore. Always have options and avoid painting yourself into a corner.

IIS and HTTP/3, QUIC, TLS 1.3 in Windows Server 2022

IIS and HTTP/3, QUIC, TLS 1.3 in Windows Server 2022

In this blog post, we will show you how to test IIS and HTTP/3, QUIC, TLS 1.3 in Windows Server 2022. As most of you know by now, Microsoft has released Windows Server 2022 on August 18th, 2021. There are a lot of new and interesting capabilities and features. Some of them are only available in Windows Server 2022 Azure edition. The good news is that in contrast to SMB over QUIC, QUIC for IIS is available in any version of Windows Server 2022.

This will not work out of the box, but I will demonstrate how I got it to work.

Getting TLS 1.3 to work

HTTP/3 uses QUIC for its transport, which is based on TLS 1.3 and Windows Server 2022 supports this. This is due to http.sys which leverages msquic. I have written about QUIC in SMB over QUIC Technology | StarWind Blog (starwindsoftware.com) and discussed SMB over QUIC in-depth in SMB over QUIC: How to use it – Part I | StarWind Blog (starwindsoftware.com) and SMB over QUIC: Testing Guide – Part II | StarWind Blog (starwindsoftware.com).

HTTP/3 avoids “head of line” (HOL) blocking at the transport layer even for multiple streams. This is an improvement over HTTP/2 that still suffered from HOL despite heaving multiple streams in a single connection versus multiple connections in HTTP/1.1. As HTTP/3 leverages TLS 1.3 it also benefits from the benefits it offers.

However, you need to opt-in for TLS 1.3 to work. We do that via a registry key.

reg add "HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\HTTP\Parameters" /v EnableHttp3 /t REG_DWORD /d 1 /f

Without TLS 1.3 you cannot have QUIC and HTTP/3 used QUIC for its transport. You will need to restart http.sys or restart the server.

Below you see HTTP/2 traffic and it is leveraging TLS 1.3

When you check the certificate in the browser you can see that TLS 1.3 is used.

You can also see TLS 1.3 and TCP in WireShark.

Getting QUIC to work

Now we are not done yet, your while you now will see HTTP/2 traffic use TLS 1.3 you won’t see QUIC yet. For that, we need to add another registry key.

The web service or site will need to advertise it is available over HTTP/3. For this, we can use “Alt-Svc” headers in HTTP/2 responses or via HTTP/2 ALTSVC frames. Clients who connect over HTTP/2 can now learn the service’s HTTP/3 endpoint and, if successful, use HTTP/3.

This process happens by sending an HTTP/3 ALPN (“Application-layer Protocol Negotiation”) identifier along with the HTTP/2 responses. the HTTP3/ALPN advertises a specific version of HTTP/3 that the client can use for future requests.

The HTTP/2 ALTSVC frames can be sent via http.sys. This needs to be enabled via a registry key “EnableAltSvc”.

"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\services\HTTP\Parameters" /v EnableAltSvc /t REG_DWORD /d 1 /f

Again, you will need to restart http.sys or restart the server.

Start testing HTTP/3

Your IIS server via the http.sys service is now capable of serving content over HTTP/3. To check whether it is working you can use WireShark on both the client and the server to verify the web traffic is using QUIC.

Below you can see QUIC traffic to my IIS server being captured.

IIS and HTTP/3, QUIC, TLS 1.3 in Windows Server 202

You can also check this via your browser’s dev tools. The way to do this differs a bit from browser to browser. Below you find a screenshot from Firefox, this has proven the most reliable browser when it comes to effectively negotiating QUIC. Hit F12, select “Network” and add the protocol column to the view. Watch out for HTTP/2 and HTTP/3.

IIS and HTTP/3, QUIC, TLS 1.3 in Windows Server 202

It will help to hit refresh to make sure HTTP/3 is advertised to the client, which can then leverage it. Sometimes hitting refresh too much seems to break QUIC and then you will fall back to HTTP/2, all be it with TLS 1.3.

Any way that’s it for IIS and HTTP/3, QUIC, TLS 1.3 in Windows Server 2022 for now. I hope to come back to this later.

SMB over QUIC POC

SMB over QUIC POC

I have had the distinct pleasure of being one of the first people to implement a SMB over QUIC POC. It was in a proof of concept I did with Windows Server 2022 Azure Edition in public preview.

That was a fun and educational excercise. As a result, I learned a lot. As a result, I decided to write a lab and test guide, primarily for my own reference. But also, to share my experience with others.

SMB over QUIC POC
So happy I did this POC and I am very happy with the results!

You can read the lab guide in a two part series of articles. SMB over QUIC: How to use it – Part I | StarWind Blog (starwindsoftware.com) and SMB over QUIC Testing Guide – Part II | StarWind Blog (starwindsoftware.com)

I am convinded it will fill a need for people that require remote access to SMB file shares without a VPN. Next to that, the integration with the KDC proxy service make it a Kerberos integrated solution. In addition, the KDC Prosy service has the added benefit of allowing for remote password changes.

If you need to get up to speed on what SMB over QUIC is all about I refer your to my article SMB over QUIC Technology | StarWind Blog (starwindsoftware.com). I’m sure that will bring you up to speed.

Finally, I hope you will find these articles useful. I’m pretty sure they will help you with your own SMB over QUIC POC and testing.

Thank your for reading!