Configuring an interface bond in a Ubuntu Hyper-V guest

Introduction

In this post, we take a look at configuring an interface bond in a Ubuntu Hyper-V guest. But first a quick word about NIC teaming and Hyper-V. In real life, teaming is most often done on physical hardware. But in the lab, or for some edge production cases, you might want to use it in virtual machines. The use case here is virtual machines used for testing and knowledge transfer. We are teaching about creating Veeam Backup & Configuration hardened repositories with XFS and immutability. In that lab, we are emulating a NIC team on hardware servers.

When you need redundant, high available networking for your Hyper-V guests, you normally create a NIC team on the host. You then use that NIC team to make your vSwitch. You can use a traditional LBFO team (depreciated) or a SET switch. The latter is the current technology and the way forward. But in this lab scenario, I am using LBFO, native Windows native NIC teaming.

Configuring an interface bond in a Ubuntu Hyper-V guest
99.9% of all use cases will use teaming on the Hyper-V host

Host teaming provides both bandwidth aggregation, redundancy, and failover. Typically, you do not mess around with NIC teaming in the guest in 99.99% of cases. Below we see a figure showing guest teaming. You need to use two physical NICs for genuine redundancy. Each with its separate virtual switch and uplinked to separate physical switches. Beware that only switch independent teaming is supported in the guest OS, so configure the switches and switch ports accordingly.

Configuring an interface bond in a Ubuntu Hyper-V guest
Hyper-V in guest NIC teaming

In-guest teaming is rarely used for production workloads, that is, bar some exceptions with SR-IOV, but that is another discussion. However, you might have a valid reason to use NIC teaming for lab work, testing, documenting configurations, teaching, etc. Luckily, that is easy to do. Hyper-V has a setting for your vNICs, enabling them to be functional members of a NIC team in a Windows guest OS. Als long as that OS supports native teaming. That is the case for Windows Server 2012 and later.

NIC teaming inside a Hyper-V Guest

For each vNIC member of the NIC team in the guest, you must put a checkmark to “enable this network adapter to be part of a team in the guest operating system” there is nothing more to it. The big caveat here is that each member must reside on a different external vSwitch for failover to work correctly. Otherwise, you will see a “The virtual switch lacks external connectivity” error on the remaining when failing over and packet loss.

Enable NIC teaming o the vNIC that are going to be team membersthe Hyper-V settings

There is nothing more you need to make it work perfectly in a Windows guest VM. As you can see in the image below, both my LAN NIC and the NIC get an address from the DHCP server.

Functional team in the virtual machine. Do test failover to make sure you got it right?

That’s great. But sometimes, I need to have a NIC team inside a Linux guest virtual machine. For example, recently, on Ubuntu 20.04, I went through my typical motions to get in guest NIC teaming or bonding in Linux speak. But, much to my surprise, I did not get an IP address from my DHCP server on my Ubuntu 20.04 guest bond. So, what could be the cause?

Configuring an interface bond in a Ubuntu Hyper-V guest

In Ubuntu, we use netplan to configure our networking and in the image below you can see a sample configuration.

A minimal bond configuration in Ubuntu

I have created a bond using eth0 and eth1, and we should get an IP address from DHCP. The bonding mode is balance-rr. But why I am not getting an IP address. I did check the option “Enable this network adapter to be part of a team in the guest operating system” on both member vNICs.

Well, let’s look at the nic interfaces and the bond. There we see something exciting.

Configuring an interface bond in a Ubuntu Hyper-V guest
Note that the bond and it’s member interfaces have the same MAC address that does not come from the Hyper-V host pool

Note that the bond has a MAC address that is the same as both member interfaces. Also, note that this MAC address does not come from the Hyper-V host MAC address pool and is not what is assigned to the vNIC by Hyper-V as you can see in the image below! That is the big secret.

With MAC addressed unknown to the hypervisor, this smells of something that requires MAC spoofing, doesn’t it? So, I enabled it, and guess what? Bingo!

So what is the difference with Windows when configuring an interface bond in a Ubuntu Hyper-V guest?

The difference with Windows is that an interface bond in an Ubuntu Hyper-V guest requires MAC address spoofing. You have to enable MAC Spoofing on both vNICs members of the Ubuntu virtual machine bond. The moment you do that, you will see you get a DHCP address on the bond and get network connectivity. But why is this needed? In Ubuntu (or Linux in general), the bond interface and its members have a generated MAC address assigned. It does not take one of the MAC addresses of the member vNICs. So, we need MAC spoofing enabled on both member vNIC in the Hyper-V settings for this to work! In a Windows guest, the LBFO team gets one of the MAC addresses of its member vNICs assigned. As such, this does not require NIC spoofing.

With Ubuntu (Linux) you don’t even have to check “enable this network adapter to be part of a team in the guest operating system” on the member vNICs. Note that a guest Linux bond does not need every member interface on a separate vSwitch for failover to work. Not even if you enable “enable this network adapter to be part of a team in the guest operating system.” However, the latter is still ill-advised when you want real redundancy and failover.

Failing compilation with Azure Automation State Configuration: Cannot connect to CIM server. The specified service does not exist as an installed service

Introduction

You can compile Desired State Configuration (DSC) configurations in Azure Automation State Configuration, which functions as a pull server. Next to doing this via the Azure portal, you can also use PowerShell. The latter allows for easy integration in DevOps pipelines and provides the flexibility to deal with complex parameter constructs. So, this is my preferred option. Of course, you can also push DSC configurations to Azure virtual machines via ARM templates. But I like the pull mechanisms for life cycle management just a bit more as we can update the DSC config and push it out when needed. So, that’s all good, but under certain conditions, you can get the following error: Cannot connect to CIM server. The specified service does not exist as an installed service.

When can you get into this pickle?

DSC itself is PowerShell, and that comes in quite handy. Sometimes, the logic you use inside DSC blocks is insufficient to get the job done as needed. With PowerShell, we can leverage the power of scripting to get the information and build the logic we need. One such example is formatting data disks. Configuring network interfaces would be another. A disk number is not always reliable and consistent, leading to failed DSC configurations.
For example, the block below is a classic way to wait for a disk, and when it shows up, initialize, format, and assign a drive letter to it.

xWaitforDisk NTDSDisk {
    DiskNumber = 2

    RetryIntervalSec = 20
    RetryCount       = 30
}
xDisk ADDataDisk {
    DiskNumber = 2
    DriveLetter = "N"
    DependsOn   = "[xWaitForDisk]NTDSDisk"
} 

The disk number may vary depending on whether your Azure virtual machine has a temp disk or not, or if you use disk encryption or not can trip up disk numbering. No worries, DSC has more up its sleeve and allows to use the disk id instead of the disk number. That is truly unique and consistent. You can quickly grab a disk’s unique id with PowerShell like below.

xWaitforDisk NTDSDisk {
    DiskIdType       = 'UniqueID'
    DiskId           = $NTDSDiskUniqueId #'1223' #GetScript #$NTDSDisk.UniqueID
    RetryIntervalSec = 20
    RetryCount       = 30
}

xDisk ADDataDisk {
    DiskIdType  = 'UniqueID'
    DiskId      = $NTDSDiskUniqueId #GetScript #$NTDSDisk.UniqueID
    DriveLetter = "N"
    DependsOn   = "[xWaitForDisk]NTDSDisk"
}

Powershell in compilation error

So we upload and compile this DSC configuration with the below script.

$params = @{
    AutomationAccountName = 'MyScriptLibrary'
    ResourceGroupName     = 'WorkingHardInIT-RG'
    SourcePath            = 'C:\Users\WorkingHardInIT\OneDrive\AzureAutomation\AD-extension-To-Azure\InfAsCode\Up\App\PowerShell\ADDSServer.ps1'
    Published             = $true
    Force                 = $true
}

$UploadDscConfiguration = Import-AzAutomationDscConfiguration @params

while ($null -eq $UploadDscConfiguration.EndTime -and $null -eq $UploadDscConfiguration.Exception) {
    $UploadDscConfiguration = $UploadDscConfiguration | Get-AzAutomationDscCompilationJob
    write-Host -foregroundcolor Yellow "Uploading DSC configuration"
    Start-Sleep -Seconds 2
}
$UploadDscConfiguration | Get-AzAutomationDscCompilationJobOutput –Stream Any
Write-Host -ForegroundColor Green "Uploading done:"
$UploadDscConfiguration


$params = @{
    AutomationAccountName = 'MyScriptLibrary'
    ResourceGroupName     = 'WorkingHardInIT-RG'
    ConfigurationName     = 'ADDSServer'
}

$CompilationJob = Start-AzAutomationDscCompilationJob @params 
while ($null -eq $CompilationJob.EndTime -and $null -eq $CompilationJob.Exception) {
    $CompilationJob = $CompilationJob | Get-AzAutomationDscCompilationJob
    Start-Sleep -Seconds 2
    Write-Host -ForegroundColor cyan "Compiling"
}
$CompilationJob | Get-AzAutomationDscCompilationJobOutput –Stream Any
Write-Host -ForegroundColor green "Compiling done:"
$CompilationJob

So, life is good, right? Yes, until you try and compile that (DSC) configuration in Azure Automation State Configuration. Then, you will get a nasty compile error.

Cannot connect to CIM server. The specified service does not exist as an installed service

“Exception: The running command stopped because the preference variable “ErrorActionPreference” or common parameter is set to Stop: Cannot connect to CIM server. The specified service does not exist as an installed service.”

Or in the Azure Portal:

Cannot connect to CIM server. The specified service does not exist as an installed service

The Azure compiler wants to validate the code, and as you cannot get access to the host, compilation fails. So the configs compile on the Azure Automation server, not the target node (that does not even exist yet) or the localhost. I find this odd. When I compile code in C# or C++ or VB.NET, it will not fail because it cannot connect to a server and validate my code by crabbing disk or interface information at compile time. The DSC code only needs to be correct and valid. I wish Microsoft would fix this behavior.

Workarounds

Compile DSC locally and upload

Yes, I know you can pre-compile the DSC locally and upload it to the automation account. However, the beauty of using the automation account is that you don’t have to bother with all that. I like to keep the flow as easy-going and straightforward as possible for automation. Unfortunately, compiling locally and uploading doesn’t fit into that concept nicely.

Upload a PowerShell script to a storage container in a storage account

We can store a PowerShell script in an Azure storage account. In our example, that script can do what we want, find, initialize, and format a disk.

Get-Disk | Where-Object { $_.NumberOfPartitions -lt 1 -and $_.PartitionStyle -eq "RAW" -and $_.Location -match "LUN 0" } |
Initialize-Disk -PartitionStyle GPT -PassThru | New-Partition -DriveLetter "N" -UseMaximumSize |
Format-Volume -FileSystem NTFS -NewFileSystemLabel "NTDS-DISK" -Confirm:$false

From that storage account, we download it to the Azure VM when DSC is running. This can be achieved in a script block.

$BlobUri = 'https://scriptlibrary.blob.core.windows.net/scripts/DSC/InitialiseNTDSDisk.ps1' #Get-AutomationVariable -Name 'addcInitialiseNTDSDiskScritpBlobUri'
$SasToken = '?sv=2021-10-04&se=2022-05-22T14%3A04%8S67QZ&cd=c&lk=r&sig=TaeIfYI63NTgoftSeVaj%2FRPfeU5gXdEn%2Few%2F24F6sA%3D'
$CompleteUri = "$BlobUri$SasToken"
$OutputPath = 'C:\Temp\InitialiseNTDSDisk.ps1'

Script FormatAzureDataDisks {
    SetScript  = {

        Invoke-WebRequest -Method Get -uri $using:CompleteUri -OutFile $using:OutputPath
        . $using:OutputPath
    }

    TestScript = {
        Test-Path $using:OutputPath
    }

    GetScript  = {
        @{Result = (Get-Content $using:OutputPath) }
    }
} 

But we need to set up a storage account and upload a PowerShell script to a blob. We also need a SAS token to download that script or allow public access to it. Instead of hardcoding this information in the DSC script, we can also store it in automation variables. We could even abuse Automation credentials to store the SAS token securely. All that is possible, but it requires more infrastructure, maintenance, security while integrating this into the DevOps flow.

PowerShell to generate a PowerShell script

The least convoluted workaround that I found is to generate a PowerShell script in the Script block of the DSC configuration and save that to the Azure VM when DSC is running. In our example, this becomes the below script block in DSC.

Script FormatAzureDataDisks {
    SetScript  = {
        $PoshToExecute = 'Get-Disk | Where-Object { $_.NumberOfPartitions -lt 1 -and $_.PartitionStyle -eq "RAW" -and $_.Location -match "LUN 0" } | Initialize-Disk -PartitionStyle GPT -PassThru | New-Partition -DriveLetter "N" -UseMaximumSize | Format-Volume -FileSystem NTFS -NewFileSystemLabel "NTDS-DISK" -Confirm:$false'
        $ PoshToExecute | out-file $using:OutputPath
        . $using:OutputPath
    }
    TestScript = {
        Test-Path $using:OutputPath 
    }
    GetScript  = {
        @{Result = (Get-Content $using:OutputPath) }
    }
}

So, in SetScript, we build our actual PowerShell command we want to execute on the host as a string. Then, we persist to file using our $OutputPath variable we can access inside the Script block via the $using: OutputPath. Finally, we execute our persisted script by dot sourcing it with “. “$using:OutputPath” In TestScript, we test for the existence of the file and ignore the output of GetScript, but it needs to be there. The maintenance is easy. You edit the string variable where we create the PowerShell to save in the DSC configuration file, which we upload and compile. That’s it.

To be fair, this will not work in all situations and you might need to download protected files. In that case, the above will solutions will help out.

Conclusion

Creating a Powershell script in the DSC configuration file requires less effort and infrastructure maintenance than uploading such a script to a storage account. So that’s the pragmatic trick I use. I’d wish the compilation to an automation account would succeed, but it doesn’t. So, this is the next best thing. I hope this helps someone out there facing the same issue to work around the error: Cannot connect to CIM server. The specified service does not exist as an installed service.

Attending VeeamON 2022

I am attending VeeamON 2022

Yes, I am attending VeeamON 2022. So should you! I mean it. Data protection is becoming ever more diverse. That means you need to keep up and invest in yourself. That is what I do, nearly every day. I want to, I need to and I like to do so.

Attending VeeamON 2022
https://www.veeam.com/veeamon

The landscape has fragmented due to locations such as on-prem, hybrid, cloud, service models like IaaS, PaaS, SaaS, and technologies such as virtual machines and containers. And that is only scratching the surface of the challenges we face while protecting our data. We often have a wide mix of the above and technology trends evolve fast.

Keep learning

The key is to keep learning. That takes a never-ending commitment and effort. We learn in many different ways and VeeamON caters to all of them. Theory, practical guidance, hands-on labs, exams, interaction with peers and industry experts. You name it, VeeamON has it!

Education

Education leads towards a better understanding so you can analyze challenges, design solutions, see relations, and understand dependencies. It is acquiring knowledge that is used to become better at your job by using the technologies optionally in the ecosystem where they need to deliver their value.

Training

Training is getting ready to deploy and operate solutions. It is very focused on specific jobs at hand. That doesn’t necessarily make it easy. On the contrary, it can be difficult but well-trained people can make the hard look easy and look smart because the knowledge and skill have been drilled into them. It is that simple, but again, that doesn’t make it easy.

Networking

Exchanging ideas, experiences, solutions, techniques with others helps us all learn and grow. It builds professional relations that source the common brain of the community so everyone gains. It helps your clients, your employers, your colleagues, and yourself grow and learn. That is good for your job, your career, or your business, whichever it happens to be in your case.

Join me in attending VeeamOn 2022

Join me and the excellent crew Veeam is bringing to bear at VeeamON 2022. You can join online or in-person in Las Vegas. Register here! Online is free, bar the investment of your time. But trust me, you are not a second-class citizen, it is a real and valuable conference. If you are attending in person, be ready for an immersive experience!

Las Vegas – oh my, I want to attend VeeamON 2022 and go on a long road trip after.

I would love to go in person and enjoy the immersion in a world of expertise and learning at the conference, but alas, it will not be this year. If you can attend, do whatever it takes to convince your boss or yourself, it is a rich and rewarding experience, that pays itself back in no time. If you can’t make it, don’t despair, join online like I will. Know that there will be other chances to attend and if the boss is the biggest issue, there are better bosses out there ;-).

Microsoft Azure AD Sync service fails to start – event id 528

IMPORTANT UPDATE: Microsoft released Azure AD Connect 2.1.1.0 on March 24th, 2022 which fixes the issue described in this blog post). You can read about it here Azure AD Connect: Version release history | Microsoft Docs The fun thing is they wrote a doc about how to fix it on March 25th, 2022. The best option is to upgrade to AD Connect 2.1.1.0 or higher.

IMPORTANT UPDATE 2: Upgrade to version 2.1.15.0 (or higher) as that version also addresses LocalDB corruption issues!
Introduction

On Windows Server 2019 and Windows Server 2022 running AD Connect v2, I have been seeing an issue since October/November 2021 where Microsoft Azure AD Sync service fails to start – event id 528. It does not happen in every environment, but it does not seem to go away when it does. It manifests clearly by the Microsoft Azure AD Sync service failing to start after a reboot. If you do application-consistent backups or snapshots, you will notice errors related to the SQL Server VSS writer even before the reboot leaves the Microsoft Azure AD Sync service in a bad state. All this made backups a candidate for the cause. But that does not seem to be the case.

Microsoft Azure AD Sync service fails to start - event id 528
Microsoft Azure AD Sync service fails to start – event id 528

In the application event log, you’ll find Event ID 528 from SQLLocalDB 15.0 with the below content.

Windows API call WaitForMultipleObjects returned error code: 575. Windows system error message is: {Application Error}
The application was unable to start correctly (0x%lx). Click OK to close the application.
Reported at line: 3714.

Getting the AD Connect Server operational again

So, what does one do? Well, a Veeam Vanguard turns to Veeam and restores the VM from a restore point that a recent known good AD Connect installation.

But then the issue comes back

But then it comes back. Even worse, the AD Connect staging server suffers the same fate. So, again, we restore from backups. And guess what, a couple of weeks later, it happens again. So, you rebuild clean AD Connect VMs, and it happens again. We upgraded to every new version of AD Connect but no joy. You could think it was caused by failed updates or such, but no.

The most dangerous time is when the AD Connect service restarts. Usually that is during a reboot, often after monthly patching.

Our backup reports a failure with the application consistent backup of the AD Connect Server, often before Azure does so. The backup notices the issues with LocalDB before the AD Sync Service fails to start due to the problems.

The failing backups indicate that there is an issue with the LoclaDB database …

However, if you reboot enough, you can sometimes trigger the error. No backups are involved, it seems. That means it is not related to Veeam or any other application consistent backup. The backup process just stumbles over the LocalDB issue. It does not cause it. The error returns if we turn off application-consistent backups in Veeam any way. We also have SAN snapshots running, but these do not seem to cause the issue.

We did try all the tricks from an issue a few years back with backing up AD Connect servers. See https://www.veeam.com/kb2911 but even with the trick to prevent the unloading of the user profile
COM+ application stops working when users logs off – Windows Server | Microsoft Docs we could not get rid of the issue.

So backups, VSS, it seems there is a correlation but not causation.

What goes wrong with LocalDB

After a while, and by digging through the event and error logs of a server with the issue, we find that somehow, the model.mdf and model.ldf are toast for some inexplicable reason on a pseudo regular basis. Below you see a screenshot from the C:\Windows\ServiceProfiles\ADSync\AppData\Local\Microsoft\Microsoft SQL Server Local DB\Instances\ADSync2019\Error.log. Remember your path might differ.

That’s it, the model db seems corrupt for some reason.

You’ll find entries like “The log scan number (37:218:29) passed to log scan in database ‘model’ is not valid. This error may indicate data corruption or that the log file (.ldf) does not match the data file (.mdf).”

Bar restoring from backup, the fastest way to recover is to replace the corrupt model DB files with good ones. I will explain the process here because I am sure some of you don’t have a recent, good know backup.

Sure, you can always deploy new AD Connect servers, but that is a bit more involved, and as things are going, they might get corrupted as well. Again, this is not due to cosmic radiation on a one-off server. Now we see it happen sometime three weeks to a month apart, sometimes only a few days apart.

Manual fix by replacing the corrupt model dd files

Once you see the  SQLLocalDB event ID 528 entries in the application logs when your Microsoft Azure AD Sync service fails to start, you can do the following. First, check the logs for corruption issues with model DB. You’ll find them. To fix the problem, do the following.

Disable the Microsoft Azure AD Sync service. To stop the service that will hang in “starting” you will need to reboot the host. You can also try and force kill ADSync.exe via its PID

Depending on what user account the AD Sync Service runs under, you need to navigate to a different path. If you run under NT SERVICE\ADSync you need to navigate to

Microsoft Azure AD Sync service fails to start - event id 528
The account the Microsoft Azure AD Sync service runs under

C:\Windows\ServiceProfiles\ADSync\AppData\Local\Microsoft\Microsoft SQL Server Local DB\Instances\ADSync2019

Welcome to the home of the AD Connect LocalDB model database

If you don’t use the default account but another one, you need to go to C:\Users\ YOURADSyncUSER\AppData\Local\Microsoft\Microsoft SQL Server Local DB\Instances\ADSync2019

Open a second explorer Windows and navigate to C:\Program Files\Microsoft SQL Server\150\LocalDB\Binn\Templates. From there, you copy the model.mdf and modellog.ldf files and paste those in the folder you opened above, overwriting the existing, corrupt model.mdf and model.ldf files.

You can now change the Microsoft Azure AD Sync service back to start automatically and start the service.

If all goes well, the Microsoft Azure AD Sync service is running, and you can synchronize to your heart’s content.

Conclusion

If this doesn’t get resolved soon, I will automate the process. Just shut down or kill the ADSync process and replace the model.mdf and model.ldf files from a known good copy.

Here is an example script, which needs more error handling but wich you can run manually or trigger by monitoring for event id 528 or levering Task Scheduler. As always run this script in the lab first. Test it, make sure you understand what it does. You are the only one responsible for what you run on your server! Once you are done testing replace Write-Host with write-output or turn it into a function and use cmdletbinding and param to gain write-verbose if you don’t want all the output/feedback. Bothe those options are more automation friendly.

cls
$SQLServerTemplates = "C:\Program Files\Microsoft SQL Server\150\LocalDB\Binn\Templates"
$ADConnectLocalDB = "C:\Windows\ServiceProfiles\ADSync\AppData\Local\Microsoft\Microsoft SQL Server Local DB\Instances\ADSync2019"

Write-Host -ForegroundColor Yellow "Setting ADSync startup type to disabled ..."
Set-Service ADSync -StartupType Disabled

Write-Host -ForegroundColor Yellow "Stopping ADSync service  ..."
Stop-Service ADSync -force

$ADSyncStatus = Get-Service ADSync

if ($ADSyncStatus.Status -eq 'Stopped'){
    Write-Host -ForegroundColor Cyan "The ADSync service has been stopped  ..."
}
else {
    if ($ADSyncStatus.Status -eq 'Stopping' -or $ADSyncStatus.Status -eq 'Starting'){
        
        Write-Host -ForegroundColor Yellow "Setting ADSync startup type to disabled ..."
        Set-Service ADSync -StartupType Disabled

        Write-Host -ForegroundColor Red "ADSync service was not stopped but stuck in stoping or starting ..."
        $ADSyncService = Get-CimInstance -class win32_service | Where-Object name -eq 'ADSync'
        $ADSyncProcess = Get-Process | Where-Object ID -eq $ADSyncService.processid

        #Kill the ADSync process if need be ...
        Write-Host -ForegroundColor red "Killing ADSync service processs forcfully ..."
        Stop-Process $ADSyncProcess -Force

        #Kill the sqlserver process if need be ... (in order to be able to overwrite the corrupt model db files)
        Write-Host -ForegroundColor red "Killing sqlserver process forcfully ..."
         $SqlServerProcess = Get-Process -name "sqlservr" -ErrorAction SilentlyContinue
         if($SqlServerProcess){
        Stop-Process $SqlServerProcess -Force}

        }
    }

$ADSyncStatus = Get-Service ADSync
if ($ADSyncStatus.Status -eq 'Stopped'){

    Write-Host -ForegroundColor magenta "Copy known good copies of model DB database to AD Connect LocaclDB path file ..."
    Copy-Item "$SQLServerTemplates\model.mdf" $ADConnectLocalDB

    Write-Host -ForegroundColor magenta "Copy known good copy of model DB log file to AD Connect LocaclDB path ..."
    Copy-Item "$SQLServerTemplates\modellog.ldf" $ADConnectLocalDB


    Write-Host -ForegroundColor magenta "Setting ADSync startup type to automatic ..."
    Set-Service ADSync -StartupType Automatic

    Write-Host -ForegroundColor magenta "Starting ADSync service ..."
    Start-Service ADSync
}

$ADSyncStatus = Get-Service ADSync
if ($ADSyncStatus.Status -eq 'Running' -and $ADSyncStatus.StartType -eq 'Automatic'){
    Write-Host -ForegroundColor green "The ADSync service is running ..."
}
else {
    Write-Host -ForegroundColor Red "ADSync service is not running, something went wrong! You must trouble shoot this"
}


That fixes this cause for when Microsoft Azure AD Sync service fails to start – event id 528. For now, we keep an eye on it and get alerts from the AD Connect health service in Azure when things break or when event id occurs on the AD Connect servers. Let’s see if Microsoft comes up with anything.

IMPORTANT UPDATE: Microsoft released Azure AD Connect 2.1.1.0 on March 24th 2022 which fixes the issue described in this blog post). You can read about it here Azure AD Connect: Version release history | Microsoft Docs The fun thing is the wrote a doc about how to fix it on March 25th 2022. The best option is top upgrade to AD Connect 2.1.1.0 or higher.

PS: I am not the only one seeing this issue Azure AD Sync Connect keeps getting corrupted – Spiceworks