Veeam File Share backups and knowledge worker data

Introduction

Today I focus on Veeam File Share backups and knowledge worker data testing. In Veeam NAS and File Share Backups, I did my first tests with the RTM bits of the Veeam Backup & Replication V10 file share backup options. Those tests were focused on a pain point I encounter often in environments with lots of large files: being able to back them up at all! Some examples are medical imaging, insurance, GIS, and remote imaging (satellite images, aerial photography, LIDAR, mobile mapping, …).

The amount of data created has skyrocketed, driven not only by need but also by advances in technology. These technologies deliver ever better image quality, are more and more affordable, and apply to an expanding variety of business cases. That makes such data an important use case, and for those use cases, things are looking good.

But what about Veeam File Share backups and knowledge worker data? Those millions of files in hundreds of thousands of folders. Well, in this blog post I share some results of testing with that data.

Veeam File Share backups and knowledge worker data

For this test we use a 2 TB volume with 1.87 TB of knowledge worker data. This resides on a 2 TB LUN, formatted with NTFS and an allocation unit size of 4K.

1.87 TB of knowledge worker data on NTFS

The data consists of 2,637,652 files in 196,420 folders. The content is real-life data accumulated over many years. It contains a wide variety of file types such as Office documents, text, zip, image, PDF, DBF, and movie files of various sizes. This data was not generated artificially. All servers are Windows Server 2019. The backup repository was formatted with ReFS (64K allocation unit size).
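For reference, below is a minimal PowerShell sketch of how such volumes could be formatted. The drive letters and labels are placeholders, not the lab's actual configuration.

# Source volume: NTFS with a 4K allocation unit size (drive letter/label are examples)
Format-Volume -DriveLetter S -FileSystem NTFS -AllocationUnitSize 4096 -NewFileSystemLabel "KWData"

# Backup repository volume: ReFS with a 64K allocation unit size
Format-Volume -DriveLetter R -FileSystem ReFS -AllocationUnitSize 65536 -NewFileSystemLabel "VeeamRepo"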

Backup test

We back it up with the file server object from an all-flash source to an all-flash target. There is a dedicated 10 Gbps backup network in the lab. As we did not have a separate spare lab node, we configured the cache on a local SSD on the repository server. I set the backup I/O control to faster backup. We wanted to see what we could get out of this setup.

Below are the results.

Veeam File Share backups and knowledge worker data
45:28 minutes to back up 1.87 TB of knowledge worker data. I like it.

If you look at the backup job screenshot above, you see that the source was the bottleneck. As we are going for maximum speed, we are hammering the CPU cores quite a bit. The screenshot below makes this crystal clear.

We have plenty of CPU cores in the lab on our backup source and we put them to work.
The CPU core load on the backup target is far less.

This raises the question whether using the file share option would not be a better choice. We can then leverage SMB Direct, which could help save CPU cycles. With SMB Multichannel we can leverage two 10 Gbps NICs. So we will repeat this test with the same data: once with a file share on a stand-alone file server and once with a highly available general purpose file share with continuous availability turned on. This will allow us to compare the file server versus the file share approach. Continuous availability has an impact on performance, and I would also like to see how profound that is with backups. But all that will be for a future blog post.
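As a quick sanity check before that follow-up test, standard SMB and networking cmdlets (nothing Veeam-specific) can show whether the NICs involved are actually RDMA and RSS capable.

# On the client side (backup proxy/repository): are the NICs SMB Direct (RDMA) and RSS capable?
Get-SmbClientNetworkInterface | Select-Object FriendlyName, RdmaCapable, RssCapable

# On the file server: is RDMA enabled on the relevant adapters?
Get-NetAdapterRdma | Select-Object Name, Enabled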

Restore test

The ability to restore data fast is paramount. It is even mission-critical in certain scenarios; think of medical images needed for consultations and (surgical) procedures, for example.

So we also put this to the test. Note that we chose to restore all data to a new LUN. This is to mimic the catastrophic loss of the original LUN and a recovery to a new one.

The restore takes longer than the backup for the same amount of data; restore speed is typically lower with large numbers of smaller files.

Below you will find a screenshot from Task Manager on both the repository and the file server during the restore.

The repository server from where we are restoring the data
The file server where the backup is being restored, entirely to a new LUN. Note the peak throughput of 6.4 Gbps.

Mind you, this varies a lot: when it hits small files, the throughput slows down while the core load rises.

During the restore, the file server has to work hardest when it is dealing with the least efficient (small) files.

Conclusion

For now, with variable data and lots of small files, it looks like restores take 2.5 to 3 times as long as backups with office worker data. We will do more testing with different data. With large image files, the difference is a lot smaller in our early testing. For now, this gives you a first look at our results with Veeam File Share backups and knowledge worker data. As always, test for yourself and test multiple scenarios. Your mileage will vary and you have to do your own due diligence. These lab tests are the beginning of mine, just to get a feel for what I can achieve. If you want to learn more about Veeam Backup & Replication, go here. Thank you for reading.

Veeam NAS and File Share Backups

Introduction

Veeam NAS and File Share Backups are a new capability in Veeam Backup & Replication V10. We can now back up SMB and NFS shares as well as file server sources. This means it covers Linux and Windows file servers and shares. It can also back up many NAS devices, of which there are plenty in both the SME and enterprise markets. I know it is fashionable to state that file servers are dead, but that is like saying e-mail is dead. Yes, right until the moment you kill someone's mailbox. At that moment it is mission-critical again.

My first test results with the RTM bits are so good that I am doing this quick publish to share them with you.

Early testing of Veeam NAS and File Share Backups

As a Veeam Vanguard I got access to the Veeam Backup & Replication V10 RTM bits so I decided to give it a go in some of our proving grounds.

I tested a Windows File Server, a Windows File Share, and a General Purpose File Share with continuous availability on a 2-node cluster. All operating systems run Windows Server 2019, fully patched at the time of writing.

Veeam NAS and File Share Backups
These are your 3 major options and they cover 99% of the file share use cases out there.

Windows File Server

This is the preferred method if you can use it, that is, if you have a Windows or Linux server as opposed to an appliance. The speeds are great and I am flirting with the 10 Gbps limit of the NIC. As this is pure TCP, it does not leverage SMB Multichannel or SMB Direct.

9.9 Gbps, 800-900 MB/s; what is not to like about such early test results?

Windows File Share

With a NAS you might not have the option to leverage the file server object. No worries, we then use the file share. If it is SMB 3, you can even leverage VSS if the NAS supports it. It might have the added benefit that you can add more file share proxies to do the initial full backup if so required. With the file server object you are limited to the file server itself. But it all depends on what the source can deliver and what the target can ingest.

Veeam NAS and File Share Backups
A file share on Windows Server 2019

Note that we use an SMB 3 file share here. With a properly configured network you can leverage SMB Direct and SMB Multichannel with this.

RDMA and Multichannel – gotta love it.
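If you want to verify that at run time, a quick generic check from the backup proxy or repository while the job is running could look like this (standard SMB cmdlets, nothing Veeam-specific):

# Active SMB connections to the file share and the dialect in use
Get-SmbConnection | Select-Object ServerName, ShareName, Dialect

# Multichannel paths, including whether RSS and RDMA (SMB Direct) are in play
Get-SmbMultichannelConnection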

General Purpose File Share with continuous availability on a 2-node cluster

This one is important to me. Nowadays I only deploy general purpose file shares with continuous availability where applicable and possible. SMB 3 has given us many gifts and I like to leverage them. The ease of maintenance it offers is too good not to use it when possible. Can you say office-hours patching of file servers?

So here is a screenshot of a backup of a General Purpose File Share with continuous availability where I initiate a failover of the file share. That explains the dip in throughput, but the backup keeps running. Awesome!

Veeam NAS and File Share Backups
Our backup source fails over but the backup keeps running. That is what we like to see. All this also leverages SMB Direct and SMB Multichannel, by the way.
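For those who want to reproduce this scenario, a rough sketch of the share and the failover trigger is below. The role, node, path, share, and account names are made up for illustration; adjust them to your own cluster.

# Continuously available share on the clustered general purpose file server role
New-SmbShare -Name "KWData" -Path "F:\Shares\KWData" -ScopeName "FS-GENERAL" -ContinuouslyAvailable $true -FullAccess "DOMAIN\Veeam-Backup"

# Fail the file server role over to the other node while the backup job is running
Move-ClusterGroup -Name "FS-GENERAL" -Node "NODE02"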

Restores

Backups are cool but restores rule. So to finish up this round of testing we share a restore. Not bad, not bad at all. 221.9 GB restored in 5.33 minutes.

Speedy restores are always welcome

More testing to follow

I will do more testing in the future. This will include small office files in large quantities. These early tests are more focused on large image data such as satellite, aerial photography and mobile mapping images. An important use case, hence our early testing focus.

For an overview of Veeam NAS and file share backups, as well as the details, take a look here for a presentation on the subject by the one and only Michael Cade at TFD20.

Conclusion

The Veeam NAS and file share backups in Backup & Replication V10 are delivering great results right from the start. I am happy to see this capability arrive. The only remark I have is that they should have delivered it sooner. But today it is here, and I am nothing but happy about that.

There are a lot of details to this, but those will be for later content.

The Airconsole 2.0 Standard

Introduction

Why the Airconsole 2.0 Standard? In IT, you need to configure or troubleshoot network devices every now and then: routers, switches, firewalls, … All of these have a serial connection for initial setup, troubleshooting, or upgrades.

Working at a network rack or in a network room is often tedious and bad for you: you very non-ergonomically balance a laptop while managing cables and typing. That is where the Airconsole comes in for me. There are other use cases, such as a permanent setup as a serial server. We also have those, but personally, the Airconsole 2.0 Standard is a handy piece of portable kit for me to have available.

The Airconsole 2.0 Standard

I am using the Airconsole 2.0 Standard. They have different versions and editions (bigger battery, Low Energy versions, etc). For my purposes the standard edition gets the job done.

It offers both Bluetooth and WiFi connectivity, but it also has an RJ45 uplink. This makes it very flexible, especially combined with the configuration options of the web interface, which allows for many network scenarios. I must say the lack of TLS on the web interface is my biggest criticism.

The Airconsole 2.0 Standard
The Airconsole on my lab WatchGuard Firebox M200. The USB port on the Firebox provides power. Ideal for those longer configuration jobs.

I got the standard edition as I have enough RJ45-to-serial adapters lying around. It is easy enough to set up for anyone with basic networking experience.

Using the Airconsole 2.0 Standard on Windows

Next to the apps for mobile devices (smartphones and tablets with iOS or Android), you can use it on your laptop, which will be the most used option for me. A smartphone is nice for a quick check of something. For real work, I prefer a laptop or desktop.

Bluetooth

Bluetooth does not require drivers to be installed. That just works out of the box. It adds the needed COM ports using the built-in Windows Serial Port Profile drivers. Just pair the AirConsole with your device and you should see the Bluetooth COM ports come up.

If you don’t succeed at first, try again. Bluetooth can be a bit finicky at times.
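A quick way to confirm the Bluetooth COM ports actually showed up is to list the serial ports from PowerShell:

# Serial ports Windows currently exposes (the Airconsole Bluetooth ports should appear here after pairing)
[System.IO.Ports.SerialPort]::GetPortNames()

# Friendly names of devices exposing a COM port
Get-CimInstance Win32_PnPEntity | Where-Object { $_.Name -match '\(COM\d+\)' } | Select-Object Name, Status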

WiFi

The WiFi option requires a driver to be installed on your Windows OS. While the drivers are a bit older, they work well with recent editions such as Windows Server 2019 and Windows 10, as well as with older operating systems.

There are a couple of types of this product, so make sure you download the correct drivers and software. For the Airconsole 2.0 Standard, you can find them here. You need to install Apple Bonjour for Windows. First of all, unpack the zip file.

Now, install Bonjour64.exe (I am hoping no one is still stuck in the 32-bit world). It is very weird for me to install this on Windows, which I normally keep free of Apple bloatware, but this is for a good cause. After this, you install com0com-2.2.2.0-setup (64 bit).exe. This creates the first COM port pair (these pairs make the link between your device and the actual serial port). You can add more pairs with different settings to have them ready for your most-used devices.

Connecting to your console via WiFi

Before you go any further, make sure you have connected your WiFi to the Airconsole-2E WAP.

Without connectivity, nothing much will happen. Once that is done, launch AirConsoleConnect.exe with admin privileges. You can now select the COM port you want to map the Airconsole to from the COM port combo box. All you need to do is click connect.

This is me connecting to the Firebox M200 CLI over WiFi. I have chosen to show the debug dialog.

The Airconsole 2.0 Standard
Connected to the Firebox M200 CLI over WiFi

Varia

  • If both WiFi and Bluetooth are available, WiFi is preferred and used as it gives better performance.
  • Plugging the USB charging cable into a USB port on the network device is possible.
  • You can add virtual COM port pairs for WiFi and customize those to your heart's content.
  • Via http://192.168.10.1 you can manage your AirConsole. You can configure the network settings (subnet, DHCP, DNS), configure your WiFi (SSID, network mode, channel, etc.), change the password, upgrade the AirConsole, and so on. Pretty nice. As said above, in this day and age we would hope for TLS 1.2, but there are no HTTPS capabilities at all, which is a pity. But then, this is not a permanent setup.
The Airconsole 2.0 Standard

The AirConsole is a handy piece of kit to have around. It is more versatile than I figured when I first got it, and with the options available you can turn it into a nice serial server.

Intermittent TLS issues with Windows Server 2012 R2 connecting to SQL Server 2016 running on Windows Server 2016 or 2019

Introduction

Someone asked me to help investigate an issue that was hindering client/server applications. They suffered from intermittent TLS issues with Windows Server 2012 R2 connecting to SQL Server 2016 running on Windows Server 2016 or 2019. Normally everything went fine, but for roughly one in every 250 to 500 connections from the client to a database on SQL Server 2016, they got the error below.

System.ServiceModel.CommunicationException: The InstanceStore could not be initialized. —> System.Runtime.DurableInstancing.InstancePersistenceCommandException: The execution of the InstancePersistenceCommand named {urn:schemas-microsoft-com:System.Activities.Persistence/command}CreateWorkflowOwner was interrupted by an error. —> System.Data.SqlClient.SqlException: A connection was successfully established with the server, but then an error occurred during the login process. (provider: SSL Provider, error: 0 – An existing connection was forcibly closed by the remote host.) —> System.ComponentModel.Win32Exception: An existing connection was forcibly closed by the remote host

Now, retrying the connection in the code could work around this; good code should have such mechanisms implemented. But in the end, there was indeed an underlying issue.
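As an illustration only, such a retry could look something like the PowerShell sketch below. The connection string is a placeholder, and real code would limit retries to transient errors.

# Placeholder connection string; the real applications used their own data access code
$connectionString = "Server=SQL2016;Database=AppDb;Integrated Security=True;Encrypt=True"
$maxAttempts = 3
for ($attempt = 1; $attempt -le $maxAttempts; $attempt++) {
    try {
        $conn = New-Object System.Data.SqlClient.SqlConnection($connectionString)
        $conn.Open()
        $conn.Close()
        break   # success, stop retrying
    }
    catch [System.Data.SqlClient.SqlException] {
        if ($attempt -eq $maxAttempts) { throw }
        Start-Sleep -Seconds (2 * $attempt)   # crude backoff before the next attempt
    }
}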

Intermittent TLS issues with Windows Server 2012 R2 connecting to SQL Server 2016 running on Windows Server 2016 or 2019

I did a quick verification for any network issues. The network was fine, so I could take that cause off the table. The error itself did not happen constantly, but rather infrequently. All indications pointed to a (TLS) configuration issue. “An existing connection was forcibly closed by the remote host” was the clearest hint for this. But then one would expect this to happen every time.

We also checked that the Windows Server 2012 R2 hosts were fully up to date and had either .NET 4.7 or .NET 4.8 installed. These versions normally support TLS 1.2 without issues.

Also, on Windows Server 2012 R2, TLS 1.2 is enabled by default and does not require editing the registry to enable it. You only have to do this if you want to disable it and later re-enable it.

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.2\Client]
"DisabledByDefault"=dword:00000000

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.2\Server]
"DisabledByDefault"=dword:00000000

These keys were not present on the Windows Server 2012 R2 hosts, nor on the Windows Server 2016 or 2019 hosts we used for comparison.
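If you want to verify what a given host is actually configured for, a quick PowerShell check of those keys looks like this; absent keys simply mean the operating system default applies.

'Client', 'Server' | ForEach-Object {
    $path = "HKLM:\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols\TLS 1.2\$_"
    if (Test-Path $path) { Get-ItemProperty -Path $path | Select-Object Enabled, DisabledByDefault }
    else { "$path not present (OS default applies)" }
}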

We also checked whether KB3154520 – Support for TLS System Default Versions included in the .NET Framework 3.5 on Windows 8.1 and Windows Server 2012 R2 was installed. This update allows using the operating system defaults for SSL and TLS instead of the hardcoded .NET Framework defaults.

For 64-bit operating systems:

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework\v2.0.50727]
"SystemDefaultTlsVersions"=dword:00000001

[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\.NETFramework\v2.0.50727]
"SystemDefaultTlsVersions"=dword:00000001

For 32-bit operating systems:

[HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\.NETFramework\v2.0.50727]
"SystemDefaultTlsVersions"=dword:00000001
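If you need to add these values, a small PowerShell sketch like the one below does the same thing as the .reg snippets above (64-bit OS shown; run elevated).

$keys = 'HKLM:\SOFTWARE\Microsoft\.NETFramework\v2.0.50727',
        'HKLM:\SOFTWARE\Wow6432Node\Microsoft\.NETFramework\v2.0.50727'
foreach ($key in $keys) {
    if (-not (Test-Path $key)) { New-Item -Path $key -Force | Out-Null }
    New-ItemProperty -Path $key -Name 'SystemDefaultTlsVersions' -Value 1 -PropertyType DWord -Force | Out-Null
}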


Note: if the application has set ServicePointManager.SecurityProtocol in code or through config files to a specific value, or uses the SslStream.AuthenticateAs* APIs to specify a specific SslProtocols enum, the registry setting behavior does not apply. But in our test code we also have control over this, so we can test a lot of permutations.

Test code

To properly dive into the issue we needed to reproduce the error at will, or at least very fast. So we contacted a dev and asked him to share the code paths that actually made the connections to the databases. This was to verify whether there were multiple services connecting and maybe only one had issues. It turned out they all connected in the same way and failed in the same fashion.

So we came up with a test program to try to reproduce the error as fast as possible, even if it occurred infrequently. That ability to test configuration changes fast was key in finding a solution. With this test program, we did not see the issue with clients running on Windows Server 2016 or Windows Server 2019; not with the actual services in the test or production environments, nor with our automated test tool. Based on the information and documentation on .NET and Windows Server 2012 R2, this should not have been an issue there either. But still, here we are.
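As a rough idea of the approach, a loop like the PowerShell sketch below (placeholder connection string) opens and closes many short-lived connections and counts the failures. That is enough to compare configuration changes quickly.

# Placeholder connection string; point it at the SQL Server under test
$connectionString = "Server=SQL2016;Database=AppDb;Integrated Security=True;Encrypt=True"
$iterations = 2000
$failures = 0
for ($i = 1; $i -le $iterations; $i++) {
    $conn = New-Object System.Data.SqlClient.SqlConnection($connectionString)
    try { $conn.Open() }
    catch { $failures++; Write-Warning "Attempt ${i}: $($_.Exception.Message)" }
    finally { $conn.Dispose() }
}
"$failures failures out of $iterations connection attempts"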

The good thing about the test code is that we can easily play with different settings with regard to the TLS version specified in the code. We noted that using TLS 1.1 or 1.0 showed a drop in connection errors versus TLS 1.2, but did not eliminate them. No matter what permutation we tried, we only got a difference in the frequency of the issue; we were not able to get rid of the error. Now that we had tried to deal with the issue at the application level, we decided to focus on the host.

The host

Even with no TLS 1.2 enforced, the Windows Server 2012 R2 client hello in its connection to the SQL Server host uses TLS 1.2. The cipher they select (TLS_DHE_RSA_WITH_AES_256_GCM_SHA384) is actually considered weak. This is evident when we look at the client and server hello to port 1433.

TLS issues with Windows Server 2012 R2 connecting to SQL Server 2016 running on Windows Server 2016 or 2019
Client hello
TLS issues with Windows Server 2012 R2 connecting to SQL Server 2016 running on Windows Server 2016 or 2019
Server Hello

Trying to specify the TLS version in the test code did not resolve the issue. So we then tried solving it on the host, going for best practices on the Windows Server 2012 R2 client side. In the end, I did the following:

  • Allowed only TLS 1.2
  • Allowed only secure ciphers and ordered those for Perfect Forward Secrecy
  • Enforced that the OS always selects the TLS version
  • Enforced the use of the most secure version (TLS 1.2, as that is the only one we allow)

Note that there was no support for TLS 1.3 at the time of writing.
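By way of illustration, the protocol part of that hardening can be scripted as below (run elevated; reboot afterwards for SCHANNEL to pick up the changes). The cipher suite ordering for Perfect Forward Secrecy is easiest done with the community script or IIS Crypto mentioned further down, so it is left out here.

$protoRoot = 'HKLM:\SYSTEM\CurrentControlSet\Control\SecurityProviders\SCHANNEL\Protocols'

# Disable the older protocols for both client and server roles
foreach ($proto in 'SSL 3.0', 'TLS 1.0', 'TLS 1.1') {
    foreach ($role in 'Client', 'Server') {
        $key = Join-Path $protoRoot "$proto\$role"
        New-Item -Path $key -Force | Out-Null
        New-ItemProperty -Path $key -Name Enabled -Value 0 -PropertyType DWord -Force | Out-Null
        New-ItemProperty -Path $key -Name DisabledByDefault -Value 1 -PropertyType DWord -Force | Out-Null
    }
}

# Explicitly enable TLS 1.2 for both client and server roles
foreach ($role in 'Client', 'Server') {
    $key = Join-Path $protoRoot "TLS 1.2\$role"
    New-Item -Path $key -Force | Out-Null
    New-ItemProperty -Path $key -Name Enabled -Value 1 -PropertyType DWord -Force | Out-Null
    New-ItemProperty -Path $key -Name DisabledByDefault -Value 0 -PropertyType DWord -Force | Out-Null
}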

After this, we ran our automated tests again and did not see a single occurrence of the issue anymore. This fixed it, and it worked without any other changes except for one: there was one application that had not been upgraded to .NET 4.6 or higher and enforced TLS 1.1. So we leaned on the app owners a bit to have them recompile their code. In the end, they went with .NET 4.8.

In the network capture of the connections, we see they now select a secure cipher (TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384). With this setup, we no longer experienced the following infrequent error: “System.Data.SqlClient.SqlException: A connection was successfully established with the server, but then an error occurred during the login process. (provider: SSL Provider, error: 0 – An existing connection was forcibly closed by the remote host.) —> System.ComponentModel.Win32Exception: An existing connection was forcibly closed by the remote host”.

Client Hello
Server Hello

The cause

Our assumption is that once in a while older/weaker ciphers were used and that this caused the connection errors. By implementing best practices, we disabled those, which prevents this from happening. As we did not change the source code on most machines, it either used the .NET Framework defaults or the specified settings.

You can easily implement best practices by using a community script like Setup Microsoft Windows or IIS for SSL Perfect Forward Secrecy and TLS 1.2. There are also tools like IIS Crypto and IIS Crypto CLI. You can always automate this and devise a way to configure it wholesale with the tools of your choice if needed.

The solution

As you can read above, what finally did the trick was implementing a TLS 1.2 only best practices configuration on the Windows Server 2012 R2 hosts.

If you run into a similar issue, please test any solution, including mine here, before implementing it. The TLS versions that software and operating systems require or are able to use differ. So any solution needs to be tested end to end for a particular use case. Even more so if ServicePointManager.SecurityProtocol (in code or through config files) or the SslStream.AuthenticateAs* APIs are in play.

In this particular environment, forcing TLS 1.2 and having the OS control which TLS version is used by the .NET applications worked. Your mileage may differ and you might need a different approach to fix the issue in your environment. Big boy rules apply here!

Conclusion

Tech debt will sooner or later rear its ugly head. There is a reason I upgrade and update regularly and well ahead of deadlines. I have always been doing that as much as possible (SSL Certs And Achieving “A” Level Security With Older Windows Version). But in this case, it seemed more like a bug than a configuration issue. It only happened every now and then, which made troubleshooting a bit more difficult. But that was addressed with a little test program. This helped us test configuration changes quickly and easily to fix these intermittent TLS issues with Windows Server 2012 R2 connecting to SQL Server 2016 running on Windows Server 2016 or 2019. I am sharing this case as it might help other people out there struggling with the same issue.