I recently had to move a Windows Server 2016 VM over to another cluster (2012R2 to 2016 cluster) and to do so I used shared nothing live migration. After the VM was happily running on the new cluster I kicked off a Veeam backup job to get a first restore point for that VM. Better safe than sorry, right?
But the job and the retries failed for that VM. The error details are:
Failed to create snapshot Compellent Replay Manager VSS Provider on repository01.domain.com (mode: Veeam application-aware processing) Details: Job failed (‘Checkpoint operation for ‘FailedVM’ failed. (Virtual machine ID 459C3068-9ED4-427B-AAEF-32A329B953AD). ‘FailedVM’ could not initiate a checkpoint operation: %%2147754996 (0x800423F4). (Virtual machine ID 459C3068-9ED4-427B-AAEF-32A329B953AD)’). Error code: ‘32768’.
Failed to create VM recovery snapshot, VM ID ‘3459c3068-9ed4-427b-aaef-32a329b953ad’.
Also, when the job fails over to the native Windows VSS approach after the HW VSS provider fails, it still does not work. At first that made me think of a bug that used to exist in Windows Server 2016 Hyper-V where a storage live migration of any kind would break RCT and a new full backup was needed to fix it. That bug has long since been fixed and no, a new full backup did not solve anything here. Now there are various reasons why creating a checkpoint will not succeed, so we need to dive in deeper. As always the event viewer is your friend. What do we see? 3 events during a backup and they are SQL Server related.
On top of that the SQLServerWriter is in a non-retryable error state when checking with vssadmin list writers.
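As an aside, the decimal number after the %% in the checkpoint error is the same value as the hex code next to it, and that HRESULT is the VSS non-retryable writer error, which lines up with what vssadmin reports. A minimal Python sketch of that conversion follows. The lookup table is a small hand-picked subset of VSS writer error codes and the function name is made up for illustration.

```python
# Small, hand-picked subset of VSS writer HRESULTs (not an exhaustive list).
VSS_WRITER_ERRORS = {
    0x800423F2: "VSS_E_WRITERERROR_TIMEOUT",
    0x800423F3: "VSS_E_WRITERERROR_RETRYABLE",
    0x800423F4: "VSS_E_WRITERERROR_NONRETRYABLE",
}


def describe_hyperv_code(raw: str) -> str:
    """Turn a '%%<decimal>' code from a Hyper-V error message into hex plus a name."""
    value = int(raw.lstrip("%"))
    name = VSS_WRITER_ERRORS.get(value, "unknown (not in this small table)")
    return f"0x{value:08X} {name}"


# The code taken from the error details above.
print(describe_hyperv_code("%%2147754996"))
```

For the code in the error above this prints 0x800423F4 together with the non-retryable writer error name, so the Hyper-V message and the writer state are telling the same story.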
It’s very clear there is an issue with the SQL Server VSS Writer in this VM and that causes the checkpoint to fail. You can search for manual fixes but in the case of an otherwise functional SQL Server I chose to go for a repair install of SQL Server. The tooling for that is pretty good and it’s probably the fastest way to resolve the issues and any underlying ones we might otherwise still encounter.
After running a successful repair install of SQL Server we get greeted by an all green result screen.
So now we check vssadmin list writers again to make sure they are all healthy. If not, restart the SQL Server or other relevant service if possible. Sometimes you cannot fix it by restarting a service; in that case reboot the server. We did not need to do that. We just ran a new retry in Veeam Backup & Replication and were successful.
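If you want to script that health check rather than eyeball it, here is a minimal Python sketch. It assumes the English output layout of vssadmin list writers (a "Writer name" line followed by "State" and "Last error" lines). The function names are made up for illustration, and the sample text is a trimmed, illustrative example, not the actual output from this VM.

```python
import re
import subprocess


def unhealthy_writers(vssadmin_output: str) -> list[tuple[str, str, str]]:
    """Return (name, state, last_error) for each writer that is not stable and error free."""
    problems = []
    # Everything before the first "Writer name:" is the tool banner, so skip it.
    for block in vssadmin_output.split("Writer name:")[1:]:
        name = re.search(r"'([^']+)'", block)
        state = re.search(r"State: \[\d+\] (.+)", block)
        error = re.search(r"Last error: (.+)", block)
        if not (name and state and error):
            continue  # unexpected layout, ignore this block
        state_text = state.group(1).strip()
        error_text = error.group(1).strip()
        if state_text != "Stable" or error_text != "No error":
            problems.append((name.group(1), state_text, error_text))
    return problems


def run_vssadmin() -> str:
    """Run 'vssadmin list writers' (Windows only, elevated prompt) and return its output."""
    result = subprocess.run(
        ["vssadmin", "list", "writers"], capture_output=True, text=True, check=True
    )
    return result.stdout


# Illustrative, trimmed sample; real output has more fields per writer.
SAMPLE = """
Writer name: 'System Writer'
   State: [1] Stable
   Last error: No error

Writer name: 'SqlServerWriter'
   State: [8] Failed
   Last error: Non-retryable error
"""

for writer in unhealthy_writers(SAMPLE):
    print(writer)
```

On the VM itself you would feed it unhealthy_writers(run_vssadmin()) from an elevated prompt. An empty list means every writer reports Stable with no error.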
There you go. The storage live migration before the backup of that VM made me think we were dealing with an early Windows Server 2016 Hyper-V bug but that was not the case. Troubleshooting is also about avoiding tunnel vision.
It’s all about application consistent hardware VSS provider snapshots
I was browsing to see if I could already download Replay Manager 7.8 for our Compellent (SC) SANs. No luck yet, but I did find the release notes. There was a real gem in there on Off Host Backup Jobs with Veeam and Replay Manager 7.8. We’ll get back to that after the big deal here.
So what kind of goodness is in there? Well obviously there is the way too long overdue support for Windows Server 2016, including the Hyper-V role and its features. That is great news. We now have application consistent hardware VSS provider snapshots. I do not know what took them so long but they need to get with the program here. I have given this as feedback before and again at DELL EMC World 2017. The Compellent still is one of the best “traditional” centralized storage SAN solutions out there that punches far above its weight. On top of that, having looked at Unity from DELL EMC, I can tell you that in my humble opinion the Compellent has no competition from it.
Off Host Backup Jobs Veeam Replay Manager 7.8
Equally interesting to me, as someone who leverages Compellent and Veeam Backup & Replication with Off Host Proxies (I wrote FREE WHITE PAPER: Configuring a VEEAM Off Host Backup Proxy Server for backing up a Windows Server 2012 R2 Hyper-V cluster with a DELL Compellent SAN (Fiber Channel)) is the following. Under Fixed Issues we find:
RMS-24 Off-host backup jobs might fail during the volume discover scan when using Veeam backup software.
I have Off Host Proxies with transportable snapshots working pretty smoothly but it has the occasional hiccup. Maybe some of those will disappear with Replay Manager 7.8. I’m looking forward to putting that to the test and rolling forward with Windows Server 2016 for those nodes where we need and want to leverage the Compellent Hardware VSS provider. When I do I’ll let you know the results.
I’m back from attending, speaking, learning and sharing experiences and knowledge at VeeamON 2017 (and DELL EMC World before). It was a blast and I had the opportunity to engage in very interesting discussions with experts from around the globe.
As it was a Veeam event it will be no surprise that we got some very interesting information about the new Veeam offerings now as well as in the near future. Points of particular interest to me are:
- Veeam backup for file shares. Really this might solve my entire dilemma around virtualizing the very large capacity clustered file shares (100-200TB) I have to protect. I’m looking forward to testing and leveraging the various restore options like File share rollback. Handy when ransomware just struck.
- I like what Veeam is doing for disaster recovery in Microsoft’s Azure public cloud. Veeam’s Direct Restore and the new Veeam Powered Network (Veeam PN) facilitate and automate the disaster recovery process.
- The Veeam agent that can protect Windows and Linux based physical servers and endpoints, along with applications running in Microsoft Azure, AWS and other public clouds tied into Veeam Backup & Replication. We will also get support for failover clusters with this. Something I have been lobbying for!
- They announced native object storage support using Amazon S3, Amazon Glacier, Microsoft Azure Blob etc.
- They announced improved and extended Office 365 protection including OneDrive for Business and SharePoint Online. One of those improvements is very handy with multiple tenants.
Ransomware did something very significant beyond reminding everyone of the importance of recoverable backups and that is reigniting the interest in tape as a backup medium. The inherent “air gap” that tape offers has become more interesting to many people as ransomware can also delete or encrypt backups. So the 3-2-1 rule has never been more important and is being extended by additional rules of thumb. The product to investigate for me is Starwind Virtual Tape Library (VTL). What I like is that I can have an air gapped backup integrated with Veeam in Amazon AWS. Even while my entire business might run in Azure, this separates my data protection technology and location from my production / development environment. Ideal for maximum isolation to protect us from both external and insider threats and risks while avoiding the need to deal with physical tapes. Handling physical tapes is and remains a major concern for operational costs and RTO.
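To make the 3-2-1 rule concrete (at least 3 copies of the data, on at least 2 different media, with at least 1 copy offsite), here is a small Python sketch that checks a list of copies against it, plus an air gap check as one example of the extra rules of thumb. The class, the function and the example copies are all hypothetical, and I am assuming the production data counts as one of the three copies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BackupCopy:
    name: str
    media: str          # e.g. "disk", "virtual tape", "object storage"
    offsite: bool       # stored away from the production location
    air_gapped: bool = False


def check_3_2_1(copies: list[BackupCopy]) -> dict[str, bool]:
    """Evaluate the 3-2-1 rule plus an extra air gap check for a set of copies."""
    return {
        "3 copies": len(copies) >= 3,
        "2 media types": len({c.media for c in copies}) >= 2,
        "1 offsite": any(c.offsite for c in copies),
        "air gapped copy": any(c.air_gapped for c in copies),
    }


# Hypothetical layout: production in Azure, a Veeam repository next to it,
# and a virtual tape copy in AWS that is treated as offsite and air gapped.
copies = [
    BackupCopy("production data (Azure)", "disk", offsite=False),
    BackupCopy("Veeam repository (Azure)", "disk", offsite=False),
    BackupCopy("VTL copy (AWS)", "virtual tape", offsite=True, air_gapped=True),
]

for rule, ok in check_3_2_1(copies).items():
    print(f"{rule}: {'OK' if ok else 'MISSING'}")
```

Drop the VTL copy from the list and three of the four checks fail at once, which is the whole point of keeping that copy in a different technology and location.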
The new capabilities are very welcome to help solve the challenges we have now and the ones we see coming in the near future. We have plenty of ideas and plans to build the next generation of data protection and data availability solutions. Whatever the need, on-premises, IaaS, PaaS, SaaS, private/hybrid/public cloud, the need to protect data against loss and downtime is there in one form or another. That is and remains a primary responsibility of any business regardless of the technology. As always, my fellow MVPs and Vanguards are ready, willing and able to get the job done.
I’m travelling to New Orleans for VeeamON 2017. If you don’t know what that is, please check it out here. I can recommend this conference. Both the attendees and presenters are very active users of Veeam products and the workloads Veeam protects in real life. That makes for excellent sharing of experiences, insights and knowledge with your peers.
I have the distinct honor of presenting a joint session with Luca Dell’Oca (@dellock6 / http://www.virtualtothecore.com/en/) and Carsten Rachfahl (@hypervserver / https://www.rachfahl.de/). The presentation is called: Throw your backups into ANY window and is on Wednesday, May 17 | 13:30-14:30.
Choosing a storage solution for your backups can be a daunting task: Windows or Linux servers, SMB shares, SAN, NAS, deduplication appliances … But block cloning, a new feature in Windows 2016 and leveraged by Veeam Backup & Replication™, is promising to change this. Available for ReFS 3.1 file systems, this technology allows for insanely reduced transform times and spaceless GFS backups. Or at least, this is what marketing has told us so far, but how good is it in reality? Is an expensive and complex Storage Spaces Direct the only way to consume all the amazing new features? How can I design my new backup repository with these new options in mind? What about encryption and Veeam Scale-out Backup Repository™? Didier Van Hoye, Carsten Rachfahl (both Microsoft MVPs and Veeam Vanguards) and Luca Dell’Oca (Veeam cloud architect) have joined forces to bring you from-the-field information, tips, tricks and ideas to build your next Veeam backup repository with real-life tests and feedback gained from deploying this new powerful combination into multiple environments.
This session is complementary to the other ones given at VeeamON 2017, both the breakout sessions as well as some of the sessions the Microsoft MVPs are presenting at the booth. Those sessions combined will send you home with ideas and options on how to leverage Veeam in creative ways to achieve the best possible solution for your needs. Personally I’ll be discussing some of the options you have to get highly available backup targets leveraging ReFSv3.1 in brown field scenarios when a brand new Storage Spaces Direct deployment is not an option or when you don’t run Windows Server 2016 yet.
Next to that and between attending interesting sessions I’ll be available at the Veeam and Microsoft booths if you have questions or want to discuss the technologies. At the Microsoft booth I’ll be presenting a demo focused walkthrough on Discrete Device Assignment in Windows Server 2016.