Continuous available general purpose file shares & ReFSv3 provide high available backup targets

Introduction

In our previous two blog posts on Veeam and SMB 3 we’ve seen how and when Veeam Backup & Replication can leverage SMB Multichannel and SMB Direct. See Veeam Backup & Replication leverages SMB Multichannel and Veeam Backup & Replication Preferred Subnet & SMB Multichannel.The benefits of this are more bandwidth, high availability, better throughput and with RDMA low latency and CPU offload. What’s not to like, right? In a world where the compute and networks need keeps rising due to the storage capabilities (flash storage) pushing the limits this is all very welcome.

We have also seen earlier that Veeam B & R 9.5 leverages ReFSv3 in Windows Server 2016. This provides clear and present benefits in regards to space efficiencies and speed with many backup file related operations. Read Veeam Leads the way by leveraging ReFSv3 capabilities

When it comes to ReFSv3 in Windows Server 2016 most of the focus has gone to solutions based around Storage Spaces Direct (S2D). That’s a great solution and it is the poster child use case of these technologies.

But what other options do you have out there to build efficient and effective high available backup targets creatively except for S2D? What if you would like to repurpose existing hardware to build those? Let’s take a look together at how continuous available general purpose file shares & ReFSv3v3 provide high available backup targets

CSV, S2D, ReFSv3 & Archival Data

In Windows Server 2016, traditional shared storage (iSCSI, FC, Shared SAS, Shared RAID) with CSV are not recommend to be used with ReFSv3. Why isn’t exactly clear. The biggest impact you’ll see is the performance difference when not writing to the owner node of the CSV in this use case. Even with a well configured RDMA network that difference is significant. But that doesn’t mean that the performance is bad. It’s just that many of the super-fast meta data operations are relatively and significantly slower when compared each other, not that any of these two are slow.

clip_image002

Microsoft does state that an S2D with ReFSv3 and SOFS shares can be used for archival data. Storage spaces and ReFSv3 also have the benefit of offering automatic repair of corrupt data from a redundant copy on the fly even when needed. So yes, the best know supported scenario is this one.

Continuous available general purpose file shares and ReFSv3 provide high available backup targets

But what if we need a high available backup target and would love to leverage ReFSv3 with Veeam Backup & Replication 9.5? Well, you can have 95% of your cookie and eat it to. All this without ignoring the cautions offered.

We could set up SOFS shares on a Windows Server 2016 Cluster with ReFSv3 with traditional shared storage. Some storage vendors do state this is supported actually.

That only means you don’t have the auto repair functionality ReFSv3 combined with storage spaces offers. But perhaps you want to avoid the risk of using ReFSv3 with CSV in a non S2D scenario all together. What you could do is forgo ReFSv3 and use NTFS. How well this will work for archival data or backup is something you’ll need to test and find out how well this holds up. There is not much info is out there, only other cautions and warnings that might keep you up at night.

There is another scenario however and that is using Windows Server 2016 failover clustering to set up continuously available general purpose file shares that leverage SMB3 transparent failover.

The good news is that general purpose file shares (no CSV) do work consistently with ReFSv3 because such a share/LUN is only exposed on one cluster node at the time, the owner. By having multiple shares and setting preferred owners we can load balance the workload across all cluster nodes.

Thank to continuous availability for general file shares and SMB 3 transparent failover we can still get a high available backup target this way. The failover is fast enough to make this happen and all we see with Veeam Backup & Replication is a short pause in throughput before it resumes after failover. To put the icing on the cake, you can leverage SMB multichannel SMB Direct for both backup and restores.

I would take a sizeable whitepaper to walk through the setup so instead I’ll show you a a quick video of a POC we did in the lab here https://vimeo.com/212886392.

clip_image004

If you want to learn more come to the community & other conferences I’m speaking at and will be around for Ask The Experts time opportunities. I’ll be at the German Hyper-V community meet up, The Cloud & Datacenter Conference in Germany 2017, Dell EMC World 2017 and last but not least VeeamON 2017 (see  May 2017 will be a travelling month). 

Conclusion

What do you lose?

Potentially there is one big loss in regards to the capabilities of ReFSv3 with this solution when you are not using storage spaces. This is that you lose the capability to automatic repair of corrupt data. The ability of ReFSv3 to do so is tied into the redundant copies of Storage Spaces (parity/mirror).

What do you get?

That’s fine, the strength of this design is that you get the speed and space efficiencies of ReFSv3and high available backup targets in way more scenarios than “just” S2D. After all, not everyone is in a position to choose their storage fabric for backup targets green field or at will. But they might be able to leverage existing storage and opt to use SMB 3 for their data transport.

So even if you can’t have it all, you can still build very good solutions. It offers ReFSv3 benefits and high availability for your backup target via transparent failover with SMB transparent failover on continuous available general purpose file shares. This also only requires Windows Server 2016 Standard Edition, which is a cost saving. You get to leverage SMB Multichannel and SMB Direct. All this while not ignoring the cautions of using ReFSv3 in certain scenarios.

On top of that, if you use NTFS with this approach it will also work for Windows Server 2012 (R2) as the OS for the backup target cluster hosts.

Disclaimer

I do not work for or at Microsoft, nor am I perfect or infallible just because I’m an MVP. You’ll have to do your own testing and validation. From our testing and without ReFSv3 bugs ruining the show, to me this is a very valid and cost effective approach.

Migrate an old file server to a transparent failover file server with continuous availability

This is not a step by step “How to” but we’ll address some thing you need to do and the tips and tricks that might make things a bit smoother for you.

1) Disable Short file names & Strip existing old file names

Never mind that this is needed to be able to do continuous availability on a file share cluster. You should have done this a long time ago. For one is enhances performance significantly. It also make sure that no crappy apps that require short file names to function can be introduced into the environment. While I’m an advocate for mutual agreements there are many cases where you need to protect users, the business against itself. Being to much of a politician as a technologist can be very bad for the company due to allowing bad workarounds and technology debt to be introduced. Stand tall!

Read up on this here Windows Server 2012 File Server Tip: Disable 8.3 Naming (and strip those short names too. Next to Jose’s great blog read Fsutil 8dot3name on how to do this.

If you still have applications that depend on short file names you need to isolate and virtualize them immediately. I feel sorry for you that this situation exists in your environment and I hope you get the necessary means to deal with swiftly and decisively by getting rid of these applications. Please see The Zombie ISV® to be reminded why.

Some tips:

  • Only use the /F switch if it’s a non system disk and you can afford to do so as you’re moving the data LUN to a new server anyone. Otherwise you might run into issues. See the below example.image
  • If you stumble on path that are too long, intervene. Talk to the owners. We got people to reduce “Human Resources Planning And Evaluations” sub folder & file names reduced to HRMPlanEval. You get the gist, trim them down.
  • You’ll have great success on most files & folders but if they are open. Schedule a maintenance window to make sure you can run without anyone connected to the shares (Stop LanManServer during that maintenance window).image
  • Also verify no other processes are locking any files or folders (anti virus, backups, sync tools etc.)

2) Convert MBR disks to GPT if you can

With ever growing amounts of data to store and protect this makes sense. I’m not saying you need to start doing 64TB disks today but making sure you can grown beyond 2TB is smart. It doesn’t cost anything when you start out with GPT disks from the start.  If you have older LUNs you might want to use the migration as an opportunity to convert MBR LUNs to GPT. That means copying the data and all NTFS permissions.

Please see  NTFS Permissions On A File Server From Hell Saved By SetACL.exe & SetACL Studio for some tools that might help you out when you run into NTFS/ACL permissions and for parsing logs during this operation.

Here’s a useful Robocopy command to start out with:

ROBOCOPY L: V: /MIR /SEC /Z /R:2 /W:1 /V /TS /FP /NP /XA:SH /MT:16 /XD "System Volume Information" *RECYCLE* /LOG:"D:RoboCopyLogsMBR2GPTLUNL2V.txt"

3) Dump the existing shares on the old file sever into a text file for documentation an use on the new file server

Pre Windows Server 2012 the new SMB Cmdlets don’t work, but no fear, we have some other tools to use. Using NET SHARE does work and with you can also show the hidden and system share but the layout is a bit of a mess. I prefer to use.

Get-WmiObject –class Win32_Share > C:tempOldFileServerShares

It’s fast, complete and the layout is very user friendly. Which is what I need for easy use with PowerShell on the W2K12R2  file server. Some of you might say, what about the share security settings. 1) We’re going to cluster so exporting these from the registry doesn’t work and 2) you should have kept this plain vanilla and done security via the NFTS permissions on the folder structure only. But hey I’m a nice guy, so here’s a link to a community PowerShell script if you need to find out the share permissions: http://gallery.technet.microsoft.com/scriptcenter/List-Share-Permissions-83f8c419 I do however encourage you to use this time to consider just using security on NFTS.

4) Create the clustered file shares

Amongst the many gems in Windows Server 2012 R2 are the new SMB PowerShell Cmdlets. They are a simple and great way to create clustered files shares. Read up on these SMB Share Cmdlets and especially New-SmbShare

When we’ve unmapped the LUNs from the old file server and exposed them to the new file server cluster you’re ready to go. You can even reorganize the Shares, consolidate to less but bigger LUNs and, by just adapting the path to the share in the script make sure the users are not confused or nee to learn new shares and adapt how & what they connect to them. Here it goes:

New-SmbShare -Name "TEST2" -path "T:SharesTEST2" -fullaccess Everyone -EncryptData $True -FolderEnumerationMode AccessBased -ConcurrentUserLimit 0 -ScopeName TF-FS-MIG

First and foremost, this is where the good practice of not micro managing file hare permissions will pay back big time. If you have done security via NTFS permissions with AG(U)DLP principle to your folder structure granting should be breeze right?

Before you ask, no you can’t do the old trick of importing the registry export of the shares and their security settings form the old file server when you’re going to cluster the file shares. That might sound bad but with some preparation and the PowerShell I demonstrated above you’ll find it easy enough.

5) Recuperate old file server name (Optional)

After you have decommissioned the old file server you could use a cluster alias to keep the old file server UNC path. This has the drawback you will fall back to connecting to the SMB shares via NTLM as aliases don’t support Kerberos authentication. But there is another trick. Once you got rid of the old server object in AD you can rename. If you can do this you’ll be able to keep Kerberos for authentication.

So after you’ve gotten rid of the old server in Active Directory go to the file server role. Select properties and rename it to recuperate the old files server name.

image

Now look at the resources tab. Right click and select the properties tab of “Server Name”. Rename the DNS Name. That will update the server name and the DNS record. This will cause the role to go down temporarily.

image

Right click and select the properties tab of “File Server”. Rename the UNC path to reflect the older file server name.

image For good measure and to test everything works: stop and restart the cluster role, connect to the shares and voila live should be good. Users can access the transparent failover file server like they used to do with the old non cluster file server and they don’t sacrifice Kerberos to be able to do so!

image

Conclusion

I hope you enjoyed the tips and pointers on migrating an old file server to a  Windows Server 2012 R2 file share cluster. Remember that these tips apply for various permutations of P2V, V2V as well as for P2P migrations.