Disk to Disk Backup Solution with Windows Server 2012 & Commodity DELL Hardware – Part II

As I blogged in a previous post, we’ve been building a Disk2Disk based backup solution with commodity hardware, as the appliances on the market are either too small in capacity for our needs, ridiculously expensive, sometimes just suck, or a combination of the above. Virtual Library Systems or Virtual Tape Libraries come to mind: one of my biggest technology mistakes ever, at least the ones I had, in my humble opinion.

Here’s a logical drawing of what we’re talking about. We are using just two backup media agent building blocks (server + storage) in our setup for now, but the design lets us scale out.

image

In a future post I hope to discuss Storage Spaces & Windows deduplication thrown into the mix.

So what do we get?

Not too shabby … > 1TB/Hour

image

To great …

image

In close up you are seeing just 2 Windows Server 2012 Hyper-V cluster nodes, each being backed up over a native LBFO team of 2*1Gbps NIC ports to one Windows Server 2012 Backup Media Agent with a 10Gbps pipe. Look at the max throughput we got …

image

Sure, this is under optimal conditions, but guess what? When doing backups from multiple hosts to two or more backup media servers we’re getting very fast backups at very low cost compared to some other solutions out there. This is our backup beast. More bandwidth needed at the backup media server? It has dual 10Gbps ports that can be teamed and/or leverage SMB 3.0 Multichannel. High volume hosts can use 10Gbps at the source as well.
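To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python of what the links can carry in theory. It assumes decimal units (1TB = 10^12 bytes) and ignores protocol overhead unless you pass an efficiency factor, so treat the results as ceilings, not as measurements from our setup. The function name and the `efficiency` parameter are mine, purely for illustration.

```python
def link_tb_per_hour(gbps: float, efficiency: float = 1.0) -> float:
    """Convert a link speed in Gbit/s to decimal terabytes per hour.

    efficiency is a hypothetical fudge factor (0..1) for protocol overhead.
    """
    bytes_per_second = gbps * 1e9 / 8 * efficiency
    return bytes_per_second * 3600 / 1e12


# One source host with an LBFO team of 2 x 1Gbps ports.
team_per_host = link_tb_per_hour(2 * 1)
# Two such hosts sending to the same backup media server.
two_hosts = 2 * team_per_host
# The 10Gbps pipe on the backup media server (target).
target_pipe = link_tb_per_hour(10)

print(f"2x1Gbps team, one host : {team_per_host:.2f} TB/hour")
print(f"two hosts combined     : {two_hosts:.2f} TB/hour")
print(f"10Gbps target pipe     : {target_pipe:.2f} TB/hour")
```

The point of the exercise: a couple of teamed 1Gbps ports per source host stay well below what a single 10Gbps port on the target can absorb, which is why several hosts can back up to one media server at the same time.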

Lessons learned

  • The Windows Server 2012 networking improvements rock. Upgrade and benefit from them! We’re seeing great results thanks to Multichannel leveraging RSS and in-box NIC teaming (LBFO).
  • A couple of 1Gbps NICs teamed on Windows Server 2012 work really well. Don’t worry about not having 10Gbps on all your hosts.
  • Having 10Gbps on your backup media hosts (target) is great as you’ll be pushing a lot of data to them from multiple (source) hosts.
  • Make sure your backup software supports enough streams before it keels over under the load you’re pushing through. More streams means more concurrent files (read VHDs/VMs) and thus more throughput and allows multichannel to shine over RSS capable NICs.
  • Find the sweet spot for the number of disks per node and total IOPS versus the throughput you can send to the backup media agents. 4 nodes of 50TB might be better than 2 nodes of 100TB. If you can, experiment a bit to find your optimal backup block size.
  • Isolate your backup network traffic from data traffic, either physically or by other means (QoS), and don’t route it all over the place to end up where it needs to be.
  • We’re doing this using Dell PowerConnect 5424 (end of life) / 5524 switches … no need for the real high end, very expensive gear to do this. The 10Gbps switch, well yes, that’s always high end at the moment.
  • Use JBODs with SAS/Storage Spaces & you’ll be fine. Select them carefully for performance. You can use bays like the MD3X00 if you want to replicate the backups somewhere, otherwise MD12x0 will do, or any other decent JBOD => even cheaper. You can also mix: some building blocks that can replicate & others on Storage Spaces/JBOD. Mix and match with different backup needs means you have flexibility. Note that at TechEd Europe (June 2012), in a session by DELL, they mentioned the need for a firmware update with the MD1200 to optimize performance with Storage Spaces.

It’s all about the money in a smart way!

As I said before, you will not get fired for:

  • Increasing backup throughput at least 4 fold (without dedupe)
  • Increasing backup capacity 3.5 fold (without deduplication)
  • Doing the above for 20% of the cost of the systems being replaced & of new offerings with specialized appliances (even at hilarious discount rates). That’s CAPEX reduction.
  • Helping to pay for the primary storage, DRC site & extra SAN for data replication in case of disaster
  • Making backups faster and more reliable & reducing OPEX (the difference for us is huge and important)
  • Putting an affordable scale up & scale out Disk2Disk backup solution in place so the business can safely handle future backup loads at very acceptable costs.
  • It’s a modular solution which we like. On top of that it’s about as zero vendor lock in as it gets. You can mix servers, bays, switches. Use what you like best for your needs. Only the bays have to remain the same within an individual “building block”.

Cost reduction is one thing but look at what we get whilst saving money… wow!

What am I missing? Specialized dedupe. Yes, but we’re going for the poor man’s workaround there. More on that later. As long as we get enough throughput that doesn’t hurt us. And given the cost we cannot justify it + it’s way too much vendor lock in. If you can’t get backups done without it, sure, you might need to walk that route. But for now I’m using a different path. Our money is better spent in different places.

Now how to get the same economic improvements from the backup software? Disk capacity licensing sucks. But we need a solution that can handle this throughput & load, is reliable, has great support & product information, gets support for new releases fast after RTM (come on CommVault, get a move on) and is simple to use ==> even more money and time saved.

Spin off huge file server project?

Why is support for new releases in backup software important? Because the lack of it is causing me delays. Delays cost me time, money & opportunities. I’m really interested in converting our large LUN file servers to Windows Server 2012 Hyper-V virtual machines, which I can now do rather smoothly thanks to the big VHDX sizes that are possible now, and in slashing the backup times of those millions of small files to pieces by backing them up per VHDX over this setup. But I’ll have to wait and see when CommVault will support VHDX files and GPT disks in guests, because they are not moving as fast as a leading (and high cost) product should. Altaro & Veeam have them beaten solid there and I don’t like to be held back.

DELL Compellent Hardware VSS Provider & Commvault on Windows Server 2012 Hyper-V nodes – Volume Shadow Copy Service error: Unexpected error querying for the IVssWriterCallback interface. hr = 0x80070005, Access is denied.

As you know by now I’ve been building a high throughput, large volume Disk2Disk backup solution and that has been rather successful. At optimal speed we get 2TB/Hour per backup media server. As we currently have two, we can get to 4TB/hour at maximum throughput. Currently, that is; we’ll see if more is possible. The solution, which I’ll blog about later, is based on the Dell Compellent hardware VSS provider (Dell Compellent Replay Manager 6.2.0.9), Windows Server 2012, CommVault 9.0 R2 and PowerVault storage as the target disks, and as it’s working now it is saving us many hundreds of thousands of Euros compared to dedicated D2D backup appliances or solutions.
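If you want to translate that throughput into a backup window, a minimal sketch could look like the one below. The 2TB/hour per media server is our optimal figure from above. The 20TB data set is a made-up example, and the idea that throughput scales linearly with the number of media servers is just an assumption for the sake of the arithmetic.

```python
def backup_window_hours(data_tb: float,
                        media_servers: int = 2,
                        tb_per_hour_per_server: float = 2.0) -> float:
    """Estimate the hours needed to back up data_tb terabytes.

    Assumes every media server sustains the same rate and that the
    load spreads evenly across them (an idealization).
    """
    if media_servers < 1 or tb_per_hour_per_server <= 0:
        raise ValueError("need at least one media server and a positive rate")
    return data_tb / (media_servers * tb_per_hour_per_server)


# Hypothetical 20TB full backup on two media servers at 2TB/hour each.
print(f"{backup_window_hours(20):.1f} hours")
# The same job if one media server is unavailable.
print(f"{backup_window_hours(20, media_servers=1):.1f} hours")
```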

The entire process has been a fairly smooth one, which was a relief as decent hardware VSS providers are not easy to come by, based on our experience and that of many colleagues. So, once again the DELL Compellent choice is turning out to be a good one. We did have to fix one small issue along the way.

Whilst using the DELL Compellent Hardware VSS Provider with CommVault Simpana 9.0 R2 to do host level backups of the virtual machines on Windows Server 2012 Hyper-V cluster nodes, we ran into a small issue. The backups run fine but we saw this error being logged:

Volume Shadow Copy Service error: Unexpected error querying for the IVssWriterCallback interface.  hr = 0x80070005, Access is denied.
. This is often caused by incorrect security settings in either the writer or requestor process.

Operation:
   Gathering Writer Data

Context:
   Writer Class Id: {e8132975-6f93-4464-a53e-1050253ae220}
   Writer Name: System Writer
   Writer Instance ID: {3f4965d8-10ac-411b-bf6d-6a607f237775}

image

The description found here was not helpful in resolving this. This error is found all over the internet with just about any backup product. The possible causes and solutions that are suggested are as follows:

Change the Language for non-Unicode programs to English (United States)

This was not the solution in our case, and I would have been surprised if it had been.

image

Eliminate the error condition by adding the access permissions for the domain account that the Compellent Replay Manager Service for Microsoft Servers (VSS) runs under to COM Security of the affected server

As the domain account used for this service is a member of the local administrators group that already has the permissions this would also have surprised me but we tested it anyway. It turns out this is not the cause either.

image

But for your information this is how it’s done:

You can eliminate the error condition by adding the access permissions for the Network Service account to the COM Security of the affected server. To add the access permissions for the Network Service account, do the following:

  • From the Start Menu, select Run. The Run dialog opens.
  • In the Open field, input dcomcnfg and click OK. The Component Services dialog opens.
  • Expand Component Services, Computers, and My Computer.
  • Right-click My Computer and click Properties on the pop-up menu. The My Computer Properties dialog opens.
  • Click the COM Security tab.
  • Under Access Permission click Edit Default. The Access Permissions dialog opens.
  • From the Access Permissions dialog, add the DOMAIN\compellentreplayservice account with Local Access & Remote Access allowed (cluster => not just a local host).
  • Close all open dialogs.
  • Restart the computer.

The real cause & solution

According to Microsoft support this issue occurs when using a 3rd party backup program that utilizes Windows VSS (Volume Shadow Copy Service) and has its own requestor. This is indeed the case here. “It looks like the requestor (the backup application) does not allow system writer to call back into their process and hence generates the error in the application log.” This sounds very plausible.

The fix:

  1. The following example grants access to the "DOMAIN\compellentreplayservice" account.
  2. Click Start and type regedit in the search box.
  3. In the Registry Editor window, navigate to: HKEY_LOCAL_MACHINE>SYSTEM>CurrentControlSet>Services>VSS>VssAccessControl
  4. Add a DWORD value with the name "DOMAIN\compellentreplayservice" and set the value to “1”.
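If you have more than one node to fix, the same change can be captured in a .reg file that you can review before importing it. Below is a small Python sketch that only renders the text of such a file. The helper name is hypothetical, the account is a placeholder for your own service account, and I’m assuming the value name is written as DOMAIN\account. Backslashes inside a quoted name have to be doubled in a .reg file, which the helper takes care of.

```python
# Registry key from the fix above, written out as a full path.
VSS_ACCESS_KEY = (
    r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\VSS\VssAccessControl"
)


def render_reg_file(account: str) -> str:
    """Render .reg file text that sets a DWORD value of 1 for the account.

    account is expected in DOMAIN\\user form (placeholder, use your own).
    """
    # Escape backslashes and double quotes as the .reg format requires.
    name = account.replace("\\", "\\\\").replace('"', '\\"')
    lines = [
        "Windows Registry Editor Version 5.00",
        "",
        f"[{VSS_ACCESS_KEY}]",
        f'"{name}"=dword:00000001',
        "",
    ]
    return "\r\n".join(lines)


print(render_reg_file("DOMAIN\\compellentreplayservice"))
```

Save the output to a file, check it, and import it on each node. As with the manual steps, test this before you let it loose on production hosts.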

image

image

image

And voila: the error has gone when running backups.

image

I assume no responsibility if you do this in your environment, but I can say that all this works perfectly in our CommVault Simpana 9.0 R2 setup whilst backing up the virtual machines on our Hyper-V cluster nodes at the host level using the DELL Compellent hardware VSS provider. And yes, this is Windows Server 2012; careful testing and planning help when being an early adopter.

Attending the Dell Tech Summit EMEA

As you read this I’m preparing to get on my way to the DELL Tech Summit in Lisbon, Portugal for a few days. I’ll be discussing the needs we have from them as customers (and their competition actually for that matter) when it comes to hardware in the Microsoft landscape in the era of Windows Server 2012.

image

I’m very happy and eager to tell them what, in my humble opinion, they are doing wrong, what they are doing right and even what they are not doing at all. I believe in giving feedback and interacting with vendors. Not that I have any illusion of self importance as to the impact of my voice on the grand scheme of things, but if I don’t speak up nothing changes either. As Intel and Microsoft are there as well, this makes for a good selection of the partners involved. So here I go:

  1. More information on storage features, specifications and roadmaps
  2. Faster information on storage features, specifications and roadmaps
    • Some of these are in regard to Windows Server 2012 & System Center 2012 (Storage Pools & Spaces, SMI-S, ODX, UNMAP, RDMA/SMB 3.0 …) and some are more generic, like easier & better SAN/Cluster failover capabilities, ease of use, number of SCSI 3 persistent reservations, etc.
  3. How to address the IOPS lag in the technology evolution. Their views versus my ideas on how to tackle them until we get better solutions.
  4. Plans, if any, for Cluster In a Box (CiB) building blocks for Windows Server 2012 Private Cloud solutions.
  5. When does convergence make sense and when not, cost/benefit wise (and at what level). I’d like a bit more insight into what DELL’s vision is and how they’ll execute it. What will new storage options mean to that converged network, i.e. SMB 3.0, Multichannel & RDMA capable NICs? Now convergence always seems tied to one tech/protocol (VOIP in the past, FCoE at the moment) and it shouldn’t be; there are plenty of other needs for loads of bandwidth (Live Migration, Storage Live Migration, Shared Nothing Live Migration, CSV redirected mode, …).

Now while it’s important to listen to your customers, this is not easy if you want to do it right, far from it. For one, we’re all over the place as a group. This is always the case unless you cater to a specialized niche market. But DELL serves both consumers and enterprises, from 1 person shops to Fortune 500 companies, in all fields of human endeavor. That makes for a nice cocktail of views and opinions, I suspect.

Even more important than listening is processing what you hear from your customers. Do you ignore it, react to it, or take it away as more or less valuable information? Information on which to act or not, to use in decision making, and perhaps even in executing those decisions. And let’s face it, without execution decisions are pretty academic exercises. In the end management is in control, and for all the feedback, advice and research that is gathered and done, they are at the steering wheel and they are responsible for the results.

One thing that I do know from my fellow MVPs and the community is that for the past 12 months any vendor who would address those questions with a good plan and communications would be a top favorite while selecting hardware at many customers for a lot of projects.

How To Deploy Windows Server 2012 on DELL UEFI Now – Notes From The Field

The most current UEFI OS Deployment on an R810 is a bit finicky when you want to deploy Windows Server 2012 using the normal procedure & selecting “Other OS”, as the entry for Windows Server 2012 is obviously not in there yet. The problem is that the Windows installer doesn’t seem to create the best practice UEFI partitions. It just seems to create a 320MB System Reserved partition and the rest is for your OS installation as a Primary partition. In a good (by the book UEFI) install you’d see a layout like this (from Sample: Configure UEFI/GPT-Based Hard Drive Partitions by Using Windows Setup):

image

image

The reason for this seems to be that the firmware is still not 100% up to date for how Windows Server 2012 deals with UEFI installations. This I learned via my very helpful Twitter friend Florian Klaffenbach.

While an update for the system firmware is in the works and won’t be too long away, let me share with you how I dealt with this issue. It’s a bit more work but it gets the job done. At least for me on an R810 with BIOS version 2.7.4.

I’m copying the step by step from Microsoft Windows Server 2012 Early Adopter Guide – Dell here and adapting it to how I worked around the issue. It’s “magic”.

Installing Using Dell Unified Server Configurator

  1. Connect the keyboard, monitor, mouse, and any additional peripherals to your system
  2. Turn on the system and the attached peripherals.
  3. Press <F10> in the POST to start the System Services. The Initializing UEFI. Please wait… and the Entering System Services…Starting Unified Server Configurator messages are displayed.
  4. In the Unified Server Configurator window, if you want to configure hardware, diagnostics, or set changes, click the appropriate option. If no changes are required, press OS Deployment. => You can opt to start with a cleanly built VDisk, which is best and should suffice. But it doesn’t. We’ll clean the disk later on in Step 14 anyway.
  5. In the Operating System Deployment window, click Deploy OS. The Configure or Skip RAID window is displayed. If Redundant Array of Independent Disks (RAID) is configured, the window displays the existing RAID configuration details.
  6. Select Go directly to OS Deployment. If RAID is not yet configured, configure it at this time.
  7. Click Next. The Select Operating System window is displayed with a list of compatible operating systems.
  8. Choose Microsoft Windows Server 2012 and click Next. NOTE: If Microsoft Windows Server 2012 is not listed, choose any other operating system.
  9. Choose whether you want to deploy the operating system in UEFI or BIOS mode, and click Next => I do not get this choice if UEFI is already on in the BIOS settings
  10. In the Insert OS Media window, insert the Windows Server 2012 media and click Next.
  11. In the Reboot the System screen, follow the instructions on the screen and click Finish. If a Windows operating system is already installed on your system, the following message is displayed: Press any key to boot from the CD/DVD …Press any key to begin the installation. If you used a clean VDisk this is no issue
  12. In the Windows Setup screen, select the appropriate option for Language, Time and Currency Format, and Keyboard or Input Method.
  13. Click Next to continue.
  14. STOP => Select to REPAIR your system and launch a command line. From there you start diskpart and run the following commands on the disk where you want to deploy Windows Server 2012:
    • select disk 0
    • clean
    • convert gpt

      In my case this is Disk 0. This is what the installer should be able to do automatically with a clean disk anyway, but it doesn’t happen.

      Now DO NOT navigate to the X: root and launch setup again. Exit the repair console and shut down the server.

  15. Start the server
  16. Press <F10> in the POST to start the System Services. The Initializing UEFI. Please wait… and the Entering System Services…Starting Unified Server Configurator messages are displayed. => DO NOT TOUCH ANYTHING ANYMORE. It will take longer than expected but you will boot into the installation of Windows 2012 again.
  17. In the Windows Setup screen, select the appropriate option for Language, Time and Currency Format, and Keyboard or Input Method.
  18. Click Next to continue.
  19. On the next page, click Install Now.
  20. In the Operating System Install screen, select the operating system you want to install. Click Next. The License Terms window is displayed, click Next.
  21. In the Which Type of Installation Do You Want screen, click Custom: Install Windows only (advanced), if it is not selected already.
  22. In the Where do you want to install Windows screen, specify the partition on which you want to install the operating system. To create a partition and begin installation:
    1. Click New
    2. Specify the size of the partition in MB, and click Apply. A “Windows might create additional partitions for system files” message is displayed. => NOW THE UEFI partitions on the GPT disk are created.
    3. Click OK. Select the newly-created operating system partition and click Next.
      The Installing Windows screen is displayed and the installation process begins. After the operating system is installed the system reboots. You must set the administrator password before you can log in for the first time.
  23. In the Settings screen, enter the password, confirm the password, and click Finish.
    The operating system installation is complete.
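The three diskpart commands from step 14 can also be put in a script file and fed to diskpart with its /s switch, which saves typing if you have several servers to do. Here is a minimal Python sketch that generates such a script ahead of time. The helper name and the output file name are hypothetical, and since clean wipes the selected disk, double check the disk number before you run anything.

```python
def diskpart_clean_gpt_script(disk_number: int) -> str:
    """Build a diskpart script that wipes a disk and converts it to GPT.

    WARNING: 'clean' removes all partition information on the disk.
    """
    if disk_number < 0:
        raise ValueError("disk number must be zero or positive")
    commands = [
        f"select disk {disk_number}",
        "clean",
        "convert gpt",
        "",  # trailing newline so the last command is terminated
    ]
    return "\n".join(commands)


# Disk 0 in my case; adjust for your own server.
script = diskpart_clean_gpt_script(0)
with open("clean-gpt.txt", "w", newline="\r\n") as handle:
    handle.write(script)
print(script)
```

From the repair command line you would then run diskpart /s with the path to that file, instead of typing the commands by hand.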

image

Now, while this worked for me on the Dell R810 with BIOS 2.7.4, I give no guarantees whatsoever. You’ll have to test it yourself or wait for the firmware update that is coming soon. Anyway, perhaps it helps some of you out there!