Disk to Disk Backup Solution with Windows Server 2012 – Part I

Backing Up 100 Plus Terabytes of Data Cheaply

When dealing with large amounts of data to back up you’re going to start bleeding money. Sure, people will try to sell you great solutions with deduplication, but in a lot of scenarios this is not a very cost effective solution. Dedupe in either backup hardware or software is very expensive and in some scenarios the cost cannot be justified. It’s also not very portable, by the way, unless you stick with certain vendors. Once you get into backing up more than 100TB you need to forget about overly expensive hardware & software. Just build your own solution. Now depending on your needs you might want to buy backup software anyway, but forget about dedupe licenses. Some of the more profitable hosting companies & cloud providers are not buying appliances or dedupe software either. They make real good money but they’d rather spend it on SUVs and swimming pools.

What Can You Do?

You can build your own solution. Really. You can put together some building blocks that scale up and out. You’ll need a dual socket server with two 8 core CPUs and 24GB of RAM, perhaps 32GB. Plug in some 6Gbps SAS controllers, hook those up to a bunch of 3.5” disk bays with 12 * 2TB or 3TB disks each and you’re good to go. You can scale out to about 8 disk bays if you don’t cluster. Plug in a dual port 10Gbps card. You’ll need that as you’ll be hammering that server. If you need more than this system, then scale out: put in a second, a third, etc. 3.5TB to 4TB of backup throughput per hour in total should be achievable.
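To put some numbers on that, here is a rough back-of-the-envelope sketch in Python. It only uses the figures from the paragraph above (8 disk bays, 12 disks per bay, 2TB or 3TB disks, 10Gbps ports, 3.5TB to 4TB per hour); the function names are made up for illustration, units are decimal (1 TB = 1000 GB) and protocol overhead is ignored, so treat the results as ceilings rather than promises.

```python
# Back-of-the-envelope sizing for one backup node.
# Assumptions: decimal units (1 TB = 1000 GB), no protocol or RAID overhead.

def raw_capacity_tb(bays, disks_per_bay, disk_tb):
    """Raw capacity before RAID parity or hot spares are subtracted."""
    return bays * disks_per_bay * disk_tb

def link_tb_per_hour(gbps):
    """Theoretical ceiling of a network link in TB per hour."""
    gigabytes_per_second = gbps / 8
    return gigabytes_per_second * 3600 / 1000

def backup_window_hours(data_tb, tb_per_hour):
    """Hours needed to move data_tb at a sustained rate."""
    return data_tb / tb_per_hour

print("Raw capacity, 2TB disks:", raw_capacity_tb(8, 12, 2), "TB")
print("Raw capacity, 3TB disks:", raw_capacity_tb(8, 12, 3), "TB")
print("One 10Gbps port tops out at", link_tb_per_hour(10), "TB/hour")
print("100TB at 3.5TB/hour takes", round(backup_window_hours(100, 3.5), 1), "hours")
print("100TB at 4TB/hour takes", round(backup_window_hours(100, 4.0), 1), "hours")
```

A single 10Gbps port tops out at 4.5TB per hour in theory, so the 3.5TB to 4TB per hour target sits close to what one port can deliver, which is one more reason to plug in that dual port card.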


When you buy the components from Supermicro and some online retailers you can do this pretty cheaply. Spare parts you say? Buy some cold spares. You can have a dozen disks on the shelf, a SAS controller and even a disk bay if you want. You could use hardware redundancy (RAID, hot spares) or use storage pools & spaces if you’re going the Windows Server 2012 route and save some extra money. Disk bay failure? Scale out so that even when you lose a node you still have three others up and running. Spread backups around. Don’t back up the same data only to the same node. I know that’s not perfect for deduplication with Windows Server 2012 but hey, you win some, you lose some. Checks & balances, right? If you need a bit more support get some DELL PowerVaults or the like. It depends on what you’re comfortable with and how deep your pockets are.
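To get a feel for what that redundancy costs you in capacity, here is a minimal sketch in Python. It assumes one disk bay with 12 * 2TB disks and the textbook parity overhead (one disk's worth for RAID 5, two for RAID 6). The name usable_tb is made up for illustration, and real controllers or storage spaces will lose a bit more to formatting and metadata.

```python
# Usable capacity of a single disk bay under different redundancy choices.
# Assumption: textbook parity overhead, no formatting or metadata losses.

PARITY_DISKS = {"RAID 5": 1, "RAID 6": 2}

def usable_tb(disks, disk_tb, raid_level, hot_spares=0):
    """Capacity left for data after parity and hot spares are set aside."""
    data_disks = disks - hot_spares - PARITY_DISKS[raid_level]
    if data_disks < 1:
        raise ValueError("not enough disks for this layout")
    return data_disks * disk_tb

for level, spares in (("RAID 5", 0), ("RAID 6", 0), ("RAID 6", 1)):
    capacity = usable_tb(12, 2, level, spares)
    print(f"{level} with {spares} hot spare(s): {capacity} TB usable")
```

Going from RAID 5 to RAID 6 with a hot spare costs you two disks' worth of capacity per bay in this sketch, which is cheap insurance at these disk prices.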

You can buy more storage than dedupe will ever save you & still come out with money to pay for the electricity. Okay, it’s less good for the penguins but trust me, those companies selling those solutions would fry a penguin for breakfast every day if it would make them money. Now talking about those penguins, the Windows Server 2012 deduplication feature could be providing me with the tools to save them, but that’s for another post. I hope this works. I’d love to see it work. I bet some would hate to see it work. So much perhaps that they might even consider making their backup format non-dedupable?

Tip for users: don’t use really cheap green SATA disks. They’re pretty environment friendly but the performance sucks. My view on “Green IT” is to right size everything, never oversubscribe, and let that infrastructure work hard for you. This will minimize the hardware needed and the performance is way better than with all the power saving settings and green hardware. Those will ruin the environment anyway, as you’ll end up buying more gear to compensate for the lack of performance, unless you just suffer the bad performance. Keep the green disks for the home user’s picture, movie & music collection and use 2TB/3TB SAS/NL-SAS. Remember that when you don’t cluster (shared storage) you can make do nicely without the enterprise NL-SAS disks.

Now I’m not saying you should do what I suggest here, but you might find it useful to test this on your own scale for your own purposes. I did it for the money. For the money? Yup, for the money. No, not for me personally, I don’t have a swimming pool and I don’t even own a car, let alone an SUV. But saving your company 100.000 or more in cash isn’t going to get you into trouble now, is it? Or perhaps this is the only way you’re going to afford to back up that volume of data. People don’t throw away data and they don’t care about budgets, so you’d better be able to restore their data. Which reminds me, you will also need a backup software solution that doesn’t cost an arm and a leg. That’s also a challenge, as you need one that can handle large amounts of data and has some intelligence when it comes to virtualization, snapshots etc. It also has to be easy to use, as simple as possible, as this helps ensure backups are made and are valid.

Are we trying to replace appliances or other solutions? No, we’re trying to provide lots of cheap and “fast enough” storage. Why fast enough? Reading the data & providing it to the backup device can be an issue as well, and pure speed on the target side is not useful if the sources can’t deliver. We need this backup space for when the shit really hits the fan and all else has failed. That doesn’t have to be a SAN crash or a SAN firmware issue ruining all your nice snapshots. It can also be the business detecting a mistake in a large data set a mere 14 months after the fact, when all replicas, snapshots etc. have already expired. I’m sure you’ve got quality assurance that is so rock solid that this would never happen to you but hey, welcome to my world.

Why I’m No Fan Of Virtual Tape Libraries

After implementing a couple of SANs with backup solutions I have come to dislike Virtual Tape Libraries. This is definitely technology that, for us, has never delivered the promised benefits. To add insult to injury it is overly expensive and only good for practicing hardware babysitting. When discussing this I’ve been told that I want things too cheap and that I should have more FTE to handle all the work. That’s swell, but the business and the people with the budgets are telling me exactly the opposite. That explains why in the brochures it’s all about reduced cost, with empty usage of acronyms like CAPEX and OPEX. But when that doesn’t really materialize the message to the IT Pros is to get more personnel and cash. In the best case (compared to calling you a whining kid) they’ll offer to replace the current solution with the latest and greatest that has, wonder oh wonder, reduced CAPEX & OPEX. Fool me once, shame on you; fool me twice, shame on me.

So one thing I’m not planning to buy, ever again, is a Virtual Tape Library (VLS). Those things have a shelf life of about 2 years max. After that they turn into auto disintegrating pieces of crap you’ll spend your time babysitting for the rest of their serviceable life. This means regularly upgrading the firmware to get your LUN(s) back, if you get them back that is. This means convincing tech support to come up with a better solution than restarting from scratch when they acknowledge that their OS never cleans up its own log files and thus one day just kicks the bucket. Luckily they did go the extra mile on that one after we insisted, and we got a workaround without losing all backups. Babysitting also means that replacing the battery kits of all shelves becomes a new hobby. You become so good at it that you have a better and faster way of doing it than the junior engineers they send, who happily exclaim “so this is what it looks like”. The latter is not a confidence builder. The disks fail at a rate of 1 to 2 per week until you’ve replaced almost all of them. Those things need to be brought down to fix just about anything. That means shutting down the disk shelves as well and cutting the power, not just a reboot, so yes, you need to be in the data center.

There is no RAID 6, no hot spares (global or otherwise). The disks cost an arm and a leg and have specialized firmware to make sure all runs fine and well. But they are plain, cheap 7.200 rpm 500 GB SATA disks that cost way too much. The need for special firmware I can understand but the high cost I cannot. The amount of money you pay in support costs and in licensing the storage volume is more than enough to make a decent profit. Swapping disks and battery kits isn’t hard and we do it ourselves, as waiting for a support engineer takes more time. We have spares at hand. We buy those when we buy the solution. We’ve grown wise that way. We buy a couple of units of all failure prone items at the outset of any storage project. Having only RAID 5 means that one disk failure puts your virtual tapes at a very high risk, so you need to replace it as soon as possible. Once they shipped us a wrong disk and our VTL went down the drain due to incorrect firmware on that disk. They demanded to know how it got in there. Well Mr. Vendor, you put it in yourself as a replacement for a failed disk. In the first year it often happened that they didn’t have more than 1 spare disk to ship. If anyone else has a VLS in your area you’re bound to hit that limit and have to wait longer for parts. They must have upped the spare parts budget to have some more on hand just for us, as we now get a steady supply.

When you look at the complete picture the cost of storage per GB on a VLS is as much as on 1st tier SAN storage. That one doesn’t fly well. At least the SAN has decent redundancy and is luckily a hundredfold more robust and reliable. Why buy a VLS when you can have a premier tier SAN for the same cost? For the VLS functionality? No sir, that also never lived up to its promises. It has come to the point that the VTL, due to the underlying issues with the device, is more error prone than our Physical Tape Library. That’s just sad. Anyway, we never got the benefits we expected from those VTLs. For disk based backup I don’t want a Virtual Tape Library System anymore. It just isn’t worth the cost and hassle.

Look, you can buy 100 TB of SATA storage, put it in a couple of Supermicro disk bays, add a 10Gbps Ethernet card to your backup network and you’re good to go. Hey, that even gives you RAID 6, the ability to add hot spares etc. You buy some spare disks, controllers, NICs, and perhaps even just build two of these setups. That would give you redundant backup to two times 100 TB for under 60.000 €. A VLS with 100TB, support and licensing will set you back five times that. Extending that capacity costs an arm and a leg and you’re babysitting it anyway, so why bleed money while doing that?
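If you want to see that comparison per terabyte, here is a small sketch in Python using only the figures from the paragraph above. It reads "five times" as five times the 60.000 € total and takes 60.000 € as the worst case for the self-built setup. Both are my interpretation of rough numbers, not quotes from any vendor, and the names are made up for illustration.

```python
# Rough cost per TB, using the ballpark figures from the text.
# Assumptions: 60.000 EUR is the upper bound for BOTH self-built setups together,
# and the VLS costs five times that total for a single 100 TB.

def cost_per_tb(total_cost_eur, capacity_tb):
    """Purchase cost divided by capacity, in EUR per TB."""
    return total_cost_eur / capacity_tb

DIY_COST_EUR = 60_000
DIY_CAPACITY_TB = 2 * 100
VLS_COST_EUR = 5 * DIY_COST_EUR
VLS_CAPACITY_TB = 100

diy = cost_per_tb(DIY_COST_EUR, DIY_CAPACITY_TB)
vls = cost_per_tb(VLS_COST_EUR, VLS_CAPACITY_TB)

print(f"Self-built: {diy:.0f} EUR/TB")
print(f"VLS:        {vls:.0f} EUR/TB")
print(f"The VLS costs {vls / diy:.0f} times more per TB")
```

Under these assumptions the gap per terabyte is even bigger than the headline price suggests, because the self-built price already buys you twice the capacity.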

Does this sound crazy? Is this blasphemy? The dirty little secret they don’t like you to know is that this is how cloud players are doing it. First tier storage is always top notch, but if you talk about backing up several hundreds of terabytes of data, the backup solutions by the big vendors are prohibitively expensive. This industry looks a lot like a mafia racketeering business. Well, if you don’t buy it you’ll get into trouble, you’ll lose your data. You won’t be able to handle it otherwise. Accidents do happen. The guys selling it even dress like mobsters in their suits. But won’t you miss out on cool things like deduplication? If your backup software supports it you can still have that. The licensing cost for this isn’t that bad a deal when compared to VLS storage costs. And do realize that instead of 2 * 100 TB you could then make do with 2 * 25 TB. Hey, that price just dropped even more.

When it comes to provisioning storage our strategy is to buy as much of your storage needs as possible during the acquisition phase. That’s the only time deals can be made. The amount of discount you’ll get will make you wonder what the markup in this market actually is. It must be huge. And storage can’t cost as much to build as they would like us to believe. Last time the storage sales guy even told us they were not making any money on the deal. Amazing how companies giving away their products have very good profit margins, highly paid employees with sales bonuses and 40.000 € cars. If a vendor or reseller ever tells you that things will get cheaper with time, they are lying. They are actually saying their profit margins will increase during the life cycle of your storage solution. Look, all 1st tier storage is going to be expensive. The best you can hope for is to get a good quality product that performs as promised and doesn’t let you down. We’ve been fortunate in that respect with our SAN solutions, but when it comes to backup solutions I’m not pleased with the state of that industry. Backups are extremely important life savers, but for some reason the technology and products remain very buggy, error prone and labor intensive, and become very expensive when the data volumes and backup requirements rise.

Keeping The Same Server Name & IP Address When Moving A Domain Controller, DNS, DHCP & WINS Server To New Hardware

Introduction

At one place where I provide infrastructure consulting & services the supportable life of some servers running Active Directory, DNS, WINS and DHCP has come and gone (5+ year old DELL PE1850) so I was asked to renew them. These are in fact the servers that were upgraded to Windows 2008 R2 some 16 months ago, right after Windows 2008 R2 went RTM. See this blog post about that: https://blog.workinghardinit.work/2010/01/22/dell-pe1850-domain-controller-upgrade-to-windows-2008-r2/.

For several reasons not all domain controllers or core infrastructure servers should be virtualized, so this was not just a matter of doing a P2V migration and being done with it. That would have made it very easy, as they are already running on the latest Windows version. But there were some extra requirements: they wanted to preserve the server names and IP addresses for these servers. Normally DHCP takes care of the DNS servers for clients. Servers that have static IP addresses without using DHCP reservations can be dealt with via scripting. But a lot of devices that need a DNS server are a pain to deal with. There are devices that do not accept FQDNs or can’t be DHCP clients. Often a management solution for setting the IP configuration on hardware devices is not in place, even when those tools exist (e.g. for network switches). Even if such software is in place, more likely than not it doesn’t cover all devices (printers, copiers, scanners, load balancers, …). You’re lucky if they know and have a list of all devices that need their DNS entries changed; sometimes you don’t even have that. So I understand the reasons behind their request. In a perfect world this might not be needed, but I’ve seldom seen that in reality.

So the names and IP addresses needed to be preserved. That can be done with some preparation and planning. There is the approach of restoring the server to new hardware using a Windows backup, but that’s not without its own issues. You could use one of the (paid) P2P migration tools that come with various backup solutions or dedicated disaster recovery tools. To do it for free and in the cleanest possible manner I chose to go with the approach you’ll find below. There are multiple domain controllers with DNS, WINS and DHCP set up for redundancy. This means we can take these out of production one by one to reclaim the names and IP addresses. If you didn’t have that you could even introduce a temporary one (virtualized) to give you that option.

Imagine you have two servers named bert.big.corp (192.168.2.10/24) and ernie.big.corp (192.168.2.20/24) with Active Directory, DNS, WINS and DHCP running on them. These will be replaced with new hardware whilst maintaining the name and IP address. Basically you stage the new servers with Windows 2008 R2 and install everything you need that doesn’t depend on computer name, IP address and domain membership (server management software from the hardware vendor, the UPS software). That way they are ready to go. You’ll need to wait to rename the new servers and add them to the domain, one by one, until you’ve demoted the current server that uses that name and IP address. So first you replace Bert, and only after that has been completed successfully and has been verified can you replace server Ernie. That way you can reclaim the name and IP address without losing those services at any given time. The end result is two new servers with the names Bert (192.168.2.10) and Ernie (192.168.2.20), and none of your network devices will need any DNS FQDNs or IP addresses changed. Do note that some devices only allow for one IP address or FQDN, so they might lose DNS name resolution for the period in which the old server is taken out of service and the new server is brought into service. This is not the only way to do it. I do not like to rename domain controllers and I have other roles to migrate such as WINS and DHCP. So the DNS & Active Directory migration I do is not like the one described in http://technet.microsoft.com/cs-cz/library/dd379526(WS.10).aspx. Perhaps Microsoft can comment on the best practice for all this, but this is how I approached it. This is the first time I did this for Windows 2008 R2 in production, as that hasn’t been out in the wild that long yet.

We’ll run through the process step by step for Bert. This needs to be done for every server that is being replaced. Check that your Active Directory, DNS and replication are healthy. If not, fix this first. Also check DHCP and WINS health. As always, make sure you have a good backup of the server and its roles before you attempt any of this.

Mind you, you can change the order of things to suit your needs & environment, but when using some options in the Server Migration Tools like –User and –Group you should pay attention, unless you want to wind up with a truckload of users/groups from a DC created as local users and groups on a member server. Oh yeah, I’ve done that. When that happens you’ll love the fact that you know some scripting. So if you do the DHCP migration, do both the export and the import on a domain controller, or do both on a member server (i.e. after demoting the domain controller). In the former case you’ll just get a bunch of warnings during the import that the users/groups already exist and are therefore skipped.

Decommissioning The Old Server

Demote the domain controller

Move all FSMO roles to another domain controller. Make sure you also move the time server configuration to the domain controller that now holds the PDC emulator role.

  • If you’ve manually altered the Active Directory replication topology, make sure you take care of that as well. If you let the KCC handle this, you’re good to go as it will do this for you. But once you intervene manually it’s your responsibility.
  • We demote the domain controller
  • Reboot and check all went well.
  • We wait for AD replication or we induce it ourselves.
  • Check AD, DNS & replication, if all is well continue.
DHCP Decommission On The Old Server

There is a great tool to do this. Check out TechNet http://technet.microsoft.com/nl-be/library/dd379545(WS.10).aspx. The benefit of using the Server Migration Tools is that we can also migrate the IP configuration from the old server to the new one in one step! Pretty cool! The details differ depending on what version of Windows you’re running. I’m at Windows 2008 R2 so I can install these tools via an elevated PowerShell prompt using:

  • Import-Module ServerManager
  • Add-WindowsFeature Migration (to install the Server Migration Tools)
  • We then take a DHCP backup (GUI or via an elevated command prompt using netsh dhcp server backup “c:\DHCPBackup”). Make sure the folder structure exists and is empty. Copy the folder structure off to a file share for safe keeping. Remember you want to give yourself options.
  • Export the DHCP configuration and content using an elevated PowerShell prompt:
    • Get-Service DHCPServer (running)
    • Stop-Service DHCPServer
    • Get-Service DHCPServer (stopped)
    • Add-PSSnapin Microsoft.Windows.ServerManager.Migration (load the Server Migration cmdlets)
    • Export-SmigServerSetting -FeatureID DHCP -Group -IPConfig -Path C:\DHCPservermig -Verbose (note we are working on a DC so we have no local users and hence no use for the –User switch, see TechNet http://technet.microsoft.com/cs-cz/library/dd379483(WS.10).aspx) and copy the folder structure to a file share for safe keeping. You’ll be asked to provide a password for the exported files. You’ll get feedback about the results due to the –Verbose switch. Review these and act accordingly. The below screenshot is an illustration of the output.

  • Unauthorize the DHCP Server using an elevated command prompt: Netsh DHCP delete server bert.big.corp  192.168.2.10
  • If you want you can also uninstall the DHCP server role.
WINS Decommission On The Old Server

We won’t be migrating WINS. We’ll just replicate from the other WINS server(s). But we’re a careful bunch and you have to give yourself multiple options to get out of bad situations, as the saying goes: “one is none, two is one”. Do make sure that you adjust the WINS replication topology if you have more than 2 servers so that all of them get the required registration changes. And just in case you ask, yes I know about the Windows 2008 R2 option to dump WINS and use DNS, but they did not need/want this.

  • Take a WINS backup (GUI) and copy the folder structure off to a file share for safe keeping.
  • Stop the WINS service, disable it and copy the WINS database (mdb) over to a file share for safe keeping as well, just in case.
  • Remove the server as WINS replication partner where it was added. It remains in there when you uninstall WINS (old technology). You can actually leave it in there in this case, as a server with that name will be back.
  • If you want you can already uninstall the WINS feature. This requires a reboot.

That’s it, we have now successfully stripped the old server Bert of all its previous roles. We can now change the IP address of the old server to 192.168.2.110 and rename the server to BertOld. I give it a restart and check to see if the DNS record is updated. If not, a swift restart of the netlogon service might help. It is now a plain member server in the domain that can be taken out of service later. This means we are ready to bring the new server Bert to life.
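If you’d rather script that DNS check than eyeball it, here is a minimal sketch in Python. The function dns_record_matches is made up for illustration and is not part of any Microsoft tooling. It relies on the local resolver through socket.gethostbyname, so a stale client side DNS cache can still fool it. The resolver is passed in as a parameter so the logic can be tried without touching a real network.

```python
import socket

def dns_record_matches(hostname, expected_ip, resolve=socket.gethostbyname):
    """Return True when hostname currently resolves to expected_ip."""
    try:
        return resolve(hostname) == expected_ip
    except OSError:
        # socket.gaierror is a subclass of OSError: the name does not resolve (yet).
        return False

# Example with the names and addresses used in this post (needs the real network):
#   dns_record_matches("bertold.big.corp", "192.168.2.110")
#   dns_record_matches("bert.big.corp", "192.168.2.10")   # once the new Bert is up
```

The same check comes in handy again later on, once the new server has taken over the name Bert and the address 192.168.2.10.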

Commissioning The New Server

Commissioning DHCP on the new server
  • Rename the new server to Bert.
  • Restart the server.
  • Make Bert a member of the domain.
  • Restart the server.
  • We now have a plain member server in the domain, named Bert on which we also install the Windows Migration Tools using an elevated PowerShell prompt:
    • Import-Module ServerManager
    • Add-WindowsFeature Migration
  • We copy the folder structure from the file share where we earlier copied the exported older DHCP settings and data, back to server Bert, i.e. C:\DHCPservermig
  • We import the IP configuration plus the DHCP configuration & database using the following cmdlet: Import-SmigServerSetting -FeatureID DHCP -Group -IPConfig NIC -SourcePhysicalAddress 00-14-72-7D-B3-C3 -TargetPhysicalAddress 44-2C-2B-AF-67-D2 -Force -Path C:\DHCPservermig -Verbose
  • Start-Service DHCPServer
  • Via an elevated command prompt we authorize Bert as a DHCP server using “netsh DHCP add server bert.big.corp 192.168.2.10”

Voila, we have Bert running as a DHCP server already!

Commissioning WINS on the new server

We install WINS, add the desired replication partners and configure the WINS replication schedule and options to your liking. This is the “cleanest” way. A backup restore is also an option but it could transplant old issues.

Manually trigger replication and check if you are getting the entries you need (nbtstat –RR, or via the WINS GUI).

Please note that the browser service is not running by default on Windows 2008 (R2), so if you need this for your clients you should set the service to automatic and start it.

Commissioning Active Directory and DNS on the new server
  • Make sure you can contact an existing domain controller before you promote Bert to become a domain controller. This can be assured by having the first DNS server in your IP configuration point to an operational DNS server in the domain.
  • Promote Bert to a domain controller. By default the DNS option is enabled. Take a look at http://technet.microsoft.com/cs-cz/library/dd379526(WS.10).aspx, which will walk you through the wizard.
  • Check the health of your Active Directory, replication and name resolution once more and keep a close eye on it the following days. You should be good to go.

Conclusion

Voila, a new Bert server is born. Not too bad, and a happy customer. Rinse and repeat for every domain controller you want to replace. Do make sure you check out the documentation on the Server Migration Tools and their options. Also, if you are running Windows 2008 (R2) and got there by upgrading from Windows 2000/2003, consider moving from FRS to DFSR for SYSVOL replication. As mentioned in a previous blog post of mine (https://blog.workinghardinit.work/2011/01/06/dcdiag-exe-problem-on-windows-2008r2-verifyenterprisereferences-indicates-problem-missing-expected-value-points-to-knowledge-base-article-q312862/) there is an absolutely brilliant step by step guide to get the move from FRS to DFSR completed without a problem, in a series by the storage team at Microsoft. You can find the first of a 5 part blog series over here: http://blogs.technet.com/b/filecab/archive/2008/02/08/sysvol-migration-series-part-1-introduction-to-the-sysvol-migration-process.aspx. I prefer to have this done before I introduce the new domain controllers. This helps me with my neurotic urge to have a SYSVOL folder named SYSVOL and not SYSVOL_DFSR.

Windows 2008 R2: The system image restore failed. Error details: The parameter is incorrect. 0x80070057

Note to self: read your own blogs on Windows 2008 R2 Native Backup :-). Yes people, Windows 2008 R2 Bare Metal restore to dissimilar hardware does work as long as you follow the rules and guidelines. Those are not super evidently documented but still, if I can find ’em you can too! But today we lost some time because we didn’t heed one of the rules that trip people up frequently. That rule is that the disk layout on the restore server can’t differ from the original one. I literally wrote “Pay close attention to the disk layout/ boot order as well, the restore doesn’t allow for variation from the original layout” in https://blog.workinghardinit.work/2010/01/27/using-windows-2008-r2-backups-to-go-virtual-2/. That means you need to simulate the same disk layout on the new hardware. If the new server has an extra disk, disable that one for the restore; if it has one less, add one. Another situation where the disk layout comes into play is when you boot from a USB stick with W2K8R2. If you leave it plugged in during the restore the recovery will fail. Because that extra attached disk isn’t the one containing the backup image, you’ll get a very harsh error:

“The system image restore failed. Error details: The parameter is incorrect. 0x80070057”


Not very helpful as explanations go, but it generally means you’ve got a disk layout issue. In this case it’s because you have the bootable USB stick attached. Once you’ve booted to the “Repair your computer” functionality, selected “Select a system image backup” and found your image to restore, you should remove the bootable USB stick from the server if you’re not going to be doing an install. Beware of this! Typically when you boot from DVD or PXE you wouldn’t even notice, but when using a bootable USB device with W2K8R2 you might forget that this changes the disk layout. So again, always pull the bootable USB stick from the server before you restore and you’ll be fine. Yes, the recovery will still work: as soon as you’ve booted you don’t need the media anymore, so you can unplug it safely. You can even attach another USB disk in its place containing the backups if you only have one USB port available. That will work because the disk with the backup itself is never taken into consideration and won’t cause any issues with the restore.

So we’ll never forget to heed our own warnings again (I hope). The good thing is we had some refresher training on restoring today and it’s all fresh in our minds again 🙂