FREE WHITE PAPER: Configuring a VEEAM Off Host Backup Proxy Server for backing up a Windows Server 2012 R2 Hyper-V cluster with a DELL Compellent SAN (Fiber Channel)

Whilst I’m attending TechEd North America 2014, being able to learn and network again with the community at large I think this is a good moment to share. So here’s a little contribution to that community: it’s a white paper on How to configure a VEEAM Off Host Backup Proxy server for backing up a Windows Server 2012 R2 Hyper-V cluster with a DELL Compellent SAN (Fiber Channel).

VEEAM Back & Replication is currently under and extensive test before we make the decision. So far it is going (very) well. And no, VEEAM or DELL did not sponsor this. It’s sharing with the community. A prosperous, successful community makes my professional live better to!

I have to applaud VEEAM for allowing such easy access to their software for trials, to their engineers for assistance and to their support forum and resources even without yet being a paying customer. This is how it should be: vendors having faith in their products both in quality and ease of use. It’s a refreshing experience as some vendors don’t want you to get your hands on new versions of their products even as a existing paying customer “because due to its complexity we might get the wrong impression”. It’s even near impossible with some to get a test license for the lab of the version you currently use with some of them. Not so with VEEAM and that’s great.

I hope you enjoy it. As you might realize I don’t have this kind of infrastructure in my home lab so some of the screenshots have been edited / blurred. I’m sure you can live with that. Otherwise feel free to provide me with the gear in a paid for data center.

Use Cases For Fluid Cache For SAN With DELL Compellent In High Performance Virtualization With Windows 2012 R2

Fluid Cache For SAN

At Dell World 2013 in Austin Texas I spent some time talking to engineers & managers about Fluid Cache For SAN. The demo in the keynote was enough to grab my distinct attention, especially as a Compellent customer.

What is it?

Dell already has Fluid Cache for DAS available in its PowerEdge servers. Now it’s time to bring this to their best SAN offering, the Compellent, and make Fluid Cache shared storage suitable for shared storage clustering. The way to do that cost effective and high performance is to build on the success of on board (local to the server) high performance storage and make that shared through software in a physical shared nothing replication/sync model. To make this happen they use a 10/40Gbps Ethernet solution leveraging RoCE (RDMA over Converged Ethernet). Yes that very technology I have been investing time & effort in for SMB Direct and which we leverage for CSV & Live Migration traffic and with SOFS in Windows 2012 R2.

Basically the super low latency an high throughput enable the memory to be synced across all nodes in a cluster and as such each node sees all the cluster memory. For redundancy you will need at least 3 nodes in a cluster. Dell will scale Fluid Cache For SAN to 128 nodes. Windows Server 2012 R2 can handle 64 nodes, which some think is ridiculously high, but then again, Dell aims even higher so it’s not as weird as you think. Some people have really huge computing needs. Just remember that 10 years ago you probably found that 16GB of RAM was extravagant.

Why this architecture?

Dell uses server based “shared nothing flash storage” & high speed low, latency synchronization to create a logical cluster wide shared pool of flash memory. This means the achieve stellar low latency as the flash storage is inside of the servers, close to the processors and as such delivers excellent performance for the workloads. Way better that “just” flash only SAN can. For data integrity they commit the data only when it written to the Express Flash drive(s) of one server and then also to another and verified. This needs to happen very fast and that where the RoCE network come sin to play. Later, at less speed critical times the data is pushed out to the Compellent SAN for storage. If that SAN is a flash based setup think about the capabilities this gives you in performance. Likewise data reads of the SAN that are highly active are pushed from the Compellent SAN and cached (also in multiple copies) on the Express Flash modules. While two servers with each a copy of the data on Express Flash modules would suffice DELL requires at least three. This is just a plain common sense N+1 redundancy design to have high availability even when a node fails. A cool think to note is that you can build larger clusters with 3 nodes each having one or more Express Flash modules and additional nodes don’t need it as long as they can read the cache of those 3. So the cost of this can be managed. The drawback is that you don’t read & write to a local Express Flash module on those extra node. If you want that you’ll need to put more $ on the table.

clip_image001

The thing to note here is that the Servers/SAN are connected over RoCE/RDMA. Well this look familiar. What technology can also leverages RDMA? SMB Direct in Windows Server 2012 R2! And where do we use this amongst other things? Storage IO in Scale Out File Server, CSV traffic, Live Migration …

The big benefit of this design it just it takes your SAN to the next level but also, if DELL does this right, they won’t break any of the good stuff like VSS aware snapshot with Replay Manager, Automatic Data Tiering, Live Volumes, Live Migration etc. A lot of the high IOPS/low latency solutions out their based on fast local flash break a lot of the good stuff and reduces centralized storage management. What if you can have your cookie and eat it to?

Demo Time at Dell World

Dell demonstrated an Oracle database load on an eight node cluster of PowerEdge R720 servers with Intel Xeon E5 processors, with Linux (no Windows Server 2012 R2 support yet Sad smile) These servers each used 350GB PCI-Express flash cards (“only” PCI-Express 2.0 capable by the way). This cluster, using a Compellent SAN, managed to get a result of more than 5 million IOPS at 6 millisecond response times, delivering 12,000 tps for 14,000 client connections. This was read only. If they dropped the Fluid Cache for SAN they  could “only” achieve 2,000 clients (6 times less clients due to 4 time less transactions and 99% slower responses). See this movie for more info: http://www.youtube.com/watch?v=uw7UHWWAtig and watch the keynote from Dell World 2013 here

clip_image003

Where would I use this?

Cost will determine use cases and this is unknown for now. We can only look at what Fluid Cache for DAS cost right now and speculate. I for one hope/bet on the fact that DELL won’t price itself out of the marked (they have a lot of competition from big & smalls players in a “good enough is good enough” world with a cloud mindset all around). So make it too expensive and we might be happy with “just” 500.000 IOPS at much less cost. It’s a fine line. Price it right, support it well and you might win the bulk of sales in the storage wars. Based on the DAS solution we’re looking at least 8000 $ per server (license is 3500 for DAS => see http://www.theregister.co.uk/2013/03/05/dell_fluid_cache_server_acceleration/  + cost of PCI-Express flash module (> 5000$ => see http://en.community.dell.com/techcenter/extras/chats/w/wiki/4480.3-5-2013-techchat-fluid-cache-for-das.aspx) &  yearly maintenance fee. Then we need to factor in the cost of the RDMA/RoCE capable NICs & the (dedicated) Force10 switches – 2 for redundancy  that are at least 10Gbps (S4810?) or probably 40Gbps (S6000?) & cabling. So this is not a cheap solution and you won’t just “throw it in” on a quiet afternoon to see what it does for you. Not that there will be a DIY “throw it on kit” I think, it’s a step above plug and play. If they keep it affordable and do some other things for Windows Server 201 R2 / Hyper-V they can be the absolute number one SAN vendor for any Microsoft customer. But that’s another blog topic.

Cost is indeed something that might make it a show stopper for us. I just can’t tell yet. One of the key factors is that if affordable it could give the point solutions we now see pop up more and more in storage. a run for their money. While cheap and workable in good enough is good enough scenarios it takes some of the centrally shared storage advantages away. But if we ever do a state full VDI project in an environment with high end physical desktops (500GB or more local storage, SSD disks, 8 core CPU, 8-32GB DDR3, dual or more screens) that run ArcGis, AutoCad, Visual Studio, SQL Server, Outlook with 5GB mailboxes, large documents & huge files (images) this might be the enabler we need to make VDI happen & works as desired with current all-purpose Compellent SANs. IIf the price is right it could enable VDI in now “NO GO” scenarios.  And those are plentiful, … Another use case I see is a virtualized SQL Server environment on Hyper-V with general purpose shared storage. We’re doing very well but the day might arrive that we need those IOPS in order to take it even further. Don’t laugh but realize how much IOPS an SSD delivers to a workstation today and that’s what your users expect & demand. Want to fail at VDI? Have it outperformed by a 4 year old physical PC where you slapped an SSD into.

Could it help in keeping excessive IOPS away from the SAN, making that capable of doing more over a longer life time? In other words can it play a part in the Storage QoS issue across server/cluster/storage system issue for non workload aware storage solutions?

So I might have some homework to do. For our next SQL Server cluster we’ll look at the next generation of servers & start counting our PCI Express slots. We now already consume 4 PCI-Express slots for 2*FC & 2*Dual Port 10Gbps) in our Hyper-V design. That’s another discussion, but they are built purposely for performance under any condition & to be highly redundant. A health check / improvement track by Microsoft for our SQL Server environment has proven this to be an outstanding setup (nice e-mail to see your bosses get by the way). I digress, free PCI-Slots should not be an issue, as we also don’t need the FC cards in the Fluid Cache Nodes. The storage IO uses the RoCE network, to which the Compellent SAN attaches.

Cost is very important in determining if we’ll ever get to deploy it. The cloud is here, and while that is far from cheap either, it’s a lot easier to sell than internal IT for various reasons. That’s just how the powers that be roll right now & how things are.

What we’ll get in our hands

There was a lot of love between Dell & Samsung at Dell World. Talking to Dell at the server/storage/networking boots I understood that Samsung is going to produce flash modules for this that support PCI-Express 3.0 and the industry backed NVM Express host interface for solid state drives which will reduce latency with 1/3 compared to now. As it seems they will produce higher capacity cards than what was used in the demos (800 GB and 1.6 TB). So capacity will increase & latency will drop even more. They leverage the Force10 10Gpbs or 40Gbps switches for the RoCE network. As Dell & Mellanox are cooperating heavily (Mellanox Collaborates with Dell to Deliver 10/40GbE Solution for Mainstream Servers and Networking Solutions) my bet is on Mellanox for the cards. Broadcom is not there yet for it to happen in time and Intel has no RoCE cards afaik. They seem to be playing the waiting game before they jump in.

Magic Ball Time, Speculation & Questions.

I’m not a DELL Server / Storage designer or architect, and those that are don’t tell me to plaster it all over the internet, so this really is magic ball time …

image

I’ll show my ignorance on what Samsung does under the hood when I hear that the next generation of DELL servers can have 6TB of RAM I can only speculate that with the advent of DDR4 in servers & ever dropping cost the path is open to leverage NV-RAM disk for the read/write cache in Fluid Cache for SAN as well a bit like what IDT did http://us.generation-nt.com/idt-announces-world-first-pci-express-gen-3-nvme-nv-dram-press-3732872.html. The persistence comes from writing the DRAM content to NAND at shutdown, can we do that fast enough at 1.6 TB sized caches? Can we fit enough of  those modules on a card? What would that do for IOPS & latency? Does that even make sense at this moment in time?

What if we could leverage the DDR4 dims in the server itself? This would perhaps cut costs and also save us some valuable PCI-Express 3.0 slots for our 10Gbps or better addiction Smile. Sure there is no persistence than but the content is distributed redundantly over the cluster anyway? Is that safe enough to make it feasible? What if we need to shut down the cluster? I guess it’s not that easy and perhaps we just need to make sure future motherboards have 8 or more PC-Express 3.0 slots & not worry about that. Or move to 40/100Gbps & have less need for NICs. Yeah that’s what was said of 10Gbps in the early days …

Support for Windows?

While it’s not there yet I have absolutely no doubt that they will bring it to Windows Server 2012 R2 and higher. Well Windows is a huge on premise market for native workloads like SQL Server, VDI and Hyper-V. The number of sales opportunities in the Microsoft ecosystem is growing (despite cloud) while others are stagnant or dropping. On top of that the low cost of Hyper-V leaves money to be spent on Fluid Cache for SAN. As Dell is in business to make money, they will not leave that big chunk of cash on the table.

When can we get our hands on this technology?

Timing wise that will be early to late Q2 in 2014, which is my best guestimate. Interesting times people, interesting times

Great Hardware Support Equals Fast Windows 2012 R2 Implementation

I love it when a plan comes together

We adopted Windows 2012 when right after it went RTM in august 2012. Today we’re are already running Windows 2012 R2 and ready to step up the pace. If you are a VAR/ISV that does not have fast & good support for Windows Server 2012 R2 consider this your notice. You can’t lead from behind. Get your act together and take an example from Altaro. Small, sure, but good & fast. How do we get our act together so fast? Fast? Yes, but it does take time and effort.

As it turns out, we’re pretty well of with the DELL hardware stack. The generation 11 and 12 servers are supported by DELL and on the Windows Server Catalog for Windows Server 2012 R2.

image

For more information on Dell Server inbox driver support see: Windows Server 2012 R2 RTM Inbox Driver Support on Dell PowerEdge Servers. By the way I can testify that we’ve run Windows Sever 2012 R2 successfully on 9th Generation hardware (PowerEdge 1950/2950).

We’ve been running tests since Windows 2012 R2 Preview on R710/R720 and it has been a blast. We’ve kept them up to date with the latest firmware & drives via SUU. And for our Intel X520 and Mellanox ConnectX-3 we’ve had rapid support as well.

So what more could you want? Well support from your storage array vendor I would think. I’m happy to report that Storage Center 6.4 has been out since October 8th and it supports Windows Server 2012 R2. Dell Compellent Adds MLC SSD Tier – Bests 15K HDDs on Price and Performance. Mind you on a lazy Sunday afternoon 2 quick e-mails to CoPilot got me the answer that Storage Center 6.3.10 also supports Windows Server 2012 R2.  Sweet!

And that’s not just DELL, the Dell Compellent Storage Center 6.4 is fully Windows Server 2012 R2 logo certified! That’s what you want to see from you vendor. Fast & excellent support.

image

Here’s the entire DELL hardware line up with Windows Server 2012 R2 support. Happy upgrading & implementation! If you have Software Assurance you’re set to reap the benefits of that investment today!

To my all employers / clients, you see now, told you so. Now, I have a thing in common with Col. Hannibal Smith, I love it when a plan comes together Open-mouthed smile.

image

I know some of you think that all the testing, breaking, wrecking of Preview bits, RTM & GA versions we do looks like chaos. Especially when you visually add the test server & switch configurations. But that’s what it looks like to YOU. To the initiated this is well executed plan, dropping all assumptions, to establish what works & will hold up. The result is that we’re ready today and by extension, so are you.

Windows Server 2012 R2 Unmap, ODX On A Dell Compellent SAN Demo

UNMAP & ODX Video

Some things are easier to show using a video so have a look at a video on UNMAP/ODX used with Windows Server 2012 R2 and Compellent SAN:

You can also go directly to the Vimeo page by clicking on the below screen shotimage

We start out with a 10.5TB large thinly provisioned LUN that has about 203GB of space in use on the SAN. So the LUN on the SAN might be 10.5TB and windows sees a volume that is 10.5TB only the effective data stored consumes storage space on the SAN. That ought to demonstrate the principle of thin provisioning adequately Smile. The nice PowerShell counter is made possible via the Compellent PowerShell Command Set.

We then copy 42GB worth of ISO files inside a Windows Server 2012 virtual machine from a fixed VHD to a dynamically expanding VDHX. Those are nice speeds. And look at how the size of the VHDX file grows on the CSV volume and how the space used on the SAN is growing. That’s because the LUN is thinly provisioned.

Secondly we copy the same ISO files to a fixed size VHDX. Again, some really nice speeds. As the VHDX is fixed in size you do not see it grow. When looking at the little SAN counter however we do see that the thinly provisioned LUN is using more storage capacity.

Once that is done we see that the total space consumed on the SAN for that CSV LUN has risen to 284GB. We then delete the data from both dynamically expanding VHDX and are about to run the Optimize-Volume command when we notice that the SAN has already reclaimed the space. So we don’t run the optimize command. Keep that in mind. By the way, this process is done as part of standard maintenance (defrag) and some NTFS check pointing mechanism that’s run every 5 minutes and sends down the info from the virtual layer to the physical layer to the SAN. During demo’s it’s kind of boring to sit around and wait for it to happen Smile. Just remember that in real life it’s a zero touch feature, you don’t need to baby sit it.

We then also delete the ISO files from the fixed VHDX and run Optimize-Volume G –Retrim and as result you see the space reclaimed on the SAN. As this is a fixed disk the size of the VDHX will not change. But what about the dynamically expanding VHDX? Well you need to shut it down for that. But hey, nothing happens. So we fire it up again and do run Optimize-Volume H –Retrim before shutting it down again and voila.

So what do you need for this?

Rest assured. You don’t need the most high end, most expensive, complex and proprietary SAN hardware to get this done. What you need is good software (firmware) on quality commodity hardware and you’re golden. If any SAN vendor wants to charge you a license fee for ODX/UNMAP just throw them out. If they don’t even offer it walk away from them and just use storage spaces. There are better alternatives than overpriced SANs lacking features.

I’ve found that systems like Equalogic & Compellent are in the sweet point for 90 % of their markets based on price versus capabilities and features.  Let’s look at the a Compellent for example. For all practical intend this SAN runs on commodity hardware. It’s servers & disk bays. SAS to the storage & FC, iSCSI or SMB/NFS for access. With capable hardware the magic is in the software. Make no mistake about it, commodity hardware when done right, is very, very capable. You don’t need a special proprietary hardware & processors unless for some specialized nice markets. And if you think you do, what about buying commodity hardware anyway at 50% of the cost and replacing it with the latest of the greatest commodity hardware after 4 years and still come out on top cost wise whilst beating the crap out of that now 4 year old ASIC and reaping the benefits of a new capabilities the technology evolutions offers? Things move fast and you can’t predict the future anyway.