Some SAN Storage Fun

At the end of the day I was doing some basic IO tests on some LUNs on one of the new Compellent SANs. It’s amazing what 10 SSDs can achieve … we can still beat them in certain scenarios, but it takes 15 times more disks. But that’s not what this blog is about. This is about goofing off after 20:00 following another long day in another very long week; it’s about kicking the tires of Windows and the SAN now that we can.

For fun I created a 300TB LUN on a DELL Compellent, thin provisioned of course, as I only have 250TB 🙂

I then mounted it to a Windows 2008 R2 test server.


The documented limit of a volume in Windows 2008 R2 is 256TB when you use a 64K allocation unit size. So I tested this limit by trying to format the entire LUN and create a 300TB simple volume. I brought it online, initialized it as a GPT disk, created a simple volume with an allocation unit size of 64K and, well, that failed with the following error:

[Screenshot: failed format of the 300TB volume]

There is nothing unexpected about this. It has to do with the maximum NTFS volume size supported on a GPT disk, which depends on the cluster size that is selected at the time of formatting. NTFS is currently limited to 2^32-1 allocation units. This yields a 256TB volume when using 64K clusters. However, this has only been tested to 16TB, or 17,592,186,040,320 bytes, using a 4K cluster size. You can read up on this in Frequently asked questions about the GUID Partitioning Table disk architecture. The table below shows the NTFS limits based on cluster size.

Cluster size        Largest NTFS volume
512 bytes           2TB
1KB (1,024)         4TB
2KB (2,048)         8TB
4KB (4,096)         16TB
8KB (8,192)         32TB
16KB (16,384)       64TB
32KB (32,768)       128TB
64KB (65,536)       256TB
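
If you want to sanity-check those numbers, they follow straight from the 2^32-1 allocation unit limit mentioned above. Here’s a minimal Python sketch (my own illustration, not anything from Microsoft’s tooling) that derives the maximum NTFS volume size for each cluster size:

```python
# Maximum NTFS volume size per cluster size, derived from the
# 2^32 - 1 allocation unit (cluster) limit discussed above.
MAX_CLUSTERS = 2**32 - 1

def max_ntfs_volume_bytes(cluster_size_bytes: int) -> int:
    """Largest NTFS volume, in bytes, for a given cluster size."""
    return MAX_CLUSTERS * cluster_size_bytes

if __name__ == "__main__":
    for cluster in (512, 1024, 2048, 4096, 8192, 16384, 32768, 65536):
        size_tb = max_ntfs_volume_bytes(cluster) / 2**40  # bytes -> TB (binary)
        print(f"{cluster:>6} byte clusters -> ~{size_tb:,.0f}TB maximum volume")
```

Running it reproduces the table: roughly 2TB at 512 byte clusters up to 256TB at 64K clusters, which is exactly the wall my 300TB format attempt ran into.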

This was the first time I had the opportunity to test these limits. I formatted part of that LUN to a size close to the limit and then formatted the remainder as a second simple volume.


I still need to get a Windows Server 2012 test server hooked up to the SAN to see if anything has changed there. One thing is for sure: you could put at least three 64TB VHDX files on a single volume in Windows. Not too shabby 🙂. It’s more than enough to get just about any backup software into trouble. Be warned, MSFT tested and guarantees performance & behavior up to 64TB in Windows Server 2012, but beyond that you’d better do your own due diligence.
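
As a quick back-of-the-envelope check on that claim, here’s a tiny Python sketch; the 5% margin for metadata and free space is just my own assumption for illustration:

```python
# Rough check: how many 64TB VHDX files fit on a 256TB NTFS volume?
TB = 2**40
volume_size = 256 * TB   # largest NTFS volume with 64K clusters
vhdx_max = 64 * TB       # maximum VHDX size in Windows Server 2012
margin = 0.05            # assumed allowance for metadata & free space

usable = volume_size * (1 - margin)
print(f"{int(usable // vhdx_max)} x 64TB VHDX files fit on one volume")
```

That prints 3, which is where the “at least three” above comes from; with a smaller margin you could squeeze in a fourth.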

The next thing I’ll do when I have a Windows Server 2012 host hooked up is create a 64TB VHDX file and see if I can go beyond that before things break. Why? Well, because I can and I want to take the new SAN and Windows Server 2012 for a ride to see what boundaries we can push. The SANs are just being set up, so now is the time to do some testing.

Some Thoughts On Buying State Of The Art Storage Solutions Anno 2012

Introduction

I’ve been looking into storage intensively for some time. At first it was reconnaissance. You know, just looking at what exists in software & hardware solutions. At that phase it was purely about functionality, as we found our current, more traditional SANs a dead end.

After that there was the evaluation of reliability, performance and support. We went hunting for both satisfied and unsatisfied customers, experiences etc. We also considered whether a pure software SAN on commodity hardware would do for us, or whether we still need specialized hardware, or at least the combination of specialized software on vendor-certified and supported commodity hardware. Yes, even if you have been doing things a certain way for a long time and been successful with it, it pays to step back and evaluate whether there are better ways of doing it. This prevents tunnel vision and creates awareness of what’s out there that you might have missed.

Then came the job of throwing out vendors who we thought couldn’t deliver what was needed and/or who have solutions that are great but just too expensive. After that came the ones whose culture and reputation were not suited to or compatible with our needs & culture. So that big list became (sort of) a long list, which eventually became a really short list.

There is a lot of reading, thinking, listening and discussing done during these phases, but I’m having fun as I like looking at this gear and dreaming of what we could do with it. But there are some things in the storage world that I find really annoying and odd.

Scaling Up or Scaling Out With High Availability On My Mind

All vendors, even the better ones in our humble opinion, have their strong and weak points. Otherwise they would not all exist. You’ll need to figure out which ones are a good or the best fit for your needs. So when a vendor writes or tells me that his product X is way above the others, and that a competitor’s product Z only competes with the lower-end Y in his own portfolio, I cringe. Storage is not that simple. On the other hand they sometimes overcomplicate straightforward functionality or operational needs if they don’t have a great solution for it. Some people in the storage world have really gotten trivializing the important and complicating the obvious down to an art. No brownie points for them!

One thing is for sure: when working on scalability AND high availability, things become rather expensive. It’s a bit like the server world: scale up versus scale out. Scaling up alone will not do for high availability except at very high cost. Then you have the scalability issue. There is only so much you can get out of one system and the last 20% becomes very expensive.

So, I admit, I’m scale-out inclined. For one, you can fail over to multiple less expensive systems, and if you have an “N+1” scalability model you can cope with the load even when losing a node. On top of that you can and will use this functionality in your normal operations. That means you know how it works and that it will work during a crisis. Work and train in the same manner as you will when the shit hits the fan. It’s the only way you’ll really be able to cope with a crisis. Remember, chances are you won’t excel in a crisis but will fall back to your lowest mastered skill set.
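
To make that “N+1” idea a bit more concrete, here’s a hypothetical little Python sketch (the node counts are just examples, not our actual setup) showing how much headroom each node needs so the survivors can absorb the load when one node fails:

```python
# N+1 sizing: if one node out of n can fail, the remaining n-1 nodes
# must be able to carry the full load between them.
def max_utilization_per_node(n_nodes: int) -> float:
    """Highest safe steady-state utilization per node when sized for N+1."""
    return (n_nodes - 1) / n_nodes

for n in (2, 3, 4, 6):
    print(f"{n} nodes: keep each node below {max_utilization_per_node(n):.0%} load")
```

With only 2 nodes you can never load either one past 50%, which is why scale-out gets more attractive (and cheaper per usable unit of capacity) as the node count grows.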

Oh by the way, if you do happen to operate a nuclear power plant or such please feel free to work both fronts for both scalability & reliability and then add some extra layers. Thanks!

Expensive Scale Up Solutions On Yesterday’s Hardware?

I cannot understand what keeps the storage boys back so long when it comes to exploiting modern processing power. Until recently they all still lived in the 32-bit world, running on hardware I wouldn’t give to the office temp. Now I’d be fine with that if the prices reflected it. But that isn’t the case.

Why did (does) it take ‘em so long to move to x64? That’s been our standard server build since Windows 2003, for crying out loud, and our clients have been x64 since the Vista rollout in 2007. It’s 2012, people. Yes, that’s the second decade of the 21st century.

What is holding the vendors back from using more cores? Realistically, if you look at what’s available today, it is painful to see vendors touting dual quad-core controllers (finally, and with their software running x64) as their most recent achievement. Really, dual quad-core, anno 2012? Should I be impressed?

What’s this magic limit of 2 controllers with so many vendors? Did they hard-code a 2 in the controller software and lose the source code of that module?

On the other hand, what’s the obsession with 4 or more controllers? We’re not all giant cloud providers, and please note my ideas on scale out versus scale up earlier.

Why are some spending time and money on ASIC development for controllers? You can have a commodity motherboard with four sockets and 8, 10 or 12 cores per socket. Just buy them AND use them. Even the ones using commodity hardware (which is the way to go long term due to the fast pace and costs) don’t show that much love for lots of cores. It’s cheap and easy, even when you need a processor or motherboard upgrade. It’s not some small or miniature device where standard form factors won’t work. What is wrong in your controller software that you all seem so slow in going that route? You all talk about how advanced, high-tech and future-driven the storage industry is; well, prove it. Use the 16 to 32 cores you can easily have today. Why? Because you can use the processing power, and also because I promise you all one thing: that state-of-the-art newly released SAN of today is the old, obsolete junk we’ll think about replacing in 4 years’ time, so we might not be inclined to spend a fortune on it 😉. Especially not when I have to do a forklift upgrade. Been there, done that, and I’d rather not do it again. Which brings us to the next point.

Flexibility, Modularity & Support

If you want to be thrown out of the building you just need to show even the slightest form of forklift upgrade for large or complex SAN environments. Don’t even think about selling me very expensive, highly scalable SANs with overrated and bureaucratic support. You know the kind, where the responsiveness in a crisis is 1/10 of what you get when an ordinary disk fails.

Flexibility & Modularity

Large and complex storage solutions that cost a fortune and need to be ripped out completely, and/or where upgrades over their lifetime are impossible or cost me an arm and a leg, are a no-go. I need to be able to enhance the solution where it is needed and I must be able to do so without spending vast amounts of money on a system I’ll need to rip out within 18 months. What I’m after is more like a perpetual, modular upgrade model where, over the years, you can enhance the solution and keep using what is still usable.

If that’s not possible and I don’t have too large or complex storage needs, I’d rather buy a cheap but functional SAN. Sure, it doesn’t scale as well, but at least I can throw it out for a newer one after 3 to 4 years. That means I can replace it long before I hit the scalability bottleneck, because it wasn’t that expensive. Or if I do hit that limit I’ll just buy another cheap one and add it to the mix to distribute the load. Sure, that takes some effort, but in the end I’m better and cheaper off than with expensive, complex, highly scalable solutions.

Support

To be brutally honest, some vendors read their own sales brochures too much and drank the Kool-Aid. They think their support processes are second to none and the best in the business. If they really believe that, they need to get out into the field and open up their eyes. If they just act like they mean it, they’ll soon find out when the money starts talking. It won’t be talking to them.

Really, some of you have support processes that are only excellent and easy in your dreams. I’ll paraphrase a recent remark on this subject about a big vendor: “If vendor X’s support quality and level of responsiveness were only 10% of the quality of their hardware, buying them would be a no-brainer”. Indeed, and as things stand it’s a risk factor or even a show stopper.

Look, all systems will fail sooner or later. They will. End of story. Sure, you might be lucky and never have an issue, but that’s just that. We need to design and build for failure. A contract with promises is great for the lawyers. Those things, combined with the law, are their weapons on their battlefield. An SLA is great for managers & the business. These are the tools they need for due diligence and to check it off on the list of things to do. It’s CYA to a degree, but that is a real aspect of doing business and being a manager. Fair enough. But for us, the guys and gals of ICT who are the boots on the ground, we need rock-solid, easily accessible and fast support. Stuff fails, we design for that, we build for that. We don’t buy promises. We buy results. We don’t want bureaucratic support processes. I’ve seen some where the overhead is worse than the defect and the only aim is to close calls as fast as they can. We want a hot line and an activation code to bring down the best support we can, as fast as we can, when we need it. That’s what we are willing to pay real good money for. We don’t like a company that sends out evaluation forms fishing for a good score after we’ve had a failed disk replaced. Not when that company fails to appropriately interpret a failure that brings the business down and ignores signals from the customer that things are not right. Customers don’t forget that, trust me on this one.

And before you think I’m arrogant: I fail as well. I make mistakes, I get sick, etc. That’s why we have colleagues and partners. Perfection is not of this world. So how do I cope with this? The same way as when we design an IT solution. Acknowledge that fact and work around it. Failure is not an option, people, it’s pretty much a certainty. That’s why we make backups of data and why we have backups for people. Shit happens.

The Goon Squad Versus Brothers In Arms

As a customer I never ever want to have to worry about where your interests lie. So we pick our partners with care. Don’t be the guy that acts like a gangster in the racketeering business. You know, the ones who all dress pseudo-upscale to hide the fact that they’re crooks. We’re friends, we’re partners. Yeah sure, we’ll do grand things together, but I need to lay down the money for their preferred solution, which seems to be the same whatever the situation and environment.

Some sales guys can be really nice guys. Exactly how nice tends to depend on the size of your pockets. More specifically, the depth of your pockets and how well they are lined with gold coin is what matters here. One tip: don’t be like that. Look, we’re all in business or employed to make money, fair enough, really. But if you want me to be your client, fix my needs & concerns first. I don’t care how much more money some vendor partnerships make you or how easy it is to only have to know one storage solution. I’m paying you to help me and you’ll have to make your money in that arena. If you weigh partner kickbacks higher than our needs, then I’ll introduce you to the door marked “EXIT”. It’s a one-way door. If you do help address our needs and requirements, you’ll make good money.

The best advisors – and I think we have one – are those that remember where the money really comes from and whose references really matter out there. Those guys are our brothers in arms and we all want to come out of the process good, happy and ready to roll.

The Joy

The joy simply is great, modern, functional, reliable, modular, flexible, affordable and just plain awesome storage. What virtualization / private cloud / database / Exchange systems engineer would mind getting to work with that? No one, especially not when, in case of serious issues, the support & responsiveness proves to be rock solid. Now combine that with the Windows 8 & Hyper-V 3.0 goodness coming and I have a big smile on my face.

Video Interview on CSV & Storage Design by Carsten Rachfahl

I already mentioned that during the Experts2Experts Virtualization Conference I met a lot of great people and I presented on High Performance & High Availability networking for Hyper-V clusters (10Gbps goodness). Some of the people I met I already knew from the online community and others were unknown to me until that event. Among the attendees we found some of the usual virtualization suspects in our community, like Aidan Finn, Jeff Wouters, Carsten Rachfahl and Ronnie Isherwood.

Now, Carsten Rachfahl is an MVP in Virtual Machine expertise, but he’s also a dynamic entrepreneur who shows a lot of initiative. Using social media he is really making an effort to get people & customers to notice important snippets of information by providing easy and fast access to them. He’s very active as a speaker, on Twitter and on his blogs. On top of that he does podcasts and video interviews. For Hyper-V information go to http://www.hyper-v-server.de/ which you can also use as an entry point for his other sites focusing on several aspects of IT in the Microsoft sphere in Germany, like cloud computing & licensing. There you’ll also find the videos of interviews on these subjects. It’s quite an impressive endeavor.

Carsten took the opportunity to make some videos with all the above suspects on various subjects and he recently released our interview.

In this video we continued the discussion that Aidan started on CSV, and we briefly touched on a subject you could make hour-long documentaries about: storage options in Windows Hyper-V now and in the years to come. Enjoy!

BriForum Europe 2011 & The Experts Conference Europe 2011

Great news from the educational & conference front. First of all, I’m attending BriForum in London, United Kingdom in May (http://briforum.com/Europe/index.html). That’s good news; normally we’d have to pop over the big pond to go to that one, so this is pretty neat. And it’s timely, due to some prospecting I’m doing for disaster recovery, business continuity and application-aware storage in a virtualized environment. It’s a good match and I hope to get into some educational discussions about the challenges we all face. Some of the storage vendors we’re interested in are there as well, so there is certainly some potential to make it a good experience.

And it was just recently confirmed that The Experts Conference is coming to Europe. TEC2011 Europe will be held in Frankfurt, Germany from October 17th to October 19th 2011. This conference is high quality and created to fill the needs of the most experienced users, which is one of the reasons I would like to attend. The more you learn & grow, the more you bump into the next level of challenges, and being able to learn from high-level content and interact with experienced speakers and attendees who are dealing with the same issues can be very rewarding. Attendees of TechEd have a way to measure the level of the sessions: they are all supposed to be Level 400 only. Quest is hosting this, so they certainly should be able to round up the expertise. I’m going to make it to the new “track” at this conference, and that’s “Virtualization & Cloud”. More information can be found here: http://www.theexpertsconference.com/europe/2011/virtualization-cloud-training/overview/

The timing of these conferences is pretty good. As I said, we’re doing a lot of prospecting right now and hope to get a lot of information from attending these. For anyone interested in why I attend conferences and why I think they are valuable, see my blog post on this subject: https://blog.workinghardinit.work/2010/06/05/why-i-find-value-in-a-conference/