Belgian TechDays 2013 Sessions Are Online

Just a short heads up to let you all know that the sessions of TechDays 2013 in Belgium are available on the TechNet site. The slide decks can be found at http://www.slideshare.net/technetbelux

In case you want to see my two sessions you can follow these links:

Now there are plenty more good sessions, so I encourage you to browse and have a look. Kurt Roggen's session on PowerShell is a great one to start with.

Windows Server 2012 NIC Teaming Mode “Independent” Offers Great Value

There, I said it. In switching, just like in real life, being independent often beats the alternatives. In switching, that alternative would be stacking. Windows Server 2012 NIC teaming in switch independent, active-active mode makes this possible. And if you do want or need stacking for link aggregation (i.e. more bandwidth) you might go the extra mile and opt for vPC (Virtual Port Channel a la CISCO) or VLT (Virtual Link Trunking a la Force10 – DELL).

What, have you gone nuts? Nope. Windows Server 2012 NIC teaming gives us great redundancy with even cheaper 10Gbps switches.

What I hate about stacking is that during a firmware upgrade the whole stack goes down, so there is no redundancy at that point. On the cheaper switches stacking also often costs a lot of 10Gbps ports (there are no dedicated stacking ports). The only way to work around this is to design your infrastructure so you can evacuate the nodes in that rack, so that upgrading the stack doesn’t affect the services. That’s nice if you can do it, but it’s also rather labor intensive. If you can’t evacuate a rack (which has effectively become your “unit of upgrade”) and you can’t afford the vPC or VLT kind of redundant switch configuration, you might be better off running your 10Gbps switches independently and leveraging Windows Server 2012 NIC teaming in switch independent mode, active-active. The only reason not to do so would be the need for bandwidth aggregation in all possible scenarios, which only LACP/Static teaming can provide, but in that case I really prefer vPC or VLT.
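For reference, setting up such a switch independent, active-active team in Windows Server 2012 is a one-liner. A minimal sketch only; the team and adapter names below are made up for the example:

# Minimal sketch of a switch independent NIC team (both members are active by default).
# "Team10G", "NIC1" and "NIC2" are hypothetical names, adjust to your environment.
New-NetLbfoTeam -Name "Team10G" -TeamMembers "NIC1","NIC2" `
                -TeamingMode SwitchIndependent -LoadBalancingAlgorithm HyperVPort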

Independent 10Gbps Switches

Benefits:

  • Cheaper 10Gbps switches
  • No potential loss of 10Gbps ports for stacking
  • Switch redundancy in all scenarios, provided the cluster networking is set up correctly
  • Switch configuration is very simple

Drawbacks:

  • You won’t get > 10Gbps of aggregated bandwidth in every possible NIC teaming scenario

Stacked 10Gbps Switches

Benefits:

  • Stacking is available with cheaper 10Gbps switches (often at the cost of some 10Gbps ports)
  • Switch redundancy (but not during firmware upgrades)
  • Get up to 20Gbps of aggregated bandwidth in any scenario

Drawbacks:

  • Potential loss of 10Gbps ports
  • Firmware upgrades bring down the stack
  • Potentially more “complex” switch configuration

vPC or VLT 10Gbps Switches

Benefits:

  • 100% Switch redundancy
  • Get > 10Gbps aggregated bandwidth in any possible NIC team scenario (e.g. via LACP, see the sketch after this list)

Drawbacks:

  • More expensive switches
  • More “complex” switch configuration
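If you do go the vPC/VLT route and want that aggregated bandwidth, the only thing that really changes on the Windows side is the teaming mode. Again a minimal sketch with made-up names, and it assumes the switch ports are configured as an LACP port channel across the vPC/VLT pair:

# Minimal sketch of an LACP team; it requires a matching port channel on the switch side.
# "Team10G-LACP", "NIC1" and "NIC2" are hypothetical names.
New-NetLbfoTeam -Name "Team10G-LACP" -TeamMembers "NIC1","NIC2" `
                -TeamingMode Lacp -LoadBalancingAlgorithm TransportPorts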

So all in all, if you come to the conclusion that 10Gbps is a big enough pipe for your needs and you don’t need to aggregate those pipes via teaming, you might be better off with cheaper 10Gbps switches and Windows Server 2012 NIC teaming in switch independent mode in an active-active configuration. You optimize the 10Gbps port count as well. It’s cheap, it reduces complexity and it doesn’t stop you from leveraging SMB Multichannel/RDMA.
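If you go down that path, a quick sanity check that RDMA is exposed and that SMB Multichannel is actually building multiple connections never hurts. Nothing below is specific to my setup:

# Check which adapters expose RDMA and what SMB Multichannel is doing with them.
Get-NetAdapterRdma
Get-SmbMultichannelConnection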

So right now I’m either in favor of switch independent 10Gbps networking or I go all out for a vPC (Virtual Port Channel a la CISCO) or VLT (Virtual Link Trunking a la Force10 – DELL) setup and forgo stacking altogether. As said, if you’re willing and able to evacuate all the nodes on a stack/rack you can work around the drawback (in the picture below the colors in the racks indicate which nodes belong to the same cluster). That’s not always possible, and while it sounds like a great idea, I’m not convinced.

image

When the shit hits the fan … you need as little to worry about as possible. And yes, I know firmware upgrades are supposed to be easy and planned events. But then there is reality and sometimes it bites, especially when you cannot evacuate the workload until you’ve resolved a networking issue with a firmware upgrade. Choose your poison wisely.

Saying Goodbye To Old Hardware Responsibly

Last year we renewed our SAN storage and our backup systems. They had been serving us for 5 years and were truly end of life, as both technologies are functionally obsolete in the current era of virtualization and private clouds. The timing was fortunate, as we would have been limited in our Windows Server 2012, Hyper-V & disaster recovery plans if we had to keep them going for another couple of years.

Now any time you dispose of old hardware it’s a good idea to wipe the data securely to a decent standard such as DoD 5220.22-M. This holds true whether it’s a laptop, a printer or a storage system.

We did the following:

  • Un-initialize the SAN/VLS
  • Reinitialize the SAN/VLS
  • Un-initialize the SAN/VLS
  • Swap a lot of disks around between SAN/VLS and disk bays in a random fashion
  • Un-initialize the SAN/VLS
  • Create new (mirrored) LUNs, as large as possible.
  • Mount them to a host or hosts
  • Run the DoD grade disk wiping software against them (a toy sketch of the overwrite idea follows this list).
  • That process is completely automatic and goes faster than we were led to believe, so it was not really such a pain to do in the end. Just let it run for a week 24/7 and you’ll wipe a whole lot of data. There is no need to sit and watch progress counters.
  • Un-initialize the SAN/VLS
  • Have it removed by a certified company that assures proper disposal
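We used dedicated wiping software for the actual job, so don’t take the following as our tooling, but just to illustrate the multi-pass overwrite idea behind DoD 5220.22-M (zeros, then ones, then random data) here is a toy PowerShell sketch that fills a mounted wipe LUN via a file. The drive letter and file name are made up, and a real wipe works at the block level and verifies every pass:

# Toy illustration only: three overwrite passes (0x00, 0xFF, random) until the volume is full.
$target = 'W:\wipe.bin'                 # hypothetical drive letter of a mounted wipe LUN
$buffer = New-Object byte[] (4MB)
foreach ($pass in 1..3) {
    switch ($pass) {
        1 { [Array]::Clear($buffer, 0, $buffer.Length) }                        # all zeros
        2 { for ($i = 0; $i -lt $buffer.Length; $i++) { $buffer[$i] = 0xFF } }  # all ones
        3 { (New-Object System.Random).NextBytes($buffer) }                     # pseudo-random
    }
    $stream = [System.IO.File]::OpenWrite($target)
    try     { while ($true) { $stream.Write($buffer, 0, $buffer.Length) } }
    catch   { }                         # the write that fills the disk throws and ends the pass
    finally { $stream.Close() }
    Remove-Item $target                 # free the space again before the next pass
}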

We would have loved to take it to a shooting range and blast the hell out of those things but alas, that’s not very practical nor feasible. It would have been very therapeutic for the IT Ops guys who’ve been babysitting the ever faster failing VLS hardware over the last years.

Here are some pictures of the decommissioned systems. Below are the two old VLS backup systems, broken down and removed from the data center, awaiting disposal. It’s cheap commodity hardware with a reliability problem once it’s over 3 years old and way too expensive for what it is. Especially for scaling up and out later in the life cycle, it’s just madness. Not to mention that those things gave us more issues than the physical tape library (those still have a valid and viable role to play when used for the correct purposes). Anyway, I consider this to have been my biggest technology choice mistake ever. If you want to read more about that go to Why I’m No Fan Of Virtual Tape Libraries.

To see what replaced this with great success go to Disk to Disk Backup Solution with Windows Server 2012 & Commodity DELL Hardware – Part II

The old EVA 8000 SANs are awaiting removal in the junk yard area of the data center. They served us well and we’ve been early and loyal customers. But the platform was as dead as a dodo long before HP even wanted to admit to that. It took them quite a while to get the 3Par ready for the same market segment and I expect that cost them some sales. They’re ready today; they were not 12 to 24 months ago.

image

So they’ve been replaced with Compellent SANs. You can read some info on this in the previous blog posts Multi Site SAN Storage & Windows Server 2012 Hyper-V Efforts Under Way and Migration LUNs to your Compellent SAN.

The next years the storage wars will rage and the landscape will change a lot, but we’re out of the storm for now. We’ll leverage what we’ve got. One tip for all storage vendors: start listening to your SME customers a lot more than you do now and get the features they need into their hands. There are only so many big enterprises, so until we’re all 100% cloudified, don’t ignore us, as together we buy a lot of stuff too. Many SMEs are interested in more optimal & richer support for their Windows environments; if you can deliver that you’ll see your sales rise. Keep commodity components, keep the building blocks and form factors, but don’t use a cookie cutter to determine our needs or “sell” us needs we don’t have. Time to market & open communication are important here. We really do keep an eye on technologies, so it’s bad to come late to the party.

Hyper-V Cluster Node Pause & Drain fails – Live Migrations fail with “The requested operation cannot be completed because a resource has locked status”

One night I was doing some maintenance on a Hyper-V cluster and I wanted to pause and drain one of the nodes that was up next for some tender loving care. But I was greeted by some messages:

image

[Window Title]
Resource Status

[Main Instruction]
The requested operation cannot be completed because a resource has locked status.

[Content]
The requested operation cannot be completed because a resource has locked status.

[OK]

Strange, the cluster is up and running, none of the other nodes have issues and operationally all VMs are as happy as can be. So what’s up? Not too much in the error logs, except for one entry related to a backup. Aha… We fire up diskpart and see some extra LUNs mounted, and using “vssadmin list writers” we find:

Writer name: ‘Microsoft Hyper-V VSS Writer’
   Writer Id: {66841cd4-6ded-4f4b-8f17-fd23f8ddc3de}
   Writer Instance Id: {2fa6f9ba-b613-4740-9bf3-e01eb4320a01}
   State: [5] Waiting for completion
   Last error: Unexpected error
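As an aside, you don’t have to scroll through the full output to find this; filtering for the writer gets you there faster. A quick sketch:

# Show only the Hyper-V VSS writer and the four lines that follow it (Id, Instance, State, Last error).
vssadmin list writers | Select-String "Microsoft Hyper-V VSS Writer" -Context 0,4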

Bingo! Hello old “friend”, I know you! The Microsoft Hyper-V VSS Writer goes into an error state while hardware snapshots of the LUNs are being taken when partitions inside the virtual machines are almost or completely full. Take a look at this blog post on what causes this and how to fix it. As a result we can’t do live migrations anymore or pause/drain the node on which the hardware snapshots are being taken.

And yes, after fixing the disk space issue in the VM (an SDT had pumped the VM disks 99.999% full) the Hyper-V VSS writer gets out of the error state and the hardware provider can do its thing. After the snapshots had completed everything was fine and I could continue with my maintenance.
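For completeness, the pause/drain and resume I was after boils down to the snippet below once the writer is healthy again. The node name is just an example:

# Drain the roles off the node, do the maintenance, then resume and fail back.
Import-Module FailoverClusters
Suspend-ClusterNode -Name "HV-NODE-02" -Drain -Wait
# ... maintenance happens here ...
Resume-ClusterNode -Name "HV-NODE-02" -Failback Immediate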