DELL Server DRAC Card Soft Reset With Racadm

Sometimes a DRAC goes BOINK

Sometimes a DRAC (Dell Remote Access Controller) can give you issues, often because of a lingering process or some other hiccup. You can try a reboot, but that doesn’t always fix the issue. You can go into the BIOS and cancel any running System Services. A “confused” DRAC card can also be fixed by shutting down the server and cutting power for 5 to 10 minutes. That’s good to know as a last resort, but it is often not feasible outside of a maintenance window when you’re on premises.

You can also try a local or a remote reset of the DRAC card using racadm, which is available with OpenManage (OMSA). See RACADM Command Line Interface for DRAC for more information on how and when to use this tool. Racadm can be used for a lot of remote configuration and administration tasks, and one of those is a “soft reset”, which is basically a power cycle (aka reboot) of the DRAC card itself. Don’t worry, your server stays up.

Local: racadm racreset soft

Remote: racadm -r <ip address> -u <username> -p <password> racreset soft
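If you want to script this, the two command lines above can be assembled by a small helper. This is only a minimal sketch: the function name, the example IP address and the credentials are made up for illustration, and it only builds the argument list using the flags shown above (-r, -u, -p) without running anything.

```python
def build_racreset_command(host=None, username=None, password=None):
    """Return the racadm argument list for a soft reset of the DRAC.

    Without a host the local form is returned; with a host the remote
    form (-r/-u/-p) is returned. Hypothetical helper, not part of racadm.
    """
    command = ["racadm"]
    if host is not None:
        if not (username and password):
            raise ValueError("a remote reset needs a username and a password")
        command += ["-r", host, "-u", username, "-p", password]
    command += ["racreset", "soft"]
    return command


# Placeholder values for illustration only.
local_cmd = build_racreset_command()
remote_cmd = build_racreset_command("192.0.2.10", "someuser", "somepassword")
print(" ".join(local_cmd))
```

On a machine where racadm is installed, the returned list could be handed to subprocess.run(command, check=True). Keeping the arguments in a list avoids quoting problems with passwords that contain special characters.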

Real life example

I was doing routine maintenance on 4 Hyper-V clusters and as part of that DUPs (Dell Update Packages) were being deployed to upgrade some firmware. This can be automated nicely via Cluster Aware Updating, and the logging option will help you pinpoint any issues. See https://blog.workinghardinit.work/2013/01/09/logging-cluster-aware-updating-hotfix-plug-in-installations-to-a-file-share/ for more information on this.

That is how we found that the DRAC upgrade was not succeeding on two nodes.

On one node it was due to the DUP not being able to access the Virtual USB Device:

Software application name: iDRAC6
   Package version: 1.95
   Installed version: 1.92

Executing update…

Device does not impact TPM measurements.

Device: iDRAC6, Application: iDRAC6
  Failed to access Virtual USB Device

==================> Update Result <==================

Update was not applied

================================================

Exit code = 1 (Failure)

and the other was because there was some other lingering DRAC process.

 iDRAC is currently unable to process this request because of another task.
  Please attempt one or more of the following steps to cancel the pending iDRAC task:
  1) Wait 30 minutes and retry your request.
  2) Reboot the system; Press F10; select ‘Exit and Reboot’ from Unified Server Configurator, and retry your request.
  3) Reboot the system; Press Ctrl-E; select ‘System Services’. Then change ‘Cancel System Services’ to YES, which will close the pending task;
      Then press Enter at the warning message. Press ESC twice and select ‘Save Changes and Exit’ and retry your request.

==================> Update Result<==================

Update was not applied

================================================
Exit code = 1 (Failure)

They give some nice suggestions, but racreset is another nice one to have in your toolkit. It’s fast and effective.
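When you have logs from a lot of nodes, it helps to sort the failed DUP runs by cause. The sketch below assumes the log text looks like the two excerpts above; the function name and the labels it returns are my own invention, and real DUP logs may well contain other failure messages that it does not know about.

```python
# Message fragments taken from the two log excerpts above, mapped to
# made-up labels. Extend this table for other messages you run into.
KNOWN_FAILURES = {
    "Failed to access Virtual USB Device": "virtual_usb",
    "unable to process this request because of another task": "pending_task",
}


def classify_dup_log(log_text):
    """Return a label for a failed DUP run, or None if no failure is seen."""
    if "Update was not applied" not in log_text:
        return None
    for fragment, label in KNOWN_FAILURES.items():
        if fragment in log_text:
            return label
    return "unknown"


usb_log = (
    "Device: iDRAC6, Application: iDRAC6\n"
    "  Failed to access Virtual USB Device\n"
    "Update was not applied\n"
    "Exit code = 1 (Failure)\n"
)
print(classify_dup_log(usb_log))
```

Both labels are candidates for a racreset followed by a retry of the DUP; anything classified as "unknown" deserves a manual look first.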

Run racadm racreset soft

Wait for a couple of minutes and then run the DUP or the items in SUU that failed. With some luck this will succeed now.

Great Hardware Support Equals Fast Windows 2012 R2 Implementation

I love it when a plan comes together

We adopted Windows 2012 right after it went RTM in August 2012. Today we’re already running Windows 2012 R2 and ready to step up the pace. If you are a VAR/ISV that does not have fast & good support for Windows Server 2012 R2, consider this your notice. You can’t lead from behind. Get your act together and take an example from Altaro. Small, sure, but good & fast. How do we get our act together so fast? Fast? Yes, but it does take time and effort.

As it turns out, we’re pretty well off with the DELL hardware stack. The generation 11 and 12 servers are supported by DELL and are listed in the Windows Server Catalog for Windows Server 2012 R2.

For more information on Dell Server inbox driver support see: Windows Server 2012 R2 RTM Inbox Driver Support on Dell PowerEdge Servers. By the way, I can testify that we’ve run Windows Server 2012 R2 successfully on 9th Generation hardware (PowerEdge 1950/2950).

We’ve been running tests since Windows 2012 R2 Preview on R710/R720 and it has been a blast. We’ve kept them up to date with the latest firmware & drivers via SUU. And for our Intel X520 and Mellanox ConnectX-3 cards we’ve had rapid support as well.

So what more could you want? Well, support from your storage array vendor, I would think. I’m happy to report that Storage Center 6.4 has been out since October 8th and it supports Windows Server 2012 R2 (see Dell Compellent Adds MLC SSD Tier – Bests 15K HDDs on Price and Performance). Mind you, on a lazy Sunday afternoon 2 quick e-mails to CoPilot got me the answer that Storage Center 6.3.10 also supports Windows Server 2012 R2. Sweet!

And that’s not just DELL, the Dell Compellent Storage Center 6.4 is fully Windows Server 2012 R2 logo certified! That’s what you want to see from your vendor. Fast & excellent support.

Here’s the entire DELL hardware lineup with Windows Server 2012 R2 support. Happy upgrading & implementation! If you have Software Assurance you’re set to reap the benefits of that investment today!

To all my employers / clients: you see now, I told you so. Now, I have a thing in common with Col. Hannibal Smith, I love it when a plan comes together.

I know some of you think that all the testing, breaking and wrecking of Preview bits, RTM & GA versions we do looks like chaos. Especially when you visually add the test server & switch configurations. But that’s what it looks like to YOU. To the initiated this is a well-executed plan, dropping all assumptions, to establish what works & will hold up. The result is that we’re ready today and by extension, so are you.

Fixing A Little Quirk In Dell Compellent Replay Manager

If you’re running a DELL Compellent SAN you’re probably familiar with Replay Manager. It’s Compellent’s solution for taking VSS-based (and as such application consistent) snapshots.

When you’re running Replay Manager you might run into the following issue when trying to access a host.

Every time you access a host for the first time after opening Replay Manager you’ll be prompted for your password, even if you select Remember my password. You don’t need to retype it, so that’s fine, but you do need to click through the prompt.

In the system log you’ll see the below error logged.

Log Name:      System
Source:        Microsoft-Windows-Security-Kerberos
Date:          7/08/2013 9:55:43
Event ID:      4
Task Category: None
Level:         Error
Keywords:      Classic
User:          N/A
Computer:      replayserver.test.lab
Description:
The Kerberos client received a KRB_AP_ERR_MODIFIED error from the server replaymanagerservice. The target name used was HTTP/myhost.test.lab. This indicates that the target server failed to decrypt the ticket provided by the client. This can occur when the target server principal name (SPN) is registered on an account other than the account the target service is using. Ensure that the target SPN is only registered on the account used by the server. This error can also happen if the target service account password is different than what is configured on the Kerberos Key Distribution Center for that target service. Ensure that the service on the server and the KDC are both configured to use the same password. If the server name is not fully qualified, and the target domain (TEST.LAB) is different from the client domain (TEST.LAB), check if there are identically named server accounts in these two domains, or use the fully-qualified name to identify the server.

Well, this is a rather well-known issue in the Microsoft world. Take a look at IIS 7+ Kerberos authentication failure: KRB_AP_ERR_MODIFIED and browse to the possible causes & solutions. You’ll find this situation right in there. So what we do is execute the following command to register the correct SPN for the host or hosts on the Replay Manager service account:

SetSPN -a HTTP/myhost.test.lab TEST\replaymanagerservice

Note that you need to run this from an elevated command prompt using an account with sufficient permissions in AD. After that you’ll no longer have to click through the username/password prompt and the error is gone.

You can verify whether the SPN for your hosts exists on your Replay Manager service account by running:

SetSPN -l TEST\replaymanagerservice
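With more than a handful of hosts, checking the SetSPN -l listing by eye gets tedious. The sketch below assumes the listing is a header line followed by one indented SPN per line, and that it has already been captured as text. The helper names, the sample distinguished name and the second host name are hypothetical.

```python
def spn_is_registered(setspn_output, spn):
    """Check whether an SPN appears in the text of a SetSPN -l listing."""
    wanted = spn.strip().lower()
    # SPNs are compared case-insensitively, one entry per line.
    return any(line.strip().lower() == wanted
               for line in setspn_output.splitlines())


def missing_spns(setspn_output, hosts, service="HTTP"):
    """Return the service/host SPNs that are not in the listing yet."""
    return [f"{service}/{host}" for host in hosts
            if not spn_is_registered(setspn_output, f"{service}/{host}")]


# Assumed shape of the listing; the distinguished name is a placeholder.
SAMPLE_OUTPUT = """Registered ServicePrincipalNames for CN=replaymanagerservice,CN=Users,DC=test,DC=lab:
\tHTTP/myhost.test.lab
"""

print(missing_spns(SAMPLE_OUTPUT, ["myhost.test.lab", "otherhost.test.lab"]))
```

Every SPN the second function returns still needs a SetSPN -a run against the Replay Manager service account.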

If this is the biggest issue you’ll ever have with a hardware snapshot service & hardware provider you know you’ve got a good solution.

DELL PowerEdge VRTX Has Potential Beyond ROBO As a Scale Out File Server Building Block

The PowerEdge VRTX

I went to have a chat with Dell at TechEd 2013 Europe in Madrid. The VRTX was launched during DELL Enterprise Forum in early June 2013. This concept packs a punch and I encourage you to go look at the VRTX (pronounced “Vertex”) in more detail. It’s a very quiet setup which can be hooked up to standard power, and it’s pretty energy efficient when you consider the power of the VRTX. The entire setup comes at an attractive price point.

It can serve perfectly for Remote Office / Branch Office (ROBO) deployments but has many more use cases as it’s very versatile. In my humble opinion DELL’s latest form factor could be used for some very nice scale out scenarios. It’s near perfect for a Windows Server 2012 Scale Out File Server (SOFS) building block. While smaller ones can be built using 1Gbps, the future just needs 10Gbps networking.

10Gbps, RDMA (iWarp, RoCE)

That’s the first thing I missed and the first thing I was told would arrive very soon. So I’m very happy with that. With sufficient 10Gbps ports to the servers and iWarp or RoCE RDMA capable NICs (they’re cheap enough compared to ordinary 10Gbps cards not to have to leave that capability out) we have all we need for this to function as a powerful building block for the Scale Out File Server model with Windows Server 2012 (R2), where the CSV network becomes the storage network leveraging redirected IO. For this concept, look at this picture from a presentation a year ago. SMB 3.0 Multichannel and RDMA make this possible.

While back then I drew the SOFS building blocks out of R720 & MD1200 hardware, the VRTX could fit in there perfectly!

Storage Options

Today DELL uses their implementation of Clustered PCI RAID for shared storage, which has been supported since Windows Server 2012. This is great. For the moment it’s a non-redundant setup, but a redundant one is in the works I’m told. Nice, but think positive: redirected IO (block level) over SMB 3.0 would save our storage IO even today. It would be a very wise and great addition to the capabilities of this building block to add the option & support for Storage Spaces. This would make the Scale Out File Server concept shine with the VRTX.

Why? Well, it would give us the following benefits in the storage layer, and the VHDX format in Hyper-V can take advantage of them:

  • Deduplication
  • Thin Provisioning
  • Management Delegation
  • UNMAP
  • Write Cache
  • Full benefits of ReFS on storage spaces for data protection
  • Automatic Data Tiering with commodity SSD (ever cheaper & bigger) and SAS disks perhaps even the Near Line ones (less power & cooling, great capacity)
  • Potential for JBOD redundancy

Look at that feature set people, in box, delivered by Windows. Sweet! Combine this with 10Gbps networking and DELL not only has a SOFS building block in their portfolio, it also offers significant storage features in this package. I for one would like them to do so and not miss out on this opportunity to offer even more capabilities in an attractively priced package. Dell could be the very first OEM to grab this new market opportunity by supporting the scale out approach and outmaneuver their competitors.

Anything Else?

Combine such a building block as described above with their unmatched logistical force for distribution and support, and this will be a hit and a prime choice for Windows shops. They already have the 10Gbps networking gear & features (DCB) in the PowerConnect 81XX & Force10 S4810 switches. It could be an unbeatable price / capabilities / feature combo that would sell very well.

If we go for SOFS we might need more storage in a single building block with a 4 node cluster. Extensibility would be nice for this. And “more” is not just about capacity: I still need to work out the IOPS the available configurations can give us.
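For a first rough cut at that IOPS question, a common back-of-the-envelope approach is to divide the raw disk IOPS by the cost of the read/write mix, where each write costs a number of backend IOs depending on the resiliency type. The sketch below uses that rule of thumb. The disk count, the per-disk IOPS and the write fraction are placeholders and not measurements of any VRTX configuration, and caching or tiering is ignored completely.

```python
def frontend_iops(disk_count, iops_per_disk, write_fraction, write_penalty):
    """Estimate host-visible IOPS from raw disk IOPS.

    Every read costs one backend IO and every write costs `write_penalty`
    backend IOs (commonly quoted as 2 for a mirror, 4 for single parity).
    """
    if not 0.0 <= write_fraction <= 1.0:
        raise ValueError("write_fraction must be between 0 and 1")
    backend_iops = disk_count * iops_per_disk
    read_fraction = 1.0 - write_fraction
    return backend_iops / (read_fraction + write_fraction * write_penalty)


# Placeholder numbers for illustration only.
estimate = frontend_iops(disk_count=24, iops_per_disk=150,
                         write_fraction=0.3, write_penalty=2)
print(round(estimate))
```

It is a crude model, but it is enough to compare configurations and to see how quickly a write-heavy workload eats into the available spindles.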