Sometimes a DRAC goes BOINK
Sometimes a DRAC (Dell Remote Access Controller) gives you trouble, often because of a lingering process or some other hiccup. You can try a reboot, but that doesn’t always fix the issue. You can go into the BIOS and cancel any running System Services. A “confused” DRAC can also be fixed by shutting down the server and cutting power for 5 to 10 minutes. That’s good to know as a last resort, but it’s often not feasible unless you have a maintenance window and are on premises.
You can also try a local or remote reset of the DRAC via OpenManage (OMSA), using racadm. See RACADM Command Line Interface for DRAC for more information on how and when to use this tool. racadm can be used for a lot of remote configuration and administration tasks, and one of those is a “soft reset”, which is basically a power cycle (a reboot) of the DRAC itself. Don’t worry, your server stays up.
Local: racadm racreset soft
Remote: racadm -r <ip address> -u <username> -p <password> racreset soft
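If you do this often, the two variants can be wrapped in one small helper. This is just a minimal sketch, assuming a POSIX shell on a machine where racadm is installed. The function name drac_soft_reset and the RACADM override variable are made up for illustration and are not part of racadm.

```shell
#!/bin/sh
# Hypothetical helper: soft-reset a DRAC either locally or remotely.
# RACADM is an illustrative override for the path to the racadm binary.
RACADM="${RACADM:-racadm}"

drac_soft_reset() {
    if [ "$#" -eq 0 ]; then
        # Local reset: run this on the server that hosts the DRAC.
        "$RACADM" racreset soft
    elif [ "$#" -eq 3 ]; then
        # Remote reset: $1 = DRAC IP address, $2 = username, $3 = password.
        "$RACADM" -r "$1" -u "$2" -p "$3" racreset soft
    else
        echo "usage: drac_soft_reset [<ip address> <username> <password>]" >&2
        return 2
    fi
}
```

Called without arguments it does the local reset. Called with an IP address, username and password it does the remote one.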
Real life example
I was doing routine maintenance on 4 Hyper-V clusters, and as part of that DUPs (Dell Update Packages) were being deployed to upgrade some firmware. This can be automated nicely via Cluster Aware Updating, and the logging option will help you pinpoint any issues. See https://blog.workinghardinit.work/2013/01/09/logging-cluster-aware-updating-hotfix-plug-in-installations-to-a-file-share/ for more information on this.
That is how we found that the DRAC upgrade was not succeeding on two nodes.
On one node, it was because the DUP could not access the Virtual USB Device:
Software application name: iDRAC6
Package version: 1.95
Installed version: 1.92
Executing update…
Device does not impact TPM measurements.
Device: iDRAC6, Application: iDRAC6
Failed to access Virtual USB Device
==================> Update Result <==================
Update was not applied
================================================
Exit code = 1 (Failure)
On the other node, it was because of some other lingering DRAC process:
iDRAC is currently unable to process this request because of another task.
Please attempt one or more of the following steps to cancel the pending iDRAC task:
1) Wait 30 minutes and retry your request.
2) Reboot the system; Press F10; select ‘Exit and Reboot’ from Unified Server Configurator, and retry your request.
3) Reboot the system; Press Ctrl-E; select ‘System Services’. Then change ‘Cancel System Services’ to YES, which will close the pending task;
Then press Enter at the warning message. Press ESC twice and select ‘Save Changes and Exit’ and retry your request.
==================> Update Result <==================
Update was not applied
================================================
Exit code = 1 (Failure)
The messages give some nice suggestions, but racreset is another good one to have in your toolkit. It’s fast and effective.
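If you log the DUP output to a file, you can check for the two failures quoted above before reaching for racreset. This is a rough sketch, assuming a POSIX shell with grep. The function name dup_needs_racreset is hypothetical, and the log file is whatever file you saved the DUP output to.

```shell
#!/bin/sh
# Hypothetical helper: succeeds (exit status 0) when the given DUP log
# contains one of the two failure messages quoted in this post.
dup_needs_racreset() {
    log_file="$1"
    grep -q \
        -e 'Failed to access Virtual USB Device' \
        -e 'iDRAC is currently unable to process this request because of another task' \
        "$log_file"
}
```

A non-zero exit status only means neither message was found. The update may still have failed for another reason.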
Run racadm racreset soft
Wait a couple of minutes and then rerun the DUP or the items in SUU (Server Update Utility) that failed. With some luck this will succeed now.
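The reset, wait and retry steps can be strung together as below. It’s only a sketch, assuming a POSIX shell and a local racadm. The names reset_then_retry, RACADM and DRAC_WAIT_SECONDS are invented for this example, and the 180 second default is just my reading of “a couple of minutes”, not a documented value.

```shell
#!/bin/sh
# Hypothetical helper: soft-reset the local DRAC, wait, then rerun
# whatever command failed earlier (for example the DUP).
RACADM="${RACADM:-racadm}"                        # illustrative override for the racadm path
DRAC_WAIT_SECONDS="${DRAC_WAIT_SECONDS:-180}"     # assumed wait time, adjust to taste

reset_then_retry() {
    # Stop here if the reset itself fails; retrying would be pointless.
    "$RACADM" racreset soft || return 1
    # Give the DRAC time to come back up.
    sleep "$DRAC_WAIT_SECONDS"
    # "$@" is the command to retry, passed in by the caller.
    "$@"
}
```

You would call it with the failed update command as its arguments. The exit status is that of the retried command, so you can see whether the second attempt succeeded.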