Dell iDRAC 6 Remote Console Connection Failed

I recently had the honor of fixing a really annoying issue with the iDRAC on rather old DELL hardware: R710 servers that are still pulling their weight. They have been upgraded to the latest firmware, naturally, and DELL gives anyone access to those updates without the need for a support contract (happy users/customers). You can perfectly well configure Java site exceptions and use Firefox or Chrome to connect to it (IE is a different story: you can connect but the view is messed up). Anyway, the browser isn’t the big issue. The problem was that the Dell iDRAC 6 remote console connection failed consistently at the very last moment with “Connection Failed”.

image

image

Note: are you nuts?

Yes, I like 25/50/100Gbps RDMA, S2D, All Flash etc. I do live the vanguard life on the bleeding edge, but part of that is finding solutions that fit the environment. In this case, they have multiple spare servers and extra disks on top of the ones they use in the lab or even in production. So even when a server or a component fails, they can use those to fix it. They have the hands-on and savvy staff members to do that. No problem. This is not an organization driven by fear of risk and responsibility but by results and effective TCO/ROI. They know very well what they can handle and what not. On top of that, they know very well which of the IT sector’s sales and marketing promises/predictions are FUD and which are reality. This means they can make decisions that optimize for their needs and deliver real results.

Leveraging old hardware does mean that sometimes you’ll run into silly but annoying issues, like older DRAC cards not playing well with modern client operating systems, browsers and recent Java versions.

Most of the tricks to get those to work together can be found online, but sometimes even those fail. First of all, make sure all network requirements are in order (ports, firewall etc.) and on top of that:

  • Upgrade the DRAC firmware to the latest version (v2.85)
  • Add the DRAC IP to the Java Exception List.
  • Change the Java Network Setting from Browser to Direct Connect
  • Hack the Java config files
  • Disable Encrypted Video on the DRAC
  • Reset the DRAC
  • On top of this you can run an older version of the browser and Java, but at a certain point this becomes a silly option. You see, at a given moment the entire stack has moved ahead and one trick like running an old version of Java won’t do it anymore, and keeping a VM around that’s at a 10 year old tech/version level is a pain.
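
Before going down the list it helps to rule out the network. Here is a minimal Python sketch for that, nothing more. The port numbers are my assumption (443 for the web interface, 5900 for the remote console and virtual media), so check what your own DRAC is configured to use. The function names are made up for this example.

```python
import socket

# Assumed ports: 443 for the iDRAC web interface and 5900 for the remote
# console / virtual media. Verify these against your own DRAC settings.
ASSUMED_DRAC_PORTS = (443, 5900)


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_ports(host, ports=ASSUMED_DRAC_PORTS, timeout=3.0):
    """Map each port to True (reachable) or False (blocked or closed)."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

Calling check_ports with the IP address of your DRAC gives you a quick yes/no per port. A False only tells you the TCP connection failed from where you ran it, not why.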

The missing piece for me: generate & upload SHA256 certs

So let me share with you the extra step that got the remote console of the DELL R710 iDRAC to work with the most recent version of Java, Windows 10 and the latest and greatest Firefox browser at the time of writing.

The trick that finally did it is to generate a CSR on the DRAC while you are connected to it. You see, many people never upload their own certs and if they did, it might have been many years ago. Those old SHA1 certs are frowned upon by modern browsers and Java.

image

image

Open the CSR file, copy the content and submit it to a PKI you have or a free one online like getacert.com. Just fill out some random info in the request and you’ll get a SHA256 cert for download immediately that is “valid” for a couple of months. That is enough for testing or getting out of a pickle. Your own corporate CA will do better for long term needs.
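
If you want to double check that the cert you got back really is SHA256 before uploading it, you can dump it as text with openssl x509 -noout -text -in <your cert file> and look at the “Signature Algorithm” line. The Python sketch below just parses that text output. The function names are my own and it assumes the usual openssl wording of that line.

```python
import re


def signature_algorithm(openssl_text):
    """Return the first 'Signature Algorithm' value in the text, or None."""
    match = re.search(r"Signature Algorithm:\s*(\S+)", openssl_text)
    return match.group(1) if match else None


def uses_weak_digest(algorithm):
    """True when the algorithm name contains md5 or sha1 (but not sha1xx)."""
    return re.search(r"(md5|sha1)(?!\d)", algorithm.lower()) is not None
```

A value like sha1WithRSAEncryption is the kind modern browsers and Java frown upon, while sha256WithRSAEncryption is what you are after.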

image

On top of that you’ll need to reset the DRAC card and give it a few minutes.

image

Reconnect to the DRAC. After that we could, without fail, connect to the remote console on all R710 servers where before we kept getting the dreaded “Connection Failed” error.

That’s it! Good luck.

Kemp LoadMaster OEM Servers and Dell Firmware Updates with Lifecycle Controller

When you buy a DELL OEM based Kemp Technologies LoadMaster you might wonder who will handle the hardware updates to the server. Well Dell handles all OEM updates via its usual options and as with all LoadMasters Kemp Technologies handles the firmware update of the LoadMaster image.

KempLM320

Hardware wise, DELL and Kemp are two companies that excel in support. If you can find the solution that meets your needs it’s a great choice. Combine them and it makes for a great experience. Let me share a small issue I ran into while updating Kemp LoadMaster OEM servers with Dell firmware updates via the Lifecycle Controller.

For a set of DELL R320 LoadMasters in HA that I was upgrading, I not only wanted to move to 7.1-Patch28b-BARE-METAL.bin but I also wanted to take the opportunity to bring the firmware of those servers to the latest versions, as it had been a while (since they had been delivered on site).

There is no OS that runs on those servers, as they are OEM hardware based appliances for the LoadMaster image. No worries, these DELL servers come with DRAC & Lifecycle Controllers, so you can leverage those to do the firmware updates from a Server Update Utility ISO locally, via virtual media, or over the network, via FTP or a network share. FTP is either the DELL FTP site or an internal one.

image

image

Now, as I had just downloaded the latest SUU at the time (SUU-32_15.09.200.74.ISO; for now you need to use the 32 bit installers with the Lifecycle Controller), I decided to just mount it via the virtual media, boot to the Lifecycle Controller and update using local media.

image

image

But I got stuck  …

It doesn’t throw an error; it just returns to the start point and nothing can fix it. Not even adding “/repository” to the file path. You can type the name of an individual DUP (32 bit!) and that works. Scanning the entire repository, however, wouldn’t move beyond step 2, “Enter Access Details”.

Scanning for an individual DUP worked, but leaving the file path blank to find all eligible updates did not return any results, so I could not advance. I was able to solve this by leveraging the DRAC’s ability to update its own firmware to the most recent version using the firmware image file. I got mine by extracting the DUP and taking the image file from the payload sub folder.
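
If you have a bunch of extracted DUPs lying around, a few lines of Python can fish the image file out of the payload sub folder for you. This is only a sketch. The firmimg.* file name pattern is my assumption about how the image is named, so adjust it to whatever you actually find in your payload folder. The function name is made up for this example.

```python
from pathlib import Path


def find_firmware_images(extracted_dup_dir, pattern="firmimg.*"):
    """List files matching pattern in the 'payload' sub folder, sorted by name.

    The default pattern is an assumption; returns an empty list when the
    extracted folder has no 'payload' sub folder.
    """
    payload = Path(extracted_dup_dir) / "payload"
    if not payload.is_dir():
        return []
    return sorted(p for p in payload.glob(pattern) if p.is_file())
```

The file it returns is the one you upload on the firmware update page of the DRAC.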

image

You can read on how to upgrade DRAC / Lifecycle Controller via the DRAC here.

image

When you’ve done that, give the system a reboot for good measure and try again. I have found that in all my cases this fixes the issue. My take on this is that older firmware can’t handle more recent SUU repositories. So give it a try if you run into this and you’ll be well on your way to getting your firmware updated. If you need help with this process, DELL has excellent documentation in “Lifecycle Controller Platform Update/Firmware Update in Dell PowerEdge 12th Generation Servers”.

image

image

image

The end result is a fully updated DELL Server / Kemp Loadmaster. Mission accomplished. All this can be done from the comfort of your home office. A win-win for both you and your customer/employer. Think about it, it would be a shame to miss out on all the benefits you get from working in the cloud when your on premises part of a hybrid infrastructure forces you to get in a car and drive to a data center 70 km away. Especially at 21:21 at night.

Remote Access to the KEMP R320 LoadMaster (DELL) via DRAC Adds Value

If you have a virtual Loadmaster you gain a capability you do not have with an appliance: console access. You can have lost all network connectivity to the Loadmaster but you can still gain access over the Hyper-V console connection to the virtual machine. Virtual appliances are not the only or best choice for all environments and needs. When evaluating your options you should consider going for a bare metal solution like the DELL R320.

image

These are basically DELL servers and as such have a Dell Remote Access Controller (DRAC) that allows for remote access independently of the production network. Great for when you need to resolve an issue where you cannot connect to the unit anymore and you’re not near the LoadMaster. It also allows for remote shutdown and start capabilities, mounting images for updates, … all the good stuff. Basically it offers all the benefits a DELL server with a DRAC has to offer.

image

That means I have an independent way into my load balancer to deal with problems when I can no longer connect to it via the network interface or even when it is shut down. As we normally telecommute as much as possible, whether from the offices, on the road or at home, this is a great feature to have. It sure beats driving to your data center at zero dark thirty, if that is even a feasible option.

I know that normally you put in two units for high availability, but that will not cover all scenarios, and if you have a data center filled with DELL PowerEdge servers that have DRAC and you cannot restore services because you cannot get to your load balancers, that’s a bummer. It’s for that same reason we have IP managed PDUs and OOB capabilities on the switches. The idea is to have options and be able to restore services remotely as much as possible. This is faster, cheaper and easier than going over there, so reducing that occurrence as much as possible is good. Knowledge today flies across the planet a lot faster than a human being can.

DELL Server DRAC Card Soft Reset With Racadm

Sometimes a DRAC goes BOINK

Sometimes a DRAC (Dell Remote Access Controller) can give you issues. Sometimes it’s some lingering process or another hiccup that causes this. You can try a reboot but that doesn’t always fix the issue. You can go into the BIOS and cancel any running System Services. A “confused” DRAC card can also be fixed by shutting down the server and cutting power for 5 to 10 minutes. That’s good to know as a last resort but not very feasible a lot of the time, bar a maintenance window when you’re on premises.

You can also try to do a local or a remote reset of the DRAC card via OpenManage (OMSA) or racadm. See RACADM Command Line Interface for DRAC for more information on how and when to use this tool. racadm can be used for a lot of remote configuration and administration tasks, and one of those is a “soft reset”, basically a power cycle, aka reboot, of the DRAC card itself. Don’t worry, your server stays up.

Local: racadm racreset soft

Remote: racadm -r <ip address> -u <username> -p <password> racreset soft
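
If you need to do this on more than a couple of servers you can wrap the two commands above in a bit of Python. A minimal sketch, assuming racadm is installed and on the PATH of the machine you run it from. The function names are my own, the racadm arguments are exactly the ones shown above.

```python
import subprocess


def racreset_command(host=None, username=None, password=None):
    """Build the argument list for a local or remote 'racreset soft'."""
    cmd = ["racadm"]
    if host is not None:
        # The remote form needs credentials, just like on the command line.
        if not (username and password):
            raise ValueError("remote racadm needs a username and a password")
        cmd += ["-r", host, "-u", username, "-p", password]
    return cmd + ["racreset", "soft"]


def soft_reset(host=None, username=None, password=None, timeout=120):
    """Run the reset and return racadm's exit code (needs racadm on PATH)."""
    cmd = racreset_command(host, username, password)
    return subprocess.run(cmd, timeout=timeout, check=False).returncode
```

Keep in mind that a password passed on the command line is visible in the process list while the command runs, so treat this as a convenience for a trusted admin workstation.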

Real life example

I was doing routine maintenance on 4 Hyper-V clusters and as part of that DUPs (Dell update packages) were being deployed to upgrade some firmware. This can be automated nicely via Cluster Aware Updating and the logging option will help you pinpoint the issue. See https://blog.workinghardinit.work/2013/01/09/logging-cluster-aware-updating-hotfix-plug-in-installations-to-a-file-share/ for more information on this.

That is how we found that the DRAC upgrade was not succeeding on two nodes.

On one it was due to the DUP not being able to access the Virtual USB Device:

Software application name: iDRAC6
   Package version: 1.95
   Installed version: 1.92

Executing update…

Device does not impact TPM measurements.

Device: iDRAC6, Application: iDRAC6
  Failed to access Virtual USB Device

==================> Update Result <==================

Update was not applied

================================================

Exit code = 1 (Failure)

and the other was because there was some other lingering DRAC process.

 iDRAC is currently unable to process this request because of another task.
  Please attempt one or more of the following steps to cancel the pending iDRAC task:
  1) Wait 30 minutes and retry your request.
  2) Reboot the system; Press F10; select ‘Exit and Reboot’ from Unified Server Configurator, and retry your request.
  3) Reboot the system; Press Ctrl-E; select ‘System Services’. Then change ‘Cancel System Services’ to YES, which will close the pending task;
      Then press Enter at the warning message. Press ESC twice and select ‘Save Changes and Exit’ and retry your request.

==================> Update Result<==================

Update was not applied

================================================
Exit code = 1 (Failure)
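
When you log DUP runs to a file share for a whole cluster, a small script can sort the failures for you. The Python sketch below looks for the two messages from the logs above and for the “Exit code” line. The labels and function names are made up for this example, and treating exit code 0 as success is my assumption.

```python
import re

# Message fragments copied from the two logs above, mapped to a short label.
KNOWN_FAILURES = {
    "Failed to access Virtual USB Device": "virtual-usb",
    "unable to process this request because of another task": "pending-task",
}


def exit_code(log_text):
    """Return the integer after 'Exit code =' or None if the line is missing."""
    match = re.search(r"Exit code\s*=\s*(-?\d+)", log_text)
    return int(match.group(1)) if match else None


def classify(log_text):
    """Label a known failure; otherwise 'ok' for exit code 0, else 'unknown'."""
    for fragment, label in KNOWN_FAILURES.items():
        if fragment in log_text:
            return label
    return "ok" if exit_code(log_text) == 0 else "unknown"
```

Both known labels are cases where a soft reset of the DRAC is worth a try before you reach for a reboot.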

These messages give some nice suggestions, but racreset is another nice one to have in your toolkit. It’s fast and effective.

Run racadm racreset soft

image

Wait for a couple of minutes and then run the DUP or the items in SUU that failed. With some luck this will succeed now.
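
The whole routine (update fails, reset the DRAC, wait a bit, run the update again) fits in one small Python function. This is a sketch under assumptions: reset and run_update are callables you supply yourself, run_update returns an exit code where 0 means success, and the three minute wait is just my guess at “a couple of minutes”.

```python
import time


def reset_then_retry(reset, run_update, wait_seconds=180, attempts=2,
                     sleep=time.sleep):
    """Run the update; on failure reset the DRAC, wait, and try again.

    Stops as soon as run_update returns 0 or after 'attempts' runs in total.
    Returns the exit code of the last run.
    """
    code = run_update()
    for _ in range(attempts - 1):
        if code == 0:
            break
        reset()              # e.g. a soft reset of the DRAC
        sleep(wait_seconds)  # give the DRAC time to come back
        code = run_update()
    return code
```

The sleep parameter is only there so you can swap in a dummy when trying the logic out without actually waiting.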

image