Check vhid, password and virtual IP address Kemp LoadMaster

Introduction

Recently I was implementing a high available Kemp LoadMaster X15 system. I prepared everything, documented the switch and LM-X15 configuration, and created a VISIO to visualize it all. That, together with the migration and rollback scenario was all presented to the team lead and the engineer who was going to work on this with us. I told the team lead that all would go smoothly if my preparations were good and I did not make any mistakes in the configuration. Well, guess what, I made a mistake somewhere and had to solve a Kemp LoadMaster ad digest – md2=[31084da3…] md=[20dcd914…] – Check vhid, password and virtual IP address log entry.

Check vhid, password and virtual IP address

As, while all was working well, we saw the following entry inundate the system message file log:

<date> <LoadMasterHostName> ucarp[2193]: Bad digest – md2=[xxxxx…] md=[xxxxx…] – Check vhid, password and virtual IP address

Check vhid, password and virtual IP address
every second …

Wait a minute, as far as I know all was OK. The VHID was unique for the HA pair and we did not have duplicate IP addresses set anywhere on other network appliances. So what was this about?

Figuring out the cause

Well, we have a bond0 on eth0 and eth2 for the appliance management. We also have eth1 which is a special interface used for L1 health checks between the Loadmasters. We don’t use a direct link (different racks) so we configure them with an IP on a separate dedicated subnet. Then we have the bonds with the VLAN for the actual workloads via Virtual Services.

We have heartbeat health checks on bond0, eth1 and on at least one VLAN per bonds for the workloads.

Confirm that Promiscuous mode and PortFast are enabled. Check!
HA is configured for multicast traffic in our setup so we confirm that the switch allows multicast traffic. Check!

Make sure that switch configurations that block multicast traffic, such as ‘IGMP snooping’, are disabled on the switch/switch ports as needed. Check!

Now let’s look at possible causes and check our confguration:

So what else? The documentation states as possible other causes the following:

  1. There is another device on the network with the same HA Virtual ID. The LoadMasters in a HA pair should have the same HA Virtual ID. It is possible that a third device could be interfering with these units. As of LoadMaster firmware version 7.2.36, the LoadMaster selects a HA Virtual ID based on the shared IP address of the first configured interface (the last 8 bits). You can change the value to whatever number you want (in the range 1 – 255), or you can keep it at the value already selected. Check!
  2. An interface used for HA checks is receiving a packet from a different interface/appliance. If the LoadMaster has two interfaces connecting to the same switch, with Use for HA checks enabled, this can also cause these error messages. Disable the Use for HA checks option on one of the interfaces to confirm the issue. If confirmed, either leave the option disabled or move the interface to a separate switch.

I am sure there is no interference from another appliance. Check! As we had checked every other possible cause the line in red caught my attention. Could it be?

Time for some packet captures

So we took a TCP dump on bond0 and looked at it in Wireshark. You can make a TCP dump via debug options under System Log Files.

Check vhid, password and virtual IP address
Debug Options, once there find TCP dump.

Select your interface, click start, after 10 seconds or so click stop and download the dump

TCP dump

Do note that Wireshark identifies this as VRRP, but the LoadMaster uses CARP (open source) do set it to decode as CARP, that way you’ll see more interesting information in Info

No, not proprietary VRRP but CARP

Also filter on ip.dst == 244.0.0.18 (multicast address). What we get here is that on eth0 we see multicasts from eth1. That is the case described in the documentation. Aha!

Check vhid, password and virtual IP address
Aha, we see CARP multicasts from eth1 on eth0, that is what we call a clue!

So now what, do we need to move eth1 to another switch to solve this? Or disable the HA check? No, luckily not. Read on.

The fix for Check vhid, password and virtual IP address

No, I did not use one or more separate switches just to plug in the heartbeat HA interfaces on the LoadMasters. What I did is create a separate VLAN for the eth0 HA heartbeat uplink interfaces on the switches. This way I ensure that they are in a separate unicast group from the management interface uplinks on the switches

By selecting a different VLAN for the MGNT and Heartbeat interface uplink they are in different TV VLAN groups by default.

By default the Multicast TV VLAN Membership is per VLAN. The reason the actual workload interfaces did not cause an issue when we enabled HA checks is that these were trunk ports with a number of allowed VLANs, different from the management VLAN, which prevents this error being logged in the first place.

That this works was confirmed in the packet trace from the LM-X15 after making the change.

No more packets received from a different interface. Mission accomplished.

So that was it. The error was gone and we could move along with the project.

Conclusion

Well, I should have know as normally I do put those networks not just in a separate subnet but also make sure they are on different VLANs. This goes to show that no matter how experienced you are and how well you prepare you will still make mistakes. That’s normal and that’s OK, it means you are actually doing something. Key is how you deal with a mistake and that why I wrote this. To share how I found out the root cause of the issue and how I fixed it. Mistakes are a learning opportunity, use them as such. I know many organizations frown upon a mistake but really, these should grow up and don’t act this silly.

LoadMaster LMOS 7.2.52 firmware feature enhancements

LoadMaster LMOS 7.2.52 firmware feature enhancements

Let’s take a look at some of the recently released LoadMaster LMOS 7.2.52 firmware feature enhancements. You can read the release notes on their web site. Next to security updates there are two entries in the new features and change notices that caught my attention. The first is the new feature that delivers the Ability to use SNI in SubVS, as well as SNI-Hostname Pass-Through. Secondly, it is the enhancement that we can now configure Per-VS Health Check Settings. Both are very welcome as I have to deal with such scenarios or needs frequently. So let’s take a look.

Ability to use SNI in SubVS, as well as SNI-Hostname Pass-Through

The Server Name Indication (SNI) feature has been enhanced to support the following:

  • The ability to pass through the original hostname as the SNI hostname to the Real Server.
  • The ability to specify a different (manual) SNI hostname per SubVS. This is the same as the previous functionality to specify this on the parent Virtual Service (Reencryption SNI Hostname) but on the SubVS level with content switching.


These new features help make scenarios, where you may want to consolidate as many services as possible to the least amount of IP addresses, easier and less confusing to implement.

The Pass-through SNI hostname check box is available in the SSL Properties section of the Virtual Service modify screen. When this is enabled and when re-encrypting, the received SNI hostname is passed through as the SNI to be used to connect to the Real Server. If the Virtual Server has a Reencryption SNI Hostname set, this overrides the received SNI.

LoadMaster LMOS 7.2.52 firmware feature enhancements
The pass-through SNI hostname can be overridden at the VS level

It is also possible to set the re-encryption SNI hostname in a SubVS (in the Basic Properties section). If it is set in a SubVS, this overrides the parent Virtual Service value and/or the received SNI value.

LoadMaster LMOS 7.2.52 firmware feature enhancements
You can also define the Reencryption SNI Hostname at the subVS level

Per-VS Health Check Settings

Until now, health check settings were global-only, located on the Rule & Checking > Check Parameters UI page: Check Interval, Connect Timeout, and Retry Count

LoadMaster LMOS 7.2.52 firmware feature enhancements
We still have global Service Check Parameters, but the can be overridden now at the VS and subVS level

Now, we can also configure these settings on the Virtual Service Real Servers tab. This means we can tune the health check behavior for specific VSs and SubVSs. By default, real server health checks use the global settings. The VS or SubVS settings change as the global settings change. So the default behaves as in previous releases.

LoadMaster LMOS 7.2.52 firmware feature enhancements
You can mix: customize the check interval, use the defautl value for the timeout and leverage the global setting for the retry value.

Once you change a check parameter on a VS or SubVS level, however, those custom VS or SubVS settings will remain unchanged regardless of changes made to the global setting. The UI indicates whether the currently in-use value is the global value or is set to a custom value.

Conclusion

I am really happy with the enhancements that are introduced with FW 7.2.52. The two I highlighted above are the ones I really come up against in the future. Not having a re-encryption SNI hostname on the SubVS level was something we could workaround (see How to Re-Encrypt Multiple SNIs on the Same IP and Port with Kemp LoadMaster – PART 1 and How to Re-Encrypt Multiple SNIs on the same IP and port with a Kemp LoadMaster – PART 2). But having this feature on the subVS makes life a lot easier.

Being able to set the real server health check interval, timeout, and retry count helps us in those scenarios where we have services with different needs. These global setting where always a balancing act between all the services. So this capability is very welcome as well.

It is great to see Kemp Technologies their offerings evolve and improve. They have established themselves quite well over the years. It makes me happy that way back I chose them for their price/value and excellent support. I still remember the first HA pair (LM-2200) I ever deployed (2011) for a real time reference GPS position system. That was the beginning of a very succesful journey building solutions with LoadMasters using both physical and virtual appliances.

Kemp LoadMaster template import issue: Command serverinit needs 1 parameter(s)

Introduction

The error Kemp LoadMaster template import issue: Command serverinit needs 1 parameter(s) was encounter during a migration project I was involved in recnetly. Recently. The job was to migrate the virtual services of an aging Kemp LoadMaster HA solution (LM-2200, 32 bit) to newer and more capable versions (running 7.2.43, X64 bit). Even more important, migrate to a version that is still in support. The LM-2200 series are like many older ones End of Life (EOL). Kemp still delivers important fixes for critical security issues like this year (7.1.35.5 and 7.1.35.6) but that’s it in long term support (LTS). This is not bad, these have been supported for a very long time but the aging 32-bit hardware has reached the point where replacing the is the only right thing to do.

With Kemp we can select new hardware, virtual machines or appliances in the cloud. So on-premises, hybrid and public cloud needs can be taken care off with the same familiar load balancer / Application delivery controller (ADC).

One technique to migrate Kemp workloads is to export the virtual services to templates and use these to configure the new solution. It is during that process we ran into an error: Kemp LoadMaster template import issue: Command serverinit needs 1 parameter(s)

Kemp LoadMaster template import issueCommand serverinit needs 1 parameter(s)

One of the applications is a mission critical one that has many different virtual services. FTP/FTPS/HTTP/HTTPS and also a bunch of TCP/UDP connectivity over ports that are very application/industry specific. A dozen virtual services, all on the same IP address with different ports and configuration needs.

We exported all of them to templates which we then used to recreate them on the replacement ADC. While doing so we used a different IP address for these newly created virtual services for testing purposes. This test IP address is changesd to the original production IP when ready to make the move and after disabling the old virtual services. That also allows for a quick exit / roll back if needed.

The process of creating a new virtual service requires a little work to be done still such as defining a gateway, SSL certificates, adding real servers, … but the bulk of the work is done for you.

But with certain templates we got an error: Syntax error detected in template file: Command serverinit needs 1 parameter(s)

clip_image001

When looking at the template in a text editor we see the following:

clip_image002

Clearly “servinit” does not have a parameter set. So, it probably needs one, to specify which type of the Server Initiating Protocol needs to be set for our generic service.

clip_image003

Finding a solution

We need to find out what parameter goes with our setting and add that the template. That value is found is easy enough. Via educated trial and error. On a test virtual service on the current version ADC we set the value for the Server Initiating Protocols to our value (other server initiating) and then export this virtual service to a template. We look at the template and note the number. That way we map what server initiating protocol corresponds to what value. In our case other server initiating is “3”.

clip_image004

We edit our exported templates to have the correct parameter value set that we find out this way and try to import it again.

clip_image005

That’s all that we needed to do to get these exported templates to be imported with no further issues.

clip_image006

clip_image008

All that’s left to do is to finish configuring the virtual service and continue our migration.

Conclusion

Don’t panic. Many problems you encounter have a solution, workaround or fix. Maybe this will help someone out there. Take care and until next time.3

In place upgrade of RD Gateway farm nodes to Windows Server 2016 removes the Loopback adapter for UDP load balancing

Here’s a quick heads up to anyone who’s involved in upgrading existing Windows Server 2012 (R2) RD Gateway farms to Windows Server 2016.

In my recent experiences the in place upgrade (VMs) works rather well. Just make sure the netlogon service is set to automatic (a know issue and a fix is coming) after you upgrade and install all updates. Also make sure that you don’t have this issue

Windows Time Service settings are not preserved during an in-place upgrade to Windows Server 2016 or Windows 10 Version 1607

There is however one networks specific issue specific you’ll need to deal with when leveraging UDP with a load balancer via Direct Server Return.

When you have a RD Gateway farm you load balance it with a (preferably high available) load balancer like a Kemp Loadmaster. I have described this in these blogs/videos Load balancing Hyper-V Workloads With High To Continuous Availability With a KEMP Loadmaster and Quick Demo Video Of Site Failover With KEMP Loadmaster Global Balancing

What you also do is load balance both HTTPS (TCP, port 443) and UDP (port 3391). For UDP we use Direct Server Return ((DSR) as described in my blog post Load balancing UDP for a RD Gateway farm with a KEMP Loadmaster. This requires a properly configured loopback adapter.

image

During the in place upgrade to Windows Server 2016 this loopback adapter is removed form the nodes. So you need to add it back just a described in my original blog post. Normally it will find the settings for it in the registry but it’s bets you check it all out as I’ve found that the loopback adapter did have “Register this connection”s address in DNS” enabled as well as NETBIOS over TCP/IP. So, per my blog post, check it all to make sure. Other than that, after installing all the Windows Server 2016 updates all works smoothly after an in place upgrade.

Hope this helps someone out there!