Changing the Avi Load Balancer license tier and how to fix the validation errors

I've been playing around with vSphere with Tanzu for a while, both with and without NSX-T. Initially the no-NSX-T load balancer option was to use a pre-built HAProxy appliance provided by VMware. That worked, and still does, but when vSphere 7.0 U2 was released we got the option to use the NSX Advanced Load Balancer (ALB), formerly known as Avi Vantage, which in many ways is more powerful and also comes with a UI, something HAProxy doesn't have.

The NSX Advanced Load Balancer comes in different editions, and after deploying the Avi controller the platform will use an evaluation license of the Enterprise edition which has a lot more features available than the Essentials license that comes with Tanzu Basic and Standard.

Many of us seem to hold off on changing/adding the correct license as we're eager to test out the functionality, but in the case of the Avi Load Balancer this can lead to a lot more work than doing it up front!

TLDR: Change the Avi license to Essentials BEFORE enabling Workload Management/Tanzu, and (optionally) add a Default gateway in Avi so that traffic can get from your SEs back to your clients (if the clients are not connected to the frontend network).

Avi setup and the license tiers

I wrote a blog post about my Avi setup here, but what I didn't do in my initial testing (even though I mention it in my blog post...), and I suspect this also applies to others, was to change the licensing from the Enterprise eval/trial to Essentials.

Oftentimes we're so eager to test functionality, and as most products, including vSphere, come with an evaluation license, we push back on the licensing bit until we're forced to.

For a lab environment that's brought up and torn down within the eval period, we might not get notified at all, and that's why I first encountered this when helping a customer finalize their Tanzu production environment.

Tanzu Basic and Standard entitle the use of the Essentials edition

For those running Tanzu Basic and Standard licenses, the ALB edition included is Essentials, which has a few implications for the functionality of the Load Balancer and its configuration. For example, only the Legacy HA mode is available, the LB is L4 only, etc.

A list of the features available in the different editions can be found here

This leads to errors when we try to change the license after Workload Management/Tanzu has created SEs, Virtual Services etc., since the Controller starts out at the Enterprise license level, which has defaults that are not compliant with the Essentials edition. (Note that this isn't specific to Tanzu enablement; any configuration of Virtual Services etc. with the defaults of the Enterprise edition would lead to these problems.)

Initial evaluation license

Before looking at what we'll need to do to change the license tier after enabling Tanzu, let's look at how to do it beforehand, as instructed in the documentation.

Changing the license before Tanzu enablement

If we read the VMware documentation we're instructed to add the Essentials license after the initial configuration of the controller (i.e. before assigning a certificate, configuring the SE group and the Virtual/Frontend network).

As per the documentation we're instructed to upload the Essentials license.

The thing is that the Essentials license isn't something we add; it's already in the controller. What we need to do is change the license tier from the Enterprise edition to the Essentials edition.

To change the license, go to the Administration menu, select the Settings tab and the Licensing option. Click the cog icon in the heading and select the Essentials license.

Change license through UI

If we follow this we are almost done after the license change, but there's one important thing that is not covered in the documentation (at least I haven't found it), and that is the routing for the Service Engine(s) beyond the route for the Workload Management network.

If your clients do not sit on the Service Engine frontend network, the SE(s) might not know where to return traffic to the clients that are connecting.

In Enterprise there is an Auto Gateway setting which will take care of this, but that's not available in the Essentials edition. Therefore we need to add static routes not only for the Workload network (which is covered by the documentation), but also for the client networks.

In my case I've added a Default gateway by adding a Static Route of 0.0.0.0/0 through the gateway on the frontend network my SEs are connected to.

Default Gateway / Static route
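To make the "are my clients on the frontend network?" question a bit more concrete, here's a minimal Python sketch using only the standard ipaddress module. This is just an illustration and has nothing to do with the Avi API; the helper name needs_return_route and all the addresses are made up for the example.

```python
import ipaddress


def needs_return_route(client_ip: str, frontend_cidr: str) -> bool:
    """Return True when the client sits outside the SE frontend network.

    Without Auto Gateway (Essentials), an SE then needs a static or
    default route to know where to send the return traffic.
    """
    client = ipaddress.ip_address(client_ip)
    frontend = ipaddress.ip_network(frontend_cidr)
    return client not in frontend


# Hypothetical lab addressing, not taken from my environment
FRONTEND = "192.168.150.0/24"

print(needs_return_route("192.168.150.25", FRONTEND))  # False
print(needs_return_route("10.0.20.15", FRONTEND))      # True
```

If this returns True for any of your client networks, that's a hint that a static route (or a default gateway) is needed on the SEs.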

And that's it, we can now continue with configuring and preparing Avi for Workload Management/Tanzu with the correct license edition.

Changing the license after Tanzu enablement

So, what happens if we try to change the license after having configured and tested Tanzu?

License change error

The Config validation failed error message instructs us to run an API or a CLI command to validate and check the configuration. Let's run it through the CLI. This is also described in the Avi documentation

show configuration audit tier essentials

This is the result in my environment with a Tanzu cluster configured and integrated with the Avi Load Balancer. I've also created a Tanzu Kubernetes Cluster in a namespace called ns-13

License configuration audit

As we can see there are quite a few configurations that are not supported. The Service Engine group, Pool and Virtual Service configurations all need to be amended, as well as the SSL certificate.

Configuration changes

A BIG WARNING before we continue: I do not know if these changes affect other parts of the platform, be it Tanzu or Avi, so use at your own risk and take a backup of the current configuration before performing any changes.

Some of these can be changed in the UI while others need to be changed through the CLI.

Service Engine group changes

These settings can be changed from the Basic Settings of the Service Engine Group

Note that changes to the Service Engine group (like HA mode) are not allowed while there are Virtual Services running on it. To be able to make the change I had to disable the Virtual Service(s), change the settings, and re-enable the service(s).

Service Engine group changes

HA Mode

Field ServiceEngineGroup.ha_mode cannot have HA_MODE_SHARED as its value in ESSENTIALS license tier. Allowed value(s): HA_MODE_LEGACY_ACTIVE_STANDBY.

Set the HA mode to Active/Standby (Legacy HA)

HA monitoring on standby

Field ServiceEngineGroup.hm_on_standby cannot have True as its value in ESSENTIALS license tier. Allowed value: False.

Deselect the Health Monitoring on Standby SE(s)

Memory cache

Field ServiceEngineGroup.app_cache_percent cannot have 10 as its value in ESSENTIALS license tier. Allowed value: 0.

Set the Memory for Caching setting to 0 (defaults to 10%)

Pool changes

This setting can be changed from the individual Pool Advanced settings. Note that we need to change it for every Pool that is created

Pool changes

Ramp duration

Field Pool.connection_ramp_duration cannot have 10 as its value in ESSENTIALS license tier. Allowed value: 0.

Set the Connection Ramp to 0 (defaults to 10)

Virtual Service config change

Some of these settings must be done from the CLI/API.

Virtual Service changes

Auto gateway

Field VirtualService.enable_autogw cannot have True as its value in ESSENTIALS license tier. Allowed value: False

This setting can be changed from the Advanced settings of each Virtual Service. Note that we need to change it for every VS that is created

Deselect the Auto Gateway setting

Note that when we remove the Auto Gateway feature our virtual services might no longer be accessible!

Kubernetes environment not accessible from client

I struggled for a while to understand what had happened, since everything seemed to be up and running both in vCenter/Workload Mgmt and in Avi.

If we take a look at the tool tip on the setting we get a clue

Auto gateway feature tool tip

Turns out that when this is disabled, our SEs won't know where to send the return traffic to the clients trying to access our Kubernetes environment on the frontend network if the clients are not on the same network.

To fix this we need to add a static route for the SEs to use. In my case I've added a Default route which routes through the frontend GW, but I guess I could just as well have added specific routes for all applicable client networks.

Add default route

To configure a Default gateway, create a new Static route and enter 0.0.0.0/0 as the subnet with a corresponding next hop. The Default Gateway text was set by the controller itself. (This default gateway would also make the Static route for the Workload network unnecessary.)

Kubernetes environment accessible
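Why does the default gateway make the Workload network route unnecessary? Here's a small Python sketch of the reasoning: 0.0.0.0/0 contains every IPv4 subnet, so a more specific route adds nothing as long as it points at the same next hop. That "same next hop" condition is my assumption for this illustration, and the StaticRoute type, the helper name and the addresses are all hypothetical.

```python
import ipaddress
from typing import NamedTuple


class StaticRoute(NamedTuple):
    prefix: str    # destination subnet in CIDR notation
    next_hop: str  # gateway the SE forwards matching traffic to


def made_redundant_by(specific: StaticRoute, broader: StaticRoute) -> bool:
    """True when `specific` is covered by `broader` and uses the same next hop."""
    inner = ipaddress.ip_network(specific.prefix)
    outer = ipaddress.ip_network(broader.prefix)
    return inner.subnet_of(outer) and specific.next_hop == broader.next_hop


# Hypothetical addressing: both routes go through the frontend gateway
default_gw = StaticRoute("0.0.0.0/0", "192.168.150.1")
workload = StaticRoute("10.0.30.0/24", "192.168.150.1")

print(made_redundant_by(workload, default_gw))  # True
```

If the Workload network had to be reached through a different gateway than the default one, the check returns False and the specific route would still be needed.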

Learning log policy change

Field VirtualService.analytics_policy.learning_log_policy cannot be set in ESSENTIALS license tier

NOTE! This warning came only AFTER changing the Auto Gateway setting

So far, I've only been able to change this through the CLI (the API would also work I'd imagine)

From the shell navigate to the Virtual Service you need to change (tab completion is your friend here) with the configure virtualservice <vs-name> command

Here we get an output of the current configuration, and we can see that the Analytics policy has a learning log policy enabled.

Learning log policy enabled

As per the validation message the learning_log_policy cannot be set, so we'll remove it. The easiest way of doing this is to navigate into the analyticspolicy and run the no learning_log_policy command.

Remove learning log policy

With the learning log policy removed we run the save command twice: the first saves the analytics policy, and the second saves the Virtual Service itself. This will output the new configuration, and we can verify that the learning log policy is gone.

New VS configuration

Cert changes

So far, I've only been able to change this through the CLI (the API would also work I'd imagine)

OCSP config change

Field SSLKeyAndCertificate.ocsp_config cannot be set in ESSENTIALS license tier.

This setting we'll also change from the CLI so head over to the shell and navigate to the certificate with the configure SSLKeyAndCertificate <cert-name> command. This will list the current configuration

Current certificate configuration

Towards the end of the output we can see the current OCSP config

OCSP config

Again, we'll need to remove this config so we'll use the no ocsp_config command. This should output the new configuration, and at the top it should mention that the OCSP config has been removed

Remove OCSP config

Now, let's save the configuration. This should output the cert config and we can again verify that the OCSP config has been removed

Save certificate configuration

Verify config changes

Now with all of these changes done we can run a new audit of the configuration and hopefully it should pass so we can go ahead and change the license level

Configuration audit success
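As a recap of the audit messages quoted above, here's a minimal Python sketch that writes the Essentials constraints down as data and checks a flat settings dict against them. This is NOT how the controller does its audit and it doesn't talk to Avi at all; the field names and allowed values are copied from the validation messages in this post, while the function name, the dict layout and the placeholder values are my own assumptions.

```python
from typing import Any, Dict, List

# Allowed values, copied from the audit messages quoted in this post
ESSENTIALS_ALLOWED: Dict[str, Any] = {
    "ServiceEngineGroup.ha_mode": "HA_MODE_LEGACY_ACTIVE_STANDBY",
    "ServiceEngineGroup.hm_on_standby": False,
    "ServiceEngineGroup.app_cache_percent": 0,
    "Pool.connection_ramp_duration": 0,
    "VirtualService.enable_autogw": False,
}

# Fields that "cannot be set" at all in the Essentials tier
ESSENTIALS_MUST_BE_UNSET = (
    "VirtualService.analytics_policy.learning_log_policy",
    "SSLKeyAndCertificate.ocsp_config",
)


def audit_for_essentials(settings: Dict[str, Any]) -> List[str]:
    """Return one finding per setting that would block the tier change."""
    findings = []
    for field, allowed in ESSENTIALS_ALLOWED.items():
        if field in settings and settings[field] != allowed:
            findings.append(f"{field}: {settings[field]!r} not allowed, expected {allowed!r}")
    for field in ESSENTIALS_MUST_BE_UNSET:
        if settings.get(field) is not None:  # any non-None value counts as set
            findings.append(f"{field}: cannot be set")
    return findings


# The Enterprise defaults seen in my audit (nested configs are placeholders)
before = {
    "ServiceEngineGroup.ha_mode": "HA_MODE_SHARED",
    "ServiceEngineGroup.hm_on_standby": True,
    "ServiceEngineGroup.app_cache_percent": 10,
    "Pool.connection_ramp_duration": 10,
    "VirtualService.enable_autogw": True,
    "VirtualService.analytics_policy.learning_log_policy": {"placeholder": True},
    "SSLKeyAndCertificate.ocsp_config": {"placeholder": True},
}

for finding in audit_for_essentials(before):
    print(finding)
```

Having the constraints in one place like this makes for a handy checklist to walk through before attempting the license change, but the real source of truth is still the audit command on the controller.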

With a successful audit we can go ahead and change the license. This can be done in the UI, or we can perform it directly from the CLI from the configure systemconfiguration path with the default_license_tier essentials command

Change license tier from CLI

Don't forget to save the configuration

Verify license tier

And for reference we can verify the same in the UI

Verify license in UI

Does the Avi version matter?

For reference, I initially suspected that the issue with changing licenses had to do with the version of the Avi platform, because most of the documentation up until now has been based on 20.1.3 while I've been running 20.1.5.

I have tested on 20.1.3, 20.1.5 and 20.1.6, and all of these versions have issues with changing the license after Workload enablement.

Note that there might be slight differences in how to change the license between the versions and patch levels (earlier versions had to use the CLI). There are also things that cannot be done in the UI when changing to the Essentials license, e.g. some of the certificate settings, again with differences between the versions.

Summary

The short story here is to change the license edition before doing any configuration and enablement of Avi/Tanzu/Workload Management/etc. Oh, and RTFM! It is listed as a step early on in the documentation (albeit the steps were not entirely correct).

Although it's mentioned in the documentation, I do wish it was clearer, or at least that the consequences were listed. Changing the license after Tanzu enablement means downtime for your services, and IMO a license change shouldn't have to require that. At least there should be some backend automation for this, i.e. the controller knows what to change the settings to by default and asks the user if that should be done (with a warning of what will change) when changing the license.

Thanks for reading and reach out if you have any questions or comments

This page was modified on August 11, 2021: Fixed tags