Octavia operator
Octavia is an OpenStack service that provides Load Balancing as a Service (LBaaS) functionality for OpenStack virtual machines. Octavia manages Amphorae, which are individual virtual machines in Nova that implement the actual load balancers.
Octavia architecture in a nutshell
The following is a short summary of the basic architecture of Octavia as it is integrated by YAOOK.
The control plane services of Octavia consist of:
Octavia API: API service to interact with Octavia
Worker: instructed by the API service to manage the Amphorae (e.g. initial creation)
Health Manager: monitors the Amphorae by receiving heartbeats from them and initiates Amphora failover if necessary
Housekeeping Manager: cleans up database entries and manages Amphora certificate rotation
Note that both the worker and the health manager need to connect to Nova, Glance and Neutron to build Amphora VM instances. In the case of the worker, this happens during initial Amphora creation based on the API requests. In the case of the health manager, this happens when an Amphora is detected as offline and a failover instance is built by the health manager.
The workers connect to the Amphorae via both SSH and a dedicated service port to configure them. After bootup, each Amphora starts sending continuous heartbeats to a specific UDP port exposed by the health managers.
Please see the integration architecture section for more details on the network structure and inner workings of the Octavia integration in YAOOK.
Deployment
Pre-Requirements
The following resources need to exist before Octavia can be deployed:
Nova flavor for Amphora VMs
Glance image for Amphora VMs
Octavia management TLS certificates
Note
OpenStack resources such as flavors and images must be created within the “service” project where the Octavia service will later create the Amphora VMs.
Preparing the Amphora flavor
To create the flavor, the openstack flavor create command may be used.
For example:
openstack flavor create \
--id octavia-amphora \
--vcpus 1 \
--ram 1024 \
--disk 2 \
--project service \
--private \
amphora-loadbalancer
Note
The operator will default to the flavor ID octavia-amphora when generating the configuration for the Octavia services.
In case a different ID is desired, see the CR configuration section below on how to override the default for amp_flavor_id.
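To double-check that the flavor was created with the expected ID (using the example ID from above), it can be displayed:
# Show the Amphora flavor by its ID
openstack flavor show octavia-amphora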
Preparing the Amphora image
A prebuilt Amphora image must be uploaded to Glance. Prebuilt images may be taken from OSISM for example: https://github.com/osism/openstack-octavia-amphora-image
To transfer this image to Glance, the openstack image create command may be used.
For example:
openstack image create \
--disk-format qcow2 \
--container-format bare \
--project service \
--private \
--tag amphora \
--file octavia-amphora-haproxy-2024.1.qcow2 \
amphora-x64-haproxy
Note
The operator will automatically configure the Octavia services to look up the image in the “service” project.
In case a different project is desired, see the CR configuration section below on how to override the default for amp_image_owner_id.
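To confirm that the upload succeeded and that the image is owned by the “service” project, the image can be inspected:
# The owner field should match the project ID of the "service" project
openstack image show amphora-x64-haproxy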
Preparing the TLS certificates
Octavia requires certificate authorities and a set of certificates for the TLS authentication used by the Octavia control plane services when managing Amphorae. For more details on the TLS authentication architecture of Octavia, refer to the official documentation.
The steps required to provision the certificates are automated by the script provided in tools/create_octavia_certs.sh.
Warning
The scripted workflow uses certificate validity period values hardcoded in the script. It does not offer any kind of certificate rotation or expiration handling and should not be considered a production-grade way of provisioning and managing the certificates!
Caution
The scripted workflow creates local CAs and private keys on the system it is executed on. It must only be executed in trusted environments and the resulting directory structure should be preserved adequately and securely.
If opting for the scripted certificate provisioning, first adjust the settings in tools/octavia_openssl.cnf as necessary.
The script may then be executed on a system that has sufficient access to the Kubernetes API of the YAOOK cluster like this:
tools/create_octavia_certs.sh -n $YAOOK_OP_NAMESPACE
The script will establish local CAs based on the settings in tools/octavia_openssl.cnf and issue the required certificates.
At the end it will automatically upload the certificates as Kubernetes Secrets.
In case the certificates are not created using the script, they have to be provided as Kubernetes Secrets manually like this:
kubectl create secret -n $YAOOK_OP_NAMESPACE generic octavia-server-ca-key \
--from-file=server_ca.key.pem
kubectl create secret -n $YAOOK_OP_NAMESPACE generic octavia-server-ca-cert \
--from-file=server_ca.cert.pem
kubectl create secret -n $YAOOK_OP_NAMESPACE generic octavia-client-ca-cert \
--from-file=client_ca.cert.pem
kubectl create secret -n $YAOOK_OP_NAMESPACE generic octavia-client-cert-and-key \
--from-file=client.cert-and-key.pem
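Regardless of whether the script or the manual commands were used, the presence of all four Secrets can be verified afterwards:
# List the Octavia certificate Secrets created above
kubectl get secrets -n $YAOOK_OP_NAMESPACE | grep '^octavia-'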
Deploying Octavia with YAOOK
Assigning scheduling labels to Kubernetes nodes
Note
The total number of worker replicas may be smaller than the number of health managers, but each worker must run on a node where a health manager is present, due to the dependency on the node-local OVS port.
For this reason, health managers and workers share the special octavia.yaook.cloud/octavia-managers scheduling label and are excluded from the generic octavia.yaook.cloud/octavia-any-service label!
Each node with the shared octavia.yaook.cloud/octavia-managers label will receive exactly one health manager instance, and a subset (or all) of these nodes will receive a worker instance, depending on the worker replica count.
First, assign the control plane label (for Octavia API and housekeeping services):
kubectl label node $NODE octavia.yaook.cloud/octavia-any-service=true
Next, assign the managers label (for Octavia health managers and workers):
kubectl label node $NODE octavia.yaook.cloud/octavia-managers=true
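The assigned labels can be reviewed afterwards:
# Show which nodes carry the Octavia scheduling labels
kubectl get nodes -l octavia.yaook.cloud/octavia-any-service=true
kubectl get nodes -l octavia.yaook.cloud/octavia-managers=true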
Configuring and deploying the OctaviaDeployment CR
First, make sure that the Custom Resource Definitions (CRDs) are up-to-date:
make helm-charts
helm upgrade --install --namespace $YAOOK_OP_NAMESPACE yaook-crds yaook.cloud/crds
Next, deploy the Octavia operator:
YAOOK_OP_VERSION=...
make helm-charts
helm upgrade \
--install \
--namespace $YAOOK_OP_NAMESPACE \
--set operator.pythonOptimize=false \
--set operator.image.repository=registry.yaook.cloud/yaook/operator \
--set operator.image.tag=$YAOOK_OP_VERSION \
"octavia-operator" \
./yaook/helm_builder/Charts/octavia-operator/
Set YAOOK_OP_VERSION to the desired image release tag or feature branch name of the operator.
If using a feature branch, replace the repository URL with registry.yaook.cloud/yaook/operator-test.
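After the chart has been installed, a quick sanity check is to confirm that the operator Pod reaches the Running state (assuming the Pod name contains the release name octavia-operator):
# The octavia-operator Pod should reach the Running state
kubectl -n $YAOOK_OP_NAMESPACE get pods | grep octavia-operator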
Define the OctaviaDeployment configuration
Create an OctaviaDeployment manifest based on the example located at docs/examples/octavia.yaml.
There are two parameters of the Octavia configuration that have defaults but may need adjustments in specific environments. These are:
controller_worker.amp_image_owner_id: ID of the owner of the Amphora Glance image, default = project ID of the “service” project.
controller_worker.amp_flavor_id: ID of the Amphora flavor, default = octavia-amphora.
Overrides for these parameters can be set within the spec.octaviaConfig.controller_worker section of the OctaviaDeployment manifest.
If the Amphora Glance images are managed in a different OpenStack project, add an override for amp_image_owner_id.
If the Amphora Nova flavor has a different ID, add an override for amp_flavor_id.
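As a sketch, such overrides could look like the following (the IDs shown are placeholders and must be replaced with the values of your environment):
apiVersion: yaook.cloud/v1
kind: OctaviaDeployment
metadata:
  ...
spec:
  ...
  octaviaConfig:
    controller_worker:
      # Placeholder values; replace with the actual "service" project ID and flavor ID
      amp_image_owner_id: 1234567890abcdef1234567890abcdef
      amp_flavor_id: octavia-amphora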
Adjust other values of the OctaviaDeployment spec as necessary.
Caution
Do not set controller_worker.amp_boot_network_list or controller_worker.amp_secgroup_list in the spec.octaviaConfig section of the OctaviaDeployment manifest.
The creation of the corresponding network and security groups is handled by the operator automatically and will result in proper configuration entries.
Any manual values will conflict with the resources created by the operator and will result in the OctaviaDeployment failing to reconcile.
Finally, apply the manifest:
kubectl -n $YAOOK_OP_NAMESPACE apply -f docs/examples/octavia.yaml
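After applying the manifest, the reconciliation progress can be followed by inspecting the custom resource and the Pods in the namespace:
# Watch the OctaviaDeployment resource and the Pods created for it
kubectl -n $YAOOK_OP_NAMESPACE get octaviadeployments
kubectl -n $YAOOK_OP_NAMESPACE get pods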
Increasing the security group quota of the service project
The Octavia operator will manage all Amphorae within the “service” project. This leads to the creation of at least one security group per load balancer within that project. Since this covers the load balancers of all tenants, the potential number of resulting security groups usually exceeds any default quotas set on projects. For this reason, the security group quota of the “service” project must be increased:
openstack quota set --secgroups 1000000 service
The exact quota limit number should be chosen appropriately based on the size of the infrastructure and expected tenant count.
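The new limit can be verified afterwards (the exact field name in the output may vary between client versions):
# Display the current quotas of the "service" project and check the security group value
openstack quota show service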
Registering roles
Users that interact with the Octavia API need special roles to do so. These roles are not part of the default role set in YAOOK and need to be registered once:
openstack role create load-balancer_member
openstack role create load-balancer_admin
Octavia offers additional roles for more fine-grained access control. Refer to the official role documentation for more details and create additional roles as necessary.
Caution
The load-balancer_admin role has global scope and allows accessing Octavia resources owned by others.
It is only intended for cloud administrators and should never be assigned to tenant users.
Assign roles to users as required:
openstack role add --project $PROJECT_ID --user $USER_ID load-balancer_member
Registering an SSH keypair for Amphora VMs access (optional)
For troubleshooting purposes, a central SSH public key may be registered in OpenStack that will allow access to all Amphora VMs. This key must be assigned to the Octavia service user in OpenStack.
Caution
Generating and registering a keypair for Amphora VMs will result in an SSH key that has administrative access to all Amphora VMs across tenants. The private key of the keypair should be generated and stored in a secure environment with strict access control.
The OctaviaDeployment CR will create an Octavia service user in Keystone. Its exact name will vary from installation to installation and must first be retrieved:
NAMESPACE=... # <- set namespace here
kubectl -n $NAMESPACE get secret \
-l state.yaook.cloud/component=credentials,\
state.yaook.cloud/creator-plural=octaviadeployments,\
state.yaook.cloud/parent-plural=keystoneusers \
-o jsonpath='{.items[0].data.OS_USERNAME}' | base64 -d; echo
This will return a Keystone username like this:
octavia-ab123.yaook.cluster.local
(the name suffix, ab123 in this example, will vary)
If not already existent, generate a new SSH key pair. For example, to generate a 4096 bit RSA key:
ssh-keygen -t rsa -b 4096 -f octavia_amp_id_rsa
Then, with admin credentials, register the public key of the keypair in OpenStack to the Octavia service user:
openstack keypair create \
--public-key octavia_amp_id_rsa.pub \
--user octavia-ab123.yaook.cluster.local \
octavia_amphora_ssh
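Optionally, the registration can be verified for the service user (this uses the example username from above and requires admin credentials as well as a sufficiently recent Nova API):
# List keypairs owned by the Octavia service user
openstack keypair list --user octavia-ab123.yaook.cluster.local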
Finally, edit your OctaviaDeployment manifest and set the controller_worker.amp_ssh_key_name configuration entry accordingly:
apiVersion: yaook.cloud/v1
kind: OctaviaDeployment
metadata:
...
spec:
...
octaviaConfig:
controller_worker:
amp_ssh_key_name: octavia_amphora_ssh
Update your OctaviaDeployment by reapplying the changed manifest. Any newly created Amphora VM will now use the SSH public key of this keypair.
Removing Octavia
To remove Octavia completely, first delete the OctaviaDeployment CR:
kubectl delete -n $YAOOK_OP_NAMESPACE OctaviaDeployment octavia
Note
When the Octavia operator cleans up all Octavia resources, it will attempt to remove the static Amphora management Neutron network and security groups as well.
If any leftover resources are still in use, these will fail to delete.
In such cases, the Octavia operator will instead rename the affected OpenStack resources and append an “-orphaned” suffix to them.
This prevents accidental reuse by future OctaviaDeployment instances.
Use openstack security group list and openstack network list to identify these resources and delete them manually.
Remaining PersistentVolumeClaims of Octavia’s AMQP server and database need to be deleted as well, as they would otherwise conflict with any future OctaviaDeployment instances:
kubectl delete -n $YAOOK_OP_NAMESPACE persistentvolumeclaims \
data-octavia-octavia-mq-mq-0 data-octavia-octavia-db-0
After all resources have been cleaned up, the Octavia operator may be removed as well:
helm uninstall octavia-operator
Finally, the Octavia CA certificates and keys should be removed:
kubectl delete secret \
octavia-server-ca-key octavia-server-ca-cert \
octavia-client-cert-and-key octavia-client-ca-cert
Example Workflow in Octavia
The following describes an example workflow for creating and using Octavia load balancers from a user’s perspective.
First, create a subnet with a router in OpenStack to which the load balancer should be attached:
EXT_PROVIDER_NETWORK=...
openstack network create l2-network
openstack subnet create --subnet-range 192.168.4.0/24 --network l2-network --dhcp l3-network
openstack router create --external-gateway $EXT_PROVIDER_NETWORK test-router
openstack router add subnet test-router l3-network
Next, create the load balancer with listener and pool. In the following example, it is used to load-balance the SSH port 22:
LB_NAME="my-loadbalancer-1"
LB_LISTENER_NAME="my-lb-listener-1"
LB_POOL_NAME="my_pool"
SUBNET_ID=$(openstack subnet show l3-network -f value -c id)
openstack loadbalancer create --name $LB_NAME --vip-subnet-id $SUBNET_ID
openstack loadbalancer list
After the load balancer reaches ACTIVE status, create a listener and pool:
openstack loadbalancer listener create --name $LB_LISTENER_NAME \
--protocol TCP --protocol-port 22 $LB_NAME
openstack loadbalancer pool create --name $LB_POOL_NAME \
--lb-algorithm ROUND_ROBIN --listener $LB_LISTENER_NAME --protocol TCP
Now, create the virtual machines that should be load-balanced:
openstack server create --image cirros --network l2-network --flavor S test-cirros1
openstack server create --image cirros --network l2-network --flavor S test-cirros2
openstack server list
For the load-balanced ports to be reachable, create and configure a corresponding security group and assign it to the virtual machines:
openstack security group create --description "SSH" ssh
openstack security group rule create --proto tcp --dst-port 22 ssh
openstack server add security group test-cirros1 ssh
openstack server add security group test-cirros2 ssh
For the next step first retrieve the IP addresses of the VMs:
openstack server show test-cirros1 -f value -c addresses
openstack server show test-cirros2 -f value -c addresses
IP_ADDRESS_VM1=... # <- insert the 1st VM's IP address here
IP_ADDRESS_VM2=... # <- insert the 2nd VM's IP address here
Then add the VMs to the pool of the load balancer while specifying their respective IP address:
openstack loadbalancer member create --subnet-id $SUBNET_ID \
--address $IP_ADDRESS_VM1 --protocol-port 22 $LB_POOL_NAME
openstack loadbalancer member create --subnet-id $SUBNET_ID \
--address $IP_ADDRESS_VM2 --protocol-port 22 $LB_POOL_NAME
Verify the created members:
openstack loadbalancer member list $LB_POOL_NAME
Add a health monitor to the load balancer pool:
openstack loadbalancer healthmonitor create \
--delay 5 --timeout 3 --max-retries 3 --type TCP $LB_POOL_NAME
Continuously check the load balancer member list.
The operating_status should soon switch from NO_MONITOR to ONLINE for both virtual machines:
openstack loadbalancer member list $LB_POOL_NAME
Finally, create a Floating IP and get the port ID from the load balancer to attach the Floating IP to:
openstack floating ip create $EXT_PROVIDER_NETWORK
FLOATING_IP=... # <- insert the Floating IP address here
PORT_ID=$(openstack loadbalancer show $LB_NAME -f value -c vip_port_id)
openstack floating ip set --port $PORT_ID $FLOATING_IP
The load balancer is now reachable via the Floating IP through the external provider network and will load-balance connections to the two backend VMs.
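To verify the setup, connect to the load-balanced port via the Floating IP; with the ROUND_ROBIN algorithm, consecutive connections are answered by alternating backend VMs (the default cirros login user is assumed here):
# Check that the load-balanced port is reachable via the Floating IP
nc -vz $FLOATING_IP 22
# Log in via the load balancer; repeated logins should land on alternating backend VMs
ssh cirros@$FLOATING_IP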
YAOOK integration architecture
The following sections explain the Octavia network integration in more detail.
Amphora management network integration
YAOOK integrates the management network required for connectivity between the Octavia health managers, workers and the Amphora VMs using a dedicated Neutron network and OVS internal ports on the Kubernetes nodes. The Octavia operator of YAOOK will provision a static Neutron network within the service project for this purpose. Additionally, it prepares two security groups: one for the Amphorae and one for the health managers.
For each Kubernetes node that carries the scheduling label for the Octavia managers (health managers and workers), the operator will create a Neutron port in the Amphora management network.
The resulting port details (MAC & IP address) of each Neutron port are then passed to a node-specific ConfiguredDaemonSet instance of the health manager Pod. The init container of the health manager Pod will connect to the node-local OVS service and plug an OVS internal port on the integration bridge with the port address. This materializes the Neutron port connected to the Amphora management Neutron network as an OVS port on the node. The health manager will bind to this port for its heartbeat listener. This enables the Amphorae (which are also connected to the Amphora management Neutron network) to reach the health manager. Furthermore, the Octavia workers will be scheduled on the same nodes as the health managers so they can use the OVS internal port to reach the Amphora VMs for configuration.
Each health manager receives a node-specific bind_ip entry in its octavia.conf.
This is also why a ConfiguredDaemonSet is used: each health manager configuration file is unique.
Additionally, the list of all health manager addresses is collected and stored as controller_ip_port_list in the octavia.conf of all health managers and workers.
This list is passed to every Amphora VM on initialization and used as the target address list for the heartbeats.
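As an illustrative sketch only (all addresses and the port are placeholders, not values taken from an actual deployment), the health manager section of such a node-specific octavia.conf could look like this:
[health_manager]
# Node-specific address of the OVS internal port this health manager binds to (placeholder)
bind_ip = 172.31.0.11
# All health manager endpoints; passed to the Amphorae as heartbeat targets (placeholders)
controller_ip_port_list = 172.31.0.11:5555, 172.31.0.12:5555, 172.31.0.13:5555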
Neutron port provisioning through Custom Resource
Since the PerNode loops used in the YAOOK code to provision node-specific instanced state strictly require Kubernetes-API-managed resources, the Neutron ports must be managed as Kubernetes Custom Resources (CRs). That is why the Octavia operator implementation introduces the OctaviaNeutronPort CR to manage Neutron ports via the OpenStack API. This is similar to how the keystone-resources operator manages KeystoneUser CRs, for example.
The Octavia operator is both the producer and the consumer of OctaviaNeutronPort CRs. It creates the CRs based on the applicable Kubernetes nodes, then processes them in a dedicated reconcile loop, creates the ports via the Neutron API and writes the resulting address data to the status fields of the corresponding OctaviaNeutronPort CRs. Finally, it consumes the reconciled OctaviaNeutronPort CRs in its main OctaviaDeployment reconcile loop and continues to provision the Octavia services requiring the ports on the nodes based on the ConfigMap data.
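For troubleshooting, the intermediate resources can be listed directly (assuming the conventional lower-case plural resource name):
# List the node-specific Neutron port CRs created by the operator
kubectl -n $YAOOK_OP_NAMESPACE get octavianeutronports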