Updates

To fix bugs and keep a Yaook environment secure, we need to update it regularly. Updates come in two variants:

  1. Software/Package updates to fix issues and vulnerabilities

  2. OpenStack and Kubernetes upgrades to stay on top of new and supported releases

Software Updates for the OS / Kubernetes

System packages, kernels and Kubernetes minor versions are baked directly into the base OS image that is deployed on each node. When an update for the base image is available, we rebuild the nodes of the Kubernetes cluster one at a time. To do this without downtime, we rely on live migration and Kubernetes-native failover.

The following update process is orchestrated by the !TBD! Controller:

  1. mark the node as required for evacuation

  2. live-migrate any existing VMs away from this node (if applicable)

  3. migrate any L3 and DHCP agents away from this node (if applicable)

  4. migrate Ceph data away from this node (if applicable)

  5. drain the node using k8s native methods

  6. delete the node from k8s (at this point all ConfiguredDaemonSets are updated to remove this node)

  7. shut down the server

  8. run a normal deploy workflow for this server
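The sequence above can be sketched as an ordered list of steps that is run for one node at a time. This is only an illustration of the ordering: the step names and the handler callables are hypothetical placeholders, not Yaook APIs, and the sketch assumes that the steps marked "if applicable" may simply be absent for a given node.

```python
from typing import Callable, Dict, Iterable, List

# Step names mirror the numbered list above. They are illustrative only.
REBUILD_STEPS: List[str] = [
    "mark_for_evacuation",
    "live_migrate_vms",
    "migrate_network_agents",
    "migrate_ceph_data",
    "drain_node",
    "delete_node",
    "shut_down_server",
    "redeploy_server",
]

# Steps listed as "if applicable" may be skipped when no handler is given.
OPTIONAL_STEPS = {"live_migrate_vms", "migrate_network_agents", "migrate_ceph_data"}

Handler = Callable[[str], None]


def rebuild_node(node: str, handlers: Dict[str, Handler]) -> List[str]:
    """Run the steps in order for one node and return the executed steps.

    An exception raised by a handler aborts the remaining steps.
    """
    executed: List[str] = []
    for step in REBUILD_STEPS:
        handler = handlers.get(step)
        if handler is None:
            if step in OPTIONAL_STEPS:
                continue
            raise KeyError(f"no handler for required step {step!r}")
        handler(node)
        executed.append(step)
    return executed


def rebuild_cluster(nodes: Iterable[str], handlers: Dict[str, Handler]) -> None:
    """Rebuild nodes strictly one after another, never in parallel."""
    for node in nodes:
        rebuild_node(node, handlers)
```

Processing the nodes sequentially is what allows live migration and failover to move workloads onto the remaining nodes before the next node is taken down.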

Software Updates for Container images

Whenever the software in a container image or one of its dependencies changes, a new container image is built. This causes an update to the spec of Deployments and ConfiguredDaemonSets, which will restart the Pods according to the specified limitations.

Docker images are versioned using Semantic Versioning. The Operator uses a yaook/assets/pinned_version.yml file in its root directory to determine the version of an image to use. To generate this file (or if it is not present), the Operator queries the Docker image repository to determine all available versions, sorts them and filters for specific requirements. Afterwards, the latest image is selected for use.
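The "sort, filter, pick the latest" step can be illustrated with a small sketch. It is not the Operator's actual code: the function names and the accept callback are hypothetical, and the sketch assumes plain MAJOR.MINOR.PATCH tags, ignoring anything else (including pre-release tags).

```python
import re
from typing import Callable, Iterable, Optional, Tuple

Version = Tuple[int, ...]

# Only plain MAJOR.MINOR.PATCH tags are considered in this sketch.
_SEMVER = re.compile(r"^(\d+)\.(\d+)\.(\d+)$")


def parse_version(tag: str) -> Optional[Version]:
    """Return the tag as a tuple of integers, or None if it is not a version."""
    match = _SEMVER.match(tag)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def select_latest(
    tags: Iterable[str],
    accept: Callable[[Version], bool] = lambda version: True,
) -> Optional[str]:
    """Pick the highest version among the tags that pass the accept filter."""
    candidates = []
    for tag in tags:
        version = parse_version(tag)
        if version is not None and accept(version):
            candidates.append((version, tag))
    if not candidates:
        return None
    # Tuples compare numerically per component, so 1.10.0 ranks above 1.9.3.
    return max(candidates)[1]
```

Comparing parsed integer tuples rather than raw strings matters here: a plain string sort would rank "1.9.3" above "1.10.0".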

Some resources need to be recreated instead of updated in place. The order in which they are recreated is determined by the _sort_instances function in instancing.py. In essence, instances are ordered by the following criteria:

  1. An instance with ResourceRunState.UNKNOWN is put first

  2. The number of reasons the resource has to be recreated (more is better)

  3. The creation timestamp (older is better)

The criteria above are evaluated in order; the first one that differs between two instances decides their order.
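The ordering criteria above map naturally onto a tuple sort key. The following is a minimal sketch of that idea, not a copy of _sort_instances: the Instance dataclass, its field names and the RunState enum are hypothetical stand-ins for the real types.

```python
from dataclasses import dataclass
from datetime import datetime
from enum import Enum
from typing import Iterable, List


class RunState(Enum):
    """Hypothetical stand-in for ResourceRunState."""
    UNKNOWN = "unknown"
    READY = "ready"


@dataclass
class Instance:
    """Hypothetical view of an instance that may need to be recreated."""
    name: str
    state: RunState
    recreate_reasons: int
    created: datetime


def recreation_order(instances: Iterable[Instance]) -> List[Instance]:
    """Sort instances so that the first element is recreated first."""
    return sorted(
        instances,
        key=lambda inst: (
            inst.state is not RunState.UNKNOWN,  # False sorts first: UNKNOWN wins
            -inst.recreate_reasons,              # more reasons sort earlier
            inst.created,                        # older instances sort earlier
        ),
    )
```

Because Python compares tuples element by element, a later criterion is only consulted when all earlier ones are equal, which matches the tiebreaker rule described above.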

Kubernetes upgrades

TBD

OpenStack upgrades

To support multiple versions of OpenStack, all OpenStack services have a .spec.targetRelease field. In this field you can specify the release of the OpenStack service you would like to use (e.g. yoga).

Before upgrading to a new OpenStack release, ensure that your service has rolled out successfully (so that it is in the Updated state). Afterwards you can change the .spec.targetRelease of your service. If the service is capable of non-disruptive upgrades, this method will be used. Otherwise the upgrade process might cause disruptions.

Note

You can only upgrade a single version at a time. Jumping over releases is not supported.
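The one-release-at-a-time rule can be checked before changing .spec.targetRelease. The sketch below is illustrative only: the function name is hypothetical, and the list is simply the upstream OpenStack release order for a range of releases. Which releases a given Yaook service actually supports is not stated here and is not implied by the list.

```python
# Upstream OpenStack release names in chronological order (illustrative range).
RELEASES = [
    "queens", "rocky", "stein", "train", "ussuri",
    "victoria", "wallaby", "xena", "yoga", "zed",
]


def is_single_step_upgrade(installed: str, target: str) -> bool:
    """Return True only if target is the release directly after installed."""
    try:
        return RELEASES.index(target) - RELEASES.index(installed) == 1
    except ValueError:
        # Unknown release names are never treated as a valid upgrade.
        return False
```

For example, going from xena to yoga passes the check, while going from wallaby to yoga skips xena and is rejected.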

The upgrade is finished once the service is again in the Updated state.

If at any point you want to see which OpenStack version is currently rolled out for a service, you can check .status.installedRelease. This field always points to the last release that has completed a full rollout. During normal operation it is therefore the same as .spec.targetRelease. During upgrades it shows the version you are upgrading from.
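The relationship between the two fields can be used to tell whether an upgrade is still running. The following is a minimal sketch that assumes the resource has already been fetched as a plain dictionary (for example by a Kubernetes client); the function name is hypothetical, and treating a missing .status.installedRelease as "no upgrade in progress" is an assumption of this sketch.

```python
from typing import Any, Dict


def upgrade_in_progress(resource: Dict[str, Any]) -> bool:
    """Compare .status.installedRelease with .spec.targetRelease.

    Returns True while the installed release still differs from the target.
    """
    target = resource.get("spec", {}).get("targetRelease")
    installed = resource.get("status", {}).get("installedRelease")
    # Assumption: without an installed release there is nothing to upgrade from.
    if installed is None:
        return False
    return installed != target


# Example resource as it might look during an upgrade from xena to yoga.
example = {
    "spec": {"targetRelease": "yoga"},
    "status": {"installedRelease": "xena"},
}
```

With the example above, upgrade_in_progress(example) is true until the rollout completes and .status.installedRelease changes to yoga.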

For documentation on how the upgrade procedure is implemented, see Openstack Upgrades.