Build the operator¶
Before you start, you need a functional Development Setup.
Whenever you create new files, always add the following copyright header at the beginning:
#
# Copyright (c) 2020-<current-year> The Yaook Authors.
#
# This file is part of Yaook.
# See https://yaook.cloud for further info.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
#
The order of the following steps is not meant to be taken strictly. You will probably jump back and forth between some of them quite often.
Reference Docker images
Add required Docker images to ./yaook/assets/pinned_version.yml. Usually there should be one entry for each minor version.
Create the CR class
Create a class inheriting sm.ReleaseAwareCustomResource and add the class for the CR:

./yaook/op/<newcomponent>/<newcomponent>.py:

    import yaook.statemachine as sm


    class NewComponent(sm.ReleaseAwareCustomResource):
        API_GROUP = "yaook.cloud"
        API_GROUP_VERSION = "v1"
        PLURAL = "newcomponentdeployments"  # changeme
        KIND = "NewComponentDeployment"  # changeme
        RELEASES = ["2025.1"]  # changeme
        # Usually supported versions except for the lowest one.
        # If you only support a single version keep it empty.
        VALID_UPGRADE_TARGETS = []

        def __init__(self, **kwargs):
            super().__init__(assemble_sm=True, **kwargs)


    sm.register(NewComponent)
Create the __init__.py file:

./yaook/op/<newcomponent>/__init__.py:

    from .newcomponent import NewComponent  # noqa:F401
Add subresources to the CR class
Add instances of the classes from the statemachine module as class members to NewComponent, as previously evaluated.
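A minimal, hedged sketch of what such a class member might look like. The class name sm.TemplatedDeployment and its constructor arguments are illustrative assumptions rather than the verified yaook.statemachine API; only the metadata=lambda ctx: ... pattern appears later in this guide. Check the existing operators under ./yaook/op/ for the real resource classes:

    # Hedged sketch -- the resource class name and its arguments are
    # illustrative assumptions, not the verified yaook.statemachine API.
    import yaook.statemachine as sm


    class NewComponent(sm.ReleaseAwareCustomResource):
        # ... class constants and __init__ as shown above ...

        # Subresources are declared as class members so the assembled state
        # machine can pick them up and reconcile them.
        api_deployment = sm.TemplatedDeployment(
            metadata=lambda ctx: f"{ctx.parent_name}-api",
            template="newcomponent-deployment-api.yaml",
        )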
Create jinja templates
Create the corresponding jinja templates for the Kubernetes manifests. These will be placed inside ./yaook/op/<newcomponent>/templates or ./yaook/op/infra/templates for infra resources.

If the user is not set inside the Docker image, also add the securityContext.runAsUser, securityContext.runAsGroup and securityContext.fsGroup directives, together with the user ID you determined inside Containers, to each statefulset, deployment and job.
Add cue files for configuration
For OpenStack components, first add the packages listed inside the upstream config generator configuration (e.g. barbican.conf) to ./buildcue.py. This is used to generate the cue template for the configuration.

To learn how to add a new cue configuration template and, if necessary, cue layers, read Working with CUE.
After you create a cue template, add <newcomponent> to the variable cue_schema_dsts inside ./GNUmakefile. Afterwards, and each time you change the cue template, build the template by running:

make cue-templates

To inject values into the cue templates during rendering, make sure to specify target="<newcomponent>" for each sm.CueLayer inside CueSecret.add_cue_layers/CueConfig.add_cue_layers.

Now you can also add the configuration to add_dependencies=[newcomponent_config] of templated deployments, statefulsets or jobs and reference them inside the jinja template:

    {{ dependencies['newcomponent_config'].resource_name() }}
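As a rough sketch of how these pieces might fit together (the constructor arguments of sm.CueSecret, sm.CueLayer and the templated deployment class are assumptions for illustration; only add_cue_layers, target="<newcomponent>", add_dependencies and resource_name() are taken from this guide):

    # Hedged sketch -- call style and argument names are illustrative only.
    import yaook.statemachine as sm

    newcomponent_config = sm.CueSecret(
        metadata=lambda ctx: f"{ctx.parent_name}-config",
    )
    # Every cue layer feeding the rendered configuration must target the new
    # component; the exact add_cue_layers call style is not shown in this guide.
    newcomponent_config.add_cue_layers(
        sm.CueLayer(target="newcomponent"),
    )

    api_deployment = sm.TemplatedDeployment(  # hypothetical class name
        metadata=lambda ctx: f"{ctx.parent_name}-api",
        template="newcomponent-deployment-api.yaml",
        # Makes the rendered configuration available inside the jinja template
        # as dependencies['newcomponent_config'].
        add_dependencies=[newcomponent_config],
    )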
Create the Kubernetes CRD and verify functionality
Before you can start testing your operator, you also need to create a K8s CRD. For OpenStack components, you can copy this minimal CRD template and adjust it according to your requirements:
cp ./docs/developer/guides/create_operator/newcomponent-crd.cue ./yaook/helm_builder/Charts/crds/cue-templates/<newcomponent>-crd.cue
Then run make k8s_helm_install_crds to install the CRD inside your cluster. Similar to the configuration templates, this needs to be run after each change.
Create an example manifest
Create the manifest ./docs/examples/<newcomponent>.yaml for a NewComponent instance and apply it to your K8s cluster:

newcomponent.yaml:
    apiVersion: yaook.cloud/v1
    kind: NewComponentDeployment
    metadata:
      name: my-component
    spec:
      api:
        ingress:
          fqdn: "mycomponent.yaook.cloud"
          port: 32443
      database:
        replicas: 1
        timeoutClient: 300
        storageSize: 8Gi
        proxy:
          replicas: 1
        backup:
          schedule: "0 * * * *"
      issuerRef:
        name: ca-issuer
      messageQueue:
        replicas: 1
      keystoneRef:
        name: keystone
      region:
        name: MyRegion
      targetRelease: <LATEST_RELEASE>
      newcomponentConfig:
        DEFAULT:
          debug: True
Now run the operator:
python3 -m yaook.op -vv newcomponent run
If everything is set up correctly, the newly created operator should start to reconcile the K8s CR, and you can start testing and debugging the main functionality.
The next steps are necessary to adjust the operator for production environments.
Add scheduling keys
Define a scheduling key for each statefulset, deployment and job, as well as the <NEWCOMPONENT>_ANY_SERVICE scheduling key, inside ./yaook/op/scheduling_keys.py and add the .. autoattribute:: directives for sphinx.

Scheduling keys for templated Kubernetes resources need to be defined as follows for jobs:
./yaook/op/<newcomponent>/<newcomponent>.py:

    [
        scheduling_keys.SchedulingKey.OPERATOR_<NEWCOMPONENT>.value,
        scheduling_keys.SchedulingKey.OPERATOR_ANY.value,
    ]
and as follows for deployments and statefulsets:
./yaook/op/<newcomponent>/<newcomponent>.py:

    [
        scheduling_keys.SchedulingKey.<NEWCOMPONENT_SERVICE>.value,
        scheduling_keys.SchedulingKey.<NEWCOMPONENT>_ANY_SERVICE.value,
    ]
Additionally, API deployments require the yaook.op.scheduling_keys.SchedulingKey.ANY_API.value scheduling key.

These scheduling keys will be assigned the following strings:

./yaook/op/scheduling_keys.py:

    OPERATOR_<NEWCOMPONENT> = "operator.yaook.cloud/<newcomponent>"
    <NEWCOMPONENT_SERVICE> = \
        "<newcomponent>.yaook.cloud/<newcomponent-service>"
    <NEWCOMPONENT>_ANY_SERVICE = \
        "<newcomponent>.yaook.cloud/<newcomponent>-any-service"
Then add the corresponding scheduling keys to each templated resource using the scheduling_keys property.

For a deployment or statefulset manifest, the scheduling keys will be injected with the following nodeSelectorTerms:

Example:

    spec:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
            - matchExpressions:
              - key: <newcomponent>.yaook.cloud/<newcomponent-service>
                operator: Exists
              - key: namespace.yaook.cloud
                operator: In
                values:
                - <yaook_namespace>
            - matchExpressions:
              - key: <newcomponent>.yaook.cloud/<newcomponent>-any-service
                operator: Exists
Don’t forget to label your K8s nodes before you test this.
SSL encrypt internal traffic
If possible, SSL for internal traffic should be achieved by configuring the service accordingly. If the software does not support encrypted communication natively, you will have to add additional containers to the k8s resource which handle the encryption. For OpenStack components, you usually only need to configure the multi-container approach inside the API deployment, since communication between services is usually achieved via AMQP, which Yaook configures to use SSL by default.
For sidecar encryption, you will need the following containers, enabled based on spec.api.internal:

- ssl-terminator
- ssl-terminator-external
- ssl-terminator-internal
You also need to add a corresponding service-reload container for each ssl-terminator container. As a reference, you can use a template like ./yaook/op/barbican/templates/barbican-deployment-api.yaml.

The value of the LOCAL_PORT environment variable for the ssl-terminator should use the default port as listed under OpenStack firewall default ports. The LOCAL_PORT of ssl-terminator-internal should then increase this port number by 1 and the ssl-terminator-external by 2 (for example, with a default port of 9311 this results in 9312 and 9313).
Allow resource configuration
Ensure you can configure the Kubernetes resources (requests and limits) for each job, deployment and statefulset and for every container that is part of them. Do this by adding crd.#containerresources for each deployment/sts and jobResources to ./yaook/helm_builder/Charts/crds/cue-templates/<newcomponent>-crd.cue.
This is a snippet of the Cinder CRD cue file to showcase how this is structured for a subset of Cinder services:

./yaook/helm_builder/Charts/crds/cue-templates/cinder-crd.cue:

    api: {
        description: "Cinder API deployment configuration"
        properties: resources: {
            type: "object"
            description: "Resource requests/limits for containers related to the Cinder API."
            properties: {
                "cinder-api": crd.#containerresources
                "ssl-terminator": crd.#containerresources
                "ssl-terminator-external": crd.#containerresources
                "ssl-terminator-internal": crd.#containerresources
                "service-reload": crd.#containerresources
                "service-reload-external": crd.#containerresources
                "service-reload-internal": crd.#containerresources
            }
        }
    }
    scheduler: {
        description: "Cinder Scheduler deployment configuration"
        properties: resources: {
            type: "object"
            description: "Resource requests/limits for containers related to the Cinder Scheduler."
            properties: "cinder-scheduler": crd.#containerresources
        }
    }
    jobResources: {
        type: "object"
        description: "Resource limits for Job Pod containers spawned by the Operator"
        properties: {
            "cinder-db-sync-job": crd.#containerresources
            "cinder-db-upgrade-pre-job": crd.#containerresources
            "cinder-db-upgrade-post-job": crd.#containerresources
            "cinder-db-cleanup-cronjob": crd.#containerresources
        }
    }

To inject these resources inside the Jinja templates, you must use the resources Jinja filter for each container for .spec.containers[@].resources. Inside Cinder templates this is achieved like this:

./yaook/op/cinder/templates/cinder-deployment-api.yaml:

    resources: {{ crd_spec | resources('api.cinder-api') }}
    ...
    resources: {{ crd_spec | resources('api.ssl-terminator') }}
    ...
    resources: {{ crd_spec | resources('api.ssl-terminator-external') }}
    # and so on
./yaook/op/cinder/templates/cinder-statefulset-scheduler.yaml:

    resources: {{ crd_spec | resources('scheduler.cinder-scheduler') }}
As you can see, the parameter to the resources filter always consists of the key of the service from the CR manifest .spec and the key of crd.#containerresources, separated by a dot. For jobs, there is a slight difference, as the first part uses the substring job, not jobResources:

./yaook/op/cinder/templates/cinder-job-db-sync.yaml:

    resources: {{ crd_spec | resources('job.cinder-db-sync-job') }}
Implement high availability
The replicas of K8s deployments and statefulsets need to be configurable via the CR manifest. The setup needs to distribute load and stay functional during rolling restarts. Note that there are exceptions where only a single replica is supported.
The configuration can be supported by adding the following to properties inside the <newcomponent>-crd.cue for each service:

    <service-name>: crd.replicated

Potential additional steps depend on the component you want to deploy.
Policy validation (OpenStack only)
Validate optional policy configuration from the K8s manifest by adding sm.PolicyValidator including its dependencies to NewComponent. You can use ./yaook/op/cinder/__init__.py as a reference.
Add a QuorumPodDisruptionBudget for each deployment and statefulset
Example:

    api_deployment_pdb = sm.QuorumPodDisruptionBudget(
        metadata=lambda ctx: f"{ctx.parent_name}-api_deployment_pdb",
        replicated=api_deployment,
    )
Set up monitoring
Use sm.GeneratedServiceMonitor or a more suitable class from ./yaook/statemachine/resources/prometheus.py to set up monitoring.

For OpenStack API monitoring, you need to create three service monitors (see the sketch after this list):

- internal_ssl_service_monitor
- external_ssl_service_monitor
- internal_ingress_ssl_service_monitor (using sm.Optional(condition=_internal_endpoint_configured, ...))
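A hedged sketch of how the optional monitor could be declared; only sm.GeneratedServiceMonitor, sm.Optional and its condition parameter come from this guide, everything else is a placeholder:

    # Hedged sketch -- constructor arguments of sm.GeneratedServiceMonitor and
    # the way sm.Optional receives the wrapped resource are placeholders.
    import yaook.statemachine as sm


    def _internal_endpoint_configured(ctx):
        # Illustrative stub: return True when the CR configures an internal
        # endpoint.
        ...


    internal_ssl_service_monitor = sm.GeneratedServiceMonitor(...)   # placeholder args
    external_ssl_service_monitor = sm.GeneratedServiceMonitor(...)   # placeholder args
    internal_ingress_ssl_service_monitor = sm.Optional(
        condition=_internal_endpoint_configured,
        # wrapped resource: another sm.GeneratedServiceMonitor for the
        # internal ingress endpoint
    )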
Adjust the default config
This depends on your specific requirements. Make sure that the setup can withstand the expected traffic and amount of created datasets and adjust configuration values like worker counts, limits, quotas, etc. accordingly.
Add IPv6 support
If possible, use dual-stack sockets inside configurations and adjust additional configuration if needed by the component.
Additional file changes¶
- Reference templates and static files inside ./MANIFEST.in
- Adjust the file ./docs/examples/<newcomponent>.yaml
Additional file changes for infra operator CRs¶
- Add constants you want to reference in other operators to ./yaook/op/common.py
- add your CRD to AVAILABLE_WATCHERS inside ./yaook/op/daemon.py
- add the following to ./yaook/statemachine/resources/yaook_infra.py:
  - add another class NewComponent inheriting YaookReadyResource. Add all k8s manifest keys of the CRD that require other Kubernetes resources to be updated to the _needs_update method (see the sketch after this list).
  - create a TemplatedNewComponent class inside ./yaook/statemachine/resources/yaook_infra.py
  - reference TemplatedNewComponent inside .. autoclass:: and .. autosummary::
- reference the TemplatedNewComponent class inside ./yaook/statemachine/resources/__init__.py
- add a method NewComponent._interface to ./yaook/statemachine/interfaces.py
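A hedged sketch of the yaook_infra.py additions; the YaookReadyResource interface, the _needs_update signature and the shape of TemplatedNewComponent are assumptions, so mirror an existing class in that file rather than copying this:

    # Hedged sketch -- signatures and base classes are assumptions; copy the
    # pattern from an existing resource in yaook_infra.py instead.
    class NewComponent(YaookReadyResource):
        def _needs_update(self, *args, **kwargs):  # real signature not shown here
            # Compare the CRD keys whose change requires dependent Kubernetes
            # resources to be updated.
            ...


    class TemplatedNewComponent(NewComponent):
        # Templated variant referenced from resources/__init__.py and the
        # sphinx .. autoclass:: / .. autosummary:: directives.
        ...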
Set up tests¶
Unit tests¶
Unit tests need to be put inside ./tests/op/<newcomponent>/test_api.py for OpenStack CRs and inside ./tests/op/infra/test_<newcomponent>.py for infra resources.
For OpenStack CRs, make use of the test cases defined inside ./tests/op/common_tests.py. If you created new cue layers during development, also write tests for those.
Integration tests¶
- add an example manifest for the CR to ./ci/devel_integration_tests/deploy/<newcomponent>.yaml
- add the component to the CLASS_INFO dictionary inside ./ci/devel_integration_tests/os_services.py (OpenStack only)
- add the CR to the wait_and_test function inside ci/devel_integration_tests/run-tests.sh and configure test cases that confirm functionality
Add support for tempest tests (OpenStack only)¶
Add the component to the method _get_tempest_suffix_for_service inside ./yaook/op/tempest/__init__.py to return the correct suffix.
You either need to provide the module name from the tempest repository or provide the name and module path of a separate plugin, e.g. for Barbican the barbican-tempest-plugin.
You can confirm that tempest tests are running by creating a TempestJob (make sure the tempest-operator is running). You can use ./docs/examples/tempest-job.yaml to run the tests by adjusting .spec.target.service and .spec.tempestConfig.service_available
(depending on your service, you will also have to adjust other configuration values). Inspect the logs after the job has terminated to ensure everything is working as intended.
As long as you are working inside a development cluster, it is not necessary that all of these tests pass each time, but they can serve as an indicator of where things might require additional tweaking. Also beware that some tempest test cases may fail simply because they contain bugs.
Create the helm chart¶
Stop the local operator process and create and install the helm chart as described here. Recreate the CR to make sure the resource still reconciles successfully.
Update scripts and documentation¶
- add the node labels to ./docs/handbook/user-guide.rst and ./docs/developer/guides/dev_setup.rst
- add node labels inside ./ci/devel_integration_tests/label-nodes.sh to the variable all_node_labels
- add the user and group name with their Docker image ID to ./docs/developer/explanations/containers.rst
- create a user guide inside ./docs/handbook if deploying the CRD involves manual steps (optional)
If you add new documentation pages, reference them inside the correct index.rst file.