Kubernetes Custom Resource, Controller and Operator Development Tools
(updated May 20, 2019) Kubebuilder, the Operator SDK and Metacontroller make it easier for third-party developers to build upon the Kubernetes platform, using custom controllers, sometimes called operators, and Custom Resource Definitions (CRDs). They may one day converge toward an official platform SDK, but until then, developers have to choose (or start from scratch). To guide my decision and hopefully yours, I have studied all three tools and tried them on a simple yet useful case: an Ambassador shim that creates dummy Services with annotations (the source of Ambassador's configuration) from custom Mapping objects.
UPDATE (2019-05-20) - Ambassador 0.70 adds support for CRDs, including a Mapping CRD. This blog post is therefore purely educational. If you're here because you want to configure Ambassador using custom resources, just upgrade to 0.70.
UPDATE (2018-10-18) - I've updated this blog post for Kubebuilder v1.0.5 and Operator SDK v0.0.7 and published the full (working and reproducible) experiments on GitHub: ambassador-shim-kubebuilder, ambassador-shim-operator-sdk, and ambassador-shim-metacontroller. I've also added a section about validation. Today, Admiralty is also open-sourcing multicluster-controller. Check out our new blog post.
UPDATE (2018-07-19) - Kubebuilder v1.0.0 was released today. Please refer to the differences with v0 and the migration guide. Overall, it relies on controller-runtime and controller-tools, which have been factored out. The API resembles more that of the Operator SDK, and there is no need to regenerate code after the initial scaffolding, thanks to a new dynamic client.
TL;DR
In short, if Go isn't an option, I strongly recommend Metacontroller, because, without it, you'd have to write your own cache and work queues... or wait until your language's Kubernetes client library catches up with client-go. In Go, I'd still consider Metacontroller for simple and supported use cases. If more flexibility is needed, I'd switch to Kubebuilder, unless I was using the whole Operator Framework (SDK, Lifecycle Manager, and Metering). Kubebuilder is arguably more idiomatic and performant at the moment*, and it is backed by the SIG API Machinery. In any case, remember that Kubernetes controller and operator development tools are still a moving target: APIs may break (Kubebuilder is a "prototype"[EDIT (2018-10-18): not anymore] and the Operator SDK is in "pre-alpha"), and it may take time before the community settles on a standard solution.
* EDIT (2018-10-18): Kubebuilder's API converged toward the Operator SDK's, and the Operator SDK is currently being refactored to use controller-runtime.
| | Kubebuilder | Operator SDK | Metacontroller |
|---|---|---|---|
| Backed by | SIG API Machinery | CoreOS (Red Hat) | Google Cloud Platform |
| Architecture | Injection (encapsulation since v1) | Encapsulation | Framework |
| Pros | Tests and docs scaffolding; Multiple resources and controllers in one project; Great documentation | Simple API; Part of the Operator Framework | Any language; Higher-level abstractions; JSON (dynamic for fast development); Declarative; Great documentation |
| Cons | Go only; Could use more abstractions | Go only; Single resource and controller in one project; Reference example doesn't follow best practices | JSON (no static typing by default); Use case has to be compatible |
* EDIT (2018-10-18): I wasn't a fan of the `sdk.Get` function signature (`func(runtime.Object) error`) until I realized that's just how functions are made "generic" in Go. The function accepts an interface as an argument, which is actually the output of the operation (not a return value). The caller passes a pointer to a struct that satisfies the interface, and the function mutates it. I'm still not a big fan, because `sdk.Get` uses the argument as an input too: the name and namespace of the object to get are taken from the struct.
Introduction
The Kubernetes API provides foundational resources for container orchestration: Deployments, Services, Namespaces, RBAC, etc. Some use cases, however, require additional or higher-level abstractions: you might want to provision and manage external services as if they were Kubernetes objects (e.g. a cloud database), or you might want to consider several Kubernetes objects as one (e.g. MicroService = Deployment + Service). Ideally, you'd like your API to leverage Kubernetes features like `kubectl` support, CRUD/watch, labels, etcd storage, HTTPS, authentication, authorization, RBAC, auditing... so you can focus on your business logic. However, you'd rather not fork Kubernetes, or wait for your proposal to make it into a future release (your use case may not even be general enough).
Custom Resource Definition vs. Aggregated API Server
Luckily, the Kubernetes API can be extended at runtime, as long as you follow its conventions. To do so, the current recommendation is to register CustomResourceDefinitions (CRDs) and deploy a controller, sometimes called an operator (see below). When you use CRDs, all of the features mentioned above come out-of-the-box. The other approach is to deploy a custom API server and register it as an aggregated API via an APIService object. It comes with more responsibilities, including the storage of your custom resources. Unless you need a different storage layer than the etcd cluster managed by Kubernetes, or a feature that is not yet supported by CRDs, a custom API server is likely not worth the effort.
Here is a sample CRD (without any advanced features, like validation):
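Something like this, for the group and kind used throughout this post:

```yaml
apiVersion: apiextensions.k8s.io/v1beta1
kind: CustomResourceDefinition
metadata:
  # must be <plural>.<group>
  name: mappings.ambassadorshim.admiralty.io
spec:
  group: ambassadorshim.admiralty.io
  version: v1alpha1
  scope: Namespaced
  names:
    kind: Mapping
    singular: mapping
    plural: mappings
```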
The corresponding custom controller typically runs in-cluster, managed by a Deployment.
On the other hand, here is a sample APIService:
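Something like this (the `apiserver` Service in the `ambassadorshim` namespace is described just below; TLS details omitted):

```yaml
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
  # must be <version>.<group>
  name: v1alpha1.ambassadorshim.admiralty.io
spec:
  group: ambassadorshim.admiralty.io
  version: v1alpha1
  service:
    name: apiserver
    namespace: ambassadorshim
  groupPriorityMinimum: 1000
  versionPriority: 15
  # in practice, also set caBundle (or insecureSkipTLSVerify in a test cluster)
```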
The corresponding custom API server typically consists of a Deployment exposed by a Service; here, named `apiserver`, in the `ambassadorshim` namespace. Any HTTP request for `/apis/ambassadorshim.admiralty.io/v1alpha1/...` received by kube-apiserver (the main Kubernetes API server) is forwarded to `apiserver.ambassadorshim`, which is the only one to know about the custom resources (e.g., Mapping) under the `ambassadorshim.admiralty.io/v1alpha1` API-version.
Under the Hood
apiextensions-apiserver and kube-aggregator handle the `apiextensions.k8s.io/v1beta1` and `apiregistration.k8s.io/v1beta1` API-versions, respectively; both are implementations of apiserver and are included as delegates in kube-apiserver. To check which API-versions your cluster can handle, you can run:
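```sh
kubectl api-versions
```

Both `apiextensions.k8s.io/v1beta1` and `apiregistration.k8s.io/v1beta1` should appear in the output, along with every other group/version the cluster serves.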
When you register CRDs or aggregated APIs, a new API-version is added to the list.
About Validation (2018-10-18)
Aggregated API servers can perform arbitrary validation checks, because they're in charge of storage. However, when CRDs were first introduced, there was no mechanism for synchronous validation. An invalid object would be saved as it came, and its controller would have to update its status to flag it as invalid. But things have improved, and you can now use the OpenAPI v3 schema to validate objects, and use validating webhooks for corner cases.
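For example, a basic schema for our Mapping CRD could require the two fields used in this post (assuming the lowercase `prefix` and `service` JSON names chosen later):

```yaml
# added under the Mapping CRD's spec (apiextensions.k8s.io/v1beta1)
validation:
  openAPIV3Schema:
    properties:
      spec:
        required:
        - prefix
        - service
        properties:
          prefix:
            type: string
            pattern: '^/'
          service:
            type: string
```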
The Controller Pattern
Whether you use CRDs or an aggregated API, the best way to implement the behavior of your custom resources is using the controller pattern, the same way Kubernetes controls its own resources. It is often described by three adjectives: declarative, asynchronous and level-based; most of its mechanics are implemented in Go in client-go/tools/cache and client-go/util/workqueue and documented in several blogs.
- The user declares the desired state of an object, e.g. `kubectl apply -f mapping.yaml`, where `mapping.yaml` contains the specification (Spec) of a Mapping object. The Kubernetes API responds as soon as the desired state has been stored in etcd, but before the user's intent has been fulfilled--which is to create a dummy Service object, owned by the Mapping object, annotated according to its Spec.
- Asynchronously, the controller watches for CRUD events on the Mapping and Service resources. For each event, the state of the corresponding object is cached, and its key goes to a work queue; the key is the namespace and name of the Mapping object, obtained directly, or via owner references for Service events. This logic is the work of listers and informers.
- The work queue is backed by a queue and a set, so that if a key is added multiple times before it is processed, it is only processed once (sketched just after this list). The processing function gets the latest state of the Mapping and Service objects from cache, updates the Mapping's Status (e.g. Service not yet created, or outdated annotation) and takes action to reconcile it according to the Spec (e.g. create the dummy Service, or update its annotation; other use cases could place calls to external services here). Thus, the control loop is level-based, because it reconciles the desired and observed states based on their latest observations, not on their historical changes.
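Here is a minimal, standalone sketch of that work-queue behavior, using client-go's `workqueue` package directly (not tied to the Ambassador shim):

```go
package main

import (
	"fmt"

	"k8s.io/client-go/util/workqueue"
)

func main() {
	q := workqueue.NewRateLimitingQueue(workqueue.DefaultControllerRateLimiter())

	// Two events for the same Mapping enqueue the same key...
	q.Add("bar/foo")
	q.Add("bar/foo")

	// ...but the queue is backed by a set, so the key is only queued once.
	key, _ := q.Get()
	fmt.Println(key, q.Len()) // bar/foo 0

	// Done tells the queue we're finished processing this key; if it had been
	// re-added in the meantime, it would be requeued now.
	q.Done(key)
}
```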
Development Tools
First Generation
You can build a controller from the building blocks in client-go. Months ago, you would have started from the workqueue example or from sample-controller, which provides a project structure where you can:
- copy/edit structs to define your custom resources;
- run a helper script to generate typed clients, informers, listers and deep-copy functions, thanks to the code-generator binary;
- adapt the syncHandler to implement your business logic.
An aggregated API server has more responsibilities, including storage, which are out of the scope of this blog post. The apiserver library helps you implement those. A project structure is provided in sample-apiserver, but instead you would most likely use apiserver-builder, which is what I would call an early second-generation tool, because it's a CLI that:
- initializes projects (no need to fork or copy samples);
- generates code, including test scaffolding (no need to run a separate code-generation binary, either directly or via Bash scripts);
- runs the API server and associated controller manager either locally or in-cluster;
- and even generates documentation.
Example: the federation-v2 project uses apiserver-builder[EDIT (2018-10-18): it used to, but switched to Kubebuilder, see below].
Second Generation
Recently, several projects have positioned themselves as SDKs or frameworks for CRDs and controllers, making the creation process a lot easier. The main ones are Kubebuilder, the Operator SDK, and Metacontroller. It is now possible to become familiar with one of those tools, its underlying concepts, and put together a working CRD and controller in under half a day.
- Kubebuilder was released in March. It is mainly the work of Phillip Wittrock (Google, @pwittrock), but the project is hosted by the SIG API Machinery. It is actually a fork of apiserver-builder (see above, by the same author), focusing this time on CRD development (though it does also come with an undocumented API server install strategy[EDIT (2018-10-18): dropped in v1]). It is well documented in a Gitbook, which also covers basic concepts (resources and controllers).
- The Operator SDK was released in May by CoreOS (Red Hat) as part of the Operator Framework (alongside Lifecycle Manager and Metering, which are out of the scope of this blog post, and deserve their own). It was warmly received by the community. Kubebuilder and the Operator SDK are very similar, but the Operator Framework focuses on application management. Examples include the etcd, Prometheus, Rook, and Vault operators.
- Metacontroller was announced at the end of 2017 by Google Cloud Platform. Its main contributor is Anthony Yeh (@enisoc). It is very different from the other two: the controller pattern is delegated to a framework, the Metacontroller, which runs out-of-process, typically in-cluster. The Metacontroller calls user-provided webhooks (typically consisting of Deployments exposed by Services). The webhooks implement the pattern's processing function, written in any language, accepting and returning JSON. Example: the Vitess operator was implemented in Jsonnet using Metacontroller.
Experimentation
I have tested the three tools on a simple yet useful case. Ambassador, the "Kubernetes-native API gateway for microservices built on Envoy", currently pulls its configuration (mainly Mappings of URL prefixes to Kubernetes Services) from annotations on Services; I wanted Ambassador to be even more Kubernetes native, so I've created a Mapping CRD and a controller that maintains a dummy Service for each Mapping, annotated according to the Mapping's Spec. [EDIT (2019-05-20): Ambassador 0.70 introduces an official Mapping CRD, so this experiment isn't "useful" anymore, but still educational.]
Here is a sample Mapping object configuring Ambassador to proxy requests to `/foo/` to the `foo` service in the `bar` namespace:
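Something like this, with the `prefix` and `service` field names I chose for the CRD (Ambassador's `service` value uses the `name.namespace` form):

```yaml
apiVersion: ambassadorshim.admiralty.io/v1alpha1
kind: Mapping
metadata:
  name: foo
spec:
  prefix: /foo/
  service: foo.bar
```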
And here is the corresponding annotated dummy Service required to configure Ambassador (without our shim, Ambassador users would usually place the annotation on the `foo` service directly, but Ambassador annotations can be placed on any Service):
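Roughly the following, assuming the `getambassador.io/config` annotation syntax Ambassador used at the time:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: foo-ambassadorshim
  annotations:
    getambassador.io/config: |
      ---
      apiVersion: ambassador/v0
      kind: Mapping
      name: foo
      prefix: /foo/
      service: foo.bar
spec:
  # dummy spec: Ambassador only reads the annotation
  ports:
  - port: 80
```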
Our shim creates the `foo-ambassadorshim` Service and keeps it in sync with the `foo` Mapping.
Kubebuilder
EDIT 2018-10-18: This section has been updated for Kubebuilder v1.0.5. The full source code of this experiment is available on GitHub as ambassador-shim-kubebuilder.
Following Kubebuilder's Quick Start Guide, I installed the `kubebuilder` binary. Then, in my project folder, I ran:
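Something along these lines for v1.0.5 (the domain and group are chosen to yield the `ambassadorshim.admiralty.io/v1alpha1` Mapping API):

```sh
kubebuilder init --domain admiralty.io
kubebuilder create api --group ambassadorshim --version v1alpha1 --kind Mapping
```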
The first command generated a basic project structure and the second generated code for an empty resource and its controller. With Kubebuilder, you can technically create any number of resources and controllers, and not all resources have to have controllers (e.g. if a resource owns another). The next step was to edit the resource's API, by modifying the `MappingSpec` and `MappingStatus` structs in `pkg/apis/ambassadorshim/v1alpha1/mapping_types.go`:
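A sketch, limited to the two fields I supported; the `Configured` status field is just one simple way to track reconciliation:

```go
package v1alpha1

// MappingSpec defines the desired state of Mapping.
type MappingSpec struct {
	// Prefix is the URL prefix Ambassador should map.
	Prefix string `json:"prefix"`
	// Service is the target, in Ambassador's name.namespace form.
	Service string `json:"service"`
}

// MappingStatus defines the observed state of Mapping.
type MappingStatus struct {
	// Configured reports whether the dummy Service exists and its
	// annotation matches the Spec.
	Configured bool `json:"configured,omitempty"`
}
```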
Note the use of the "json" key in the field tags to customize (un)marshalling. A full Ambassador shim would include more fields in its `MappingSpec`. Also, a common pattern that I could have used in `MappingStatus` is a `Conditions` array, as in many Kubernetes resource APIs.
Then, I implemented the resource's behavior in `pkg/controller/mapping/controller.go`:
- The controller needs to watch the Services owned by Mappings, to update a Mapping's status when its corresponding dummy Service is created (asynchronously) and to recreate/update the Service if it is deleted/modified out of the control loop. To do so, I modified the second `Watch()` call in the `add()` function, to watch `corev1.Service{}` rather than the generated example's `appsv1.Deployment{}` (cf. "Watching Created Resources" in Kubebuilder's documentation; see the sketch after this list).
- Also, I modified the `+kubebuilder:rbac` comment-annotation on the `Reconcile()` method to instruct Kubebuilder to create the RBAC rules needed for running in-cluster.
- Finally, I modified the `Reconcile()` method's body for our use case. Compared to the generated example, I added Mapping status updates. Otherwise, the pattern is very similar.
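Here is a sketch of the modified `add()` function, using controller-runtime v0.1's API (the import path of the generated API package is project-specific and assumed here):

```go
package mapping

import (
	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/controller"
	"sigs.k8s.io/controller-runtime/pkg/handler"
	"sigs.k8s.io/controller-runtime/pkg/manager"
	"sigs.k8s.io/controller-runtime/pkg/reconcile"
	"sigs.k8s.io/controller-runtime/pkg/source"

	// assumed import path for the generated API package
	ambassadorshimv1alpha1 "admiralty.io/ambassador-shim-kubebuilder/pkg/apis/ambassadorshim/v1alpha1"
)

func add(mgr manager.Manager, r reconcile.Reconciler) error {
	c, err := controller.New("mapping-controller", mgr, controller.Options{Reconciler: r})
	if err != nil {
		return err
	}
	// Watch Mappings (the primary resource).
	if err := c.Watch(&source.Kind{Type: &ambassadorshimv1alpha1.Mapping{}}, &handler.EnqueueRequestForObject{}); err != nil {
		return err
	}
	// Watch the dummy Services and enqueue their owner Mapping.
	return c.Watch(&source.Kind{Type: &corev1.Service{}}, &handler.EnqueueRequestForOwner{
		IsController: true,
		OwnerType:    &ambassadorshimv1alpha1.Mapping{},
	})
}
```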
Then, I ran:
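With Kubebuilder v1's scaffolding, that is the generated Makefile's manifests target (a sketch; the exact targets depend on the scaffolding version):

```sh
make manifests
```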
Every time the `MappingSpec` and `MappingStatus` structs change, or `+kubebuilder` comment-annotations are modified, the CRD (the validation part) and RBAC manifests must be regenerated.
See the Quick Start Guide for local and in-cluster deployments, testing and documentation generation instructions.
Here are my comments on this experiment:
- There were a few mismatches between the documentation, samples, and the generated code.
- In my opinion, there are still a few too many responsibilities left to the developer:
  - checking whether objects exist when your use case doesn't require any action if they don't (thanks to Kubernetes garbage collection);
  - setting owner references.
- I did not use the event recorder, finalizers, or webhooks.
Operator SDK
EDIT 2018-10-18: This section has been updated for Operator-SDK v0.0.7. The full source code of this experiment is available on GitHub as ambassador-shim-operator-sdk.
Following the Operator SDK's User Guide, I installed the `operator-sdk` binary. Then, in my organisation folder, I ran:
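Something like this, for v0.0.7 (project name as in the companion repository):

```sh
operator-sdk new ambassador-shim-operator-sdk --api-version=ambassadorshim.admiralty.io/v1alpha1 --kind=Mapping
```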
In one command, I initialized the project and generated code for a resource and its controller, called an "operator" in the context of the Operator SDK. The first difference with Kubebuilder is that one Operator-SDK project only deals with one resource/operator pair (which works well when managing applications).
The next step was to modify the `MappingSpec` and `MappingStatus` structs in `pkg/apis/ambassadorshim/v1alpha1/types.go`, just like with Kubebuilder (see listing above), because both tools use code-generator under the hood. Don't forget to run:
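That is the SDK's code-generation command (as of v0.0.x):

```sh
operator-sdk generate k8s
```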
Finally, I implemented the `Handle()` function in `pkg/stub/handler.go`, with logic similar to the one implemented in the Kubebuilder experiment above, but adapted to the Operator SDK's API:
- No need to `Get()` the Mapping object, as it is included in the `event` argument. It just needs to be type-asserted.
- There's no helper function to set owner references, so I made my own.
- TypeMeta MUST be set in the desired Service object. (Kubebuilder auto-filled it based on the struct type.)
See the User Guide for local and in-cluster deployment instructions.
Here are my comments on this experiment:
- The logic of the reference memcached handler in the User Guide is debatable, e.g. it first tries to `Create()` a child Deployment, even if it already exists, then tries to `Get()` it even if it doesn't exist (`Create()` is asynchronous): that's one call too many either way. I'm hopeful that post-alpha releases will follow best practices; in the meantime, the User Guide does note: "The provided handler implementation is only meant to demonstrate the use of the SDK APIs and is not representative of the best practices of a reconciliation loop."
- The focus on one resource and one control loop can be limiting out of the scope of application provisioning and management.
Metacontroller
EDIT 2018-10-18: This section uses Metacontroller v0.2.0. The full source code of this experiment is available on GitHub as ambassador-shim-metacontroller.
Metacontroller isn't a CLI but a framework that runs in-cluster. I installed it following the User Guide. Metacontroller supports two use cases, which are themselves registered as CRDs (hence the prefix "Meta" in "Metacontroller"):
CompositeController is "designed to facilitate custom controllers whose primary purpose is to manage a set of child objects based on the desired state specified in a parent object. Workload controllers like Deployment and StatefulSet are examples of existing controllers that fit this pattern."
DecoratorController is "designed to facilitate adding new behavior to existing resources. You can define rules for which resources to watch, as well as filters on labels and annotations."
The CompositeController fits our need. It's also used in the User Guide's hello-world example. Here is the manifest of a CompositeController watching our custom resource and associated Services:
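A sketch (the `hooks` syntax follows Metacontroller's documentation of that period; the webhook URL points at the `ambassadorshim-metacontroller` Service described below, whose namespace is an assumption here):

```yaml
apiVersion: metacontroller.k8s.io/v1alpha1
kind: CompositeController
metadata:
  name: ambassadorshim
spec:
  # let Metacontroller generate the labels used to select children
  generateSelector: true
  parentResource:
    apiVersion: ambassadorshim.admiralty.io/v1alpha1
    resource: mappings
  childResources:
  - apiVersion: v1
    resource: services
    updateStrategy:
      method: InPlace
  hooks:
    sync:
      webhook:
        # Service name/namespace of the webhook deployment (namespace assumed)
        url: http://ambassadorshim-metacontroller.default/sync
```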
Metacontroller uses labels to filter child resources on the server side (Kubebuilder and the Operator SDK filter on the client side using owner references). Note that Metacontroller can generate those labels for you, which is convenient.
The CompositeController object defines a webhook that Metacontroller calls to reconcile the parent status and children's desired state from the parent and children's observed state, declaratively. I implemented the webhook as a simple Python 3 HTTPServer (pyyaml is the only requirement, to marshal the MappingSpec into a dummy Service annotation):
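A condensed sketch, assuming Metacontroller's CompositeController sync-hook request/response format (observed `parent` and `children` in, desired `status` and `children` out):

```python
#!/usr/bin/env python3
from http.server import BaseHTTPRequestHandler, HTTPServer
import json
import yaml  # pyyaml, to marshal the MappingSpec into the annotation


class Controller(BaseHTTPRequestHandler):
    def sync(self, parent, children):
        name = parent["metadata"]["name"]
        spec = parent.get("spec", {})

        # Translate the whole MappingSpec into an Ambassador annotation.
        config = yaml.dump({"apiVersion": "ambassador/v0", "kind": "Mapping", "name": name, **spec})

        desired_service = {
            "apiVersion": "v1",
            "kind": "Service",
            "metadata": {
                "name": name + "-ambassadorshim",
                "annotations": {"getambassador.io/config": config},
            },
            # dummy spec: Ambassador only reads the annotation
            "spec": {"ports": [{"port": 80}]},
        }

        # Report whether the dummy Service has been observed yet.
        status = {"configured": len(children.get("Service.v1", {})) == 1}

        return {"status": status, "children": [desired_service]}

    def do_POST(self):
        observed = json.loads(self.rfile.read(int(self.headers["Content-Length"])))
        desired = self.sync(observed["parent"], observed["children"])
        self.send_response(200)
        self.end_headers()
        self.wfile.write(json.dumps(desired).encode())


HTTPServer(("", 80), Controller).serve_forever()
```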
I packaged the webhook as a very simple Docker container managed by a Deployment and exposed by a Service (`ambassadorshim-metacontroller`).
Here are my comments on this experiment:
- If Metacontroller supports your use case via either a CompositeController or a DecoratorController, your responsibility is strictly limited to your business logic: no need to check whether the parent exists, or to set owner references, etc. Just beware that you cannot opt out: e.g., children of a CompositeController are always garbage-collected.
- Metacontroller's API is declarative, which makes it easy to reason about. To that end, it provides higher-level abstractions like update strategies (OnDelete, Recreate, InPlace, RollingRecreate, RollingInPlace).
- Because webhooks only need to accept and return JSON, it is possible to use dynamically typed languages like Python and JavaScript, which are great for rapid development. An added benefit in our case is that the Ambassador shim translates all of a Mapping's specification to annotations out-of-the-box, not just the required fields I cared to support in the Go implementations (Prefix and Service). However, the absence of static typing by default could be error-prone in more complex situations, or require additional tooling.
- Metacontroller is not designed to watch external APIs, whereas Kubebuilder can watch any Go channel.
Discussion
Kubebuilder and the Operator SDK are quite similar: both are only intended for Go developers; both rely on code-generator to some extent; both provide a CLI to set up a new project, regenerate code, build binaries, images, manifests, etc.; both implement the controller pattern as a library. Their APIs and implementation details, however, differ. Kubebuilder uses code-generator more extensively, to generate typed clients, informers and listers for custom resources, whereas the Operator SDK relies instead on client-go's discovery and REST mapping features. The Operator SDK encapsulates existing abstractions and provides a new, simple API, whereas Kubebuilder conveniently injects existing abstractions, which is fine if you're already familiar with them (from client-go), but can otherwise be daunting.[EDIT (2018-10-18): Kubebuilder v1 uses controller-runtime, which uses client-go's discovery and REST mapping features, and whose API resembles that of the Operator SDK.] Kubebuilder can also help generate tests and documentation, supports multiple resources and controllers per project, and makes full use of its cache to limit kube-apiserver calls. On the other hand, in the context of application operators, the Operator SDK could potentially develop synergies with the rest of the Operator Framework (Operator Lifecycle Manager and Operator Metering).
The third contender, Metacontroller, takes a radically different approach: it's an actual framework (as opposed to generated code), where controllers are custom resources themselves, controlled by the Metacontroller, which delegates reconciliation to webhooks; "all you need to provide is a webhook that understands JSON, you can use any programming language." Using Metacontroller, my Ambassador Mapping controller consists of 50 lines of Python (for the webhook), 30 lines of Yaml (for the Mapping CRD and CompositeController) and that's it; no generated boilerplate whatsoever. Metacontroller isn't as flexible as Kubebuilder or the Operator SDK, but it does support most use cases.
Developing custom resources and controllers has definitely gotten easier over the past year, thanks to the tools discussed above. However, Kubernetes platform development is still a fragmented and rapidly evolving landscape. There was a proposal to create a new SIG to oversee the development of standard tools. It has been rejected for now by the steering committee, but the SIG API Machinery is taking Kubebuilder under its wing and has started refactoring[EDIT (2018-10-18): refactored] it into controller-runtime and controller-tools.
Conclusions
- The Kubernetes API can be extended with custom resources at runtime either via CustomResourceDefinitions (CRDs) or aggregated APIs. CRDs are the easiest, if your use case is compatible.
- The behavior of custom resources (just like Kubernetes resources) is best implemented using the controller pattern, which is declarative, asynchronous, and level-based. client-go provides building blocks for controllers (cache and work queue). Those aren't yet available in other languages' client libraries.
- The building blocks are put together in the workqueue example, sample-controller (for CRDs), and sample-apiserver (for aggregated APIs). The latter two can be forked or copied and modified with the help of code-generator.
- Better tools have subsequently been released to make developers more productive (see table in TL;DR for a recap of pros and cons):
  - Kubebuilder (following apiserver-builder) facilitates project creation and code generation in Go, and injects[EDIT (2018-10-18): encapsulates] client-go's APIs;
  - The Operator SDK also handles project creation and code generation in Go, but[EDIT (2018-10-18): and] encapsulates client-go's APIs.
  - Metacontroller is a framework. It is a black box that runs separately and calls processing functions as webhooks written in any language, accepting and returning JSON.
- The SIG API Machinery has taken it upon itself to further improve the Kubernetes platform development experience. A first step is to extract and standardize the controller pattern. It could still be used in various ways in different tools, or a standard platform SDK may replace the current options; Kubebuilder, which is now owned by the SIG, is a likely contender.
Ambassador can be configured with CRDs even if the main project (which is written in Python) doesn't support them yet. The solution is to deploy shim controllers that translate Mapping objects into dummy Services with corresponding annotations. Other CRDs would have to be implemented for a full solution, but hopefully the main project will soon make a shim unnecessary.[EDIT (2019-05-20): Problem solved in Ambassador 0.70.]