Hybrid and multicloud architectures are becoming prevalent, either as a strategy or simply as a result of history and/or mergers and acquisitions. Luckily, to help reduce the inherent complexity, Kubernetes is standardizing the way clouds are operated: the same workflow can be used to manage resources in any cloud, whether public or private. However, managing workloads across clouds is still a challenge. Technically, you could create a single Kubernetes cluster encompassing your entire infrastructure, but that could invalidate some of the assumptions made in the design of Kubernetes itself. Also, you would miss out on turn-key Kubernetes distributions. A more common approach is to operate multiple clusters.
Clusters are isolated from each other by default, which helps with:
- fault isolation;
- trust boundaries;
- only paying for a top-tier service level in production;
- enforcing geographical regulations;
However, cluster boundaries get in the way when you’d like to manage the following globally:
- scheduling and autoscaling (ensuring high availability and low latency at the lowest cost);
- service discovery;
- backups and migrations;
- policy enforcement;
We need tools to manage resources across clusters. Specific solutions exist. Notably, federation-v2 can sync workloads and route traffic across clusters. To do so, it uses the concepts of Templates, Placements, and Overrides, propagating resources with a push reconciler.
While building a multicluster scheduler at Admiralty (stay tuned), we needed a lower-level abstraction: namely the controller pattern (sometimes called the operator pattern), but for resources in multiple clusters. We needed a tool like the Operator SDK or Kubebuilder (see comparison in a previous blog post), but supporting multiple clusters. Unfortunately, their designs don’t allow that. Their APIs would have to change significantly. So, rather than submit a pull request, we decided to make our own tool. Luckily, we were able to leverage parts of controller-runtime, the library powering Kubebuilder and now also the Operator SDK.
Today, we’re open-sourcing multicluster-controller. Check out the README for more details on how it works, including how it can be used with custom resources (using CRDs). We’ve also included a few examples. We hope that the community will find the project useful. (Anyone volunteering to build a multicluster Prometheus operator?)
Warning: though we’re already using multicluster-controller internally with great success, the project is still in its infancy and the API may break in future releases. Also, a few must-have features are still in the works:
- cross-cluster garbage collection;
- integration with cluster-registry;
- service accounts and RBAC generators;
- validating and mutating webhooks;
- leader election;
- more tests.