Introducing the Istio Operator for Kubernetes · Banzai Cloud


The service mesh is undoubtedly one of the hottest and most hyped topics recently. It may seem that everywhere you turn you can hear heated arguments between developers who think that the service mesh can outgrow even Kubernetes itself and naysayers who are truly convinced that outside a few very large companies it doesn’t even make sense. As always, the truth probably lies somewhere in the middle, but you certainly can’t avoid the question, especially if you’re a Kubernetes distribution and platform provider like us.

Including Istio in the Pipeline platform has been one of the most requested features of the last few months, so there was no question that we would need to put great effort into enabling it. While working hard on providing a great user experience, we realized that even managing Istio on a Kubernetes cluster can be quite challenging. To overcome these challenges, we built an operator that encapsulates the management of the various components of an Istio service mesh.

We quickly recognized that we were not the only ones struggling with these kinds of problems (see, for example, #9333 on GitHub), so we decided to open source our work.

Now we are very excited to announce the alpha release of our Istio operator, and we hope that a great community can help drive the innovation in the future.

Motivation

Currently Istio is by far the most mature service mesh solution on Kubernetes (we really like the concept of Linkerd2, and hope that it can grow well, but it’s not there yet), so it was clear that it is the service mesh we want to support in Pipeline first.

By creating the operator, our main goal was to simplify the deployment and management of the Istio components. The first release can replace the Helm chart as the way to install Istio, while providing a few additional features as well.

Istio has a great Helm chart, but we can think of several reasons why an operator could be better:

  • Helm is “only” a package manager: it cares about the components of Istio only until they are first deployed to a cluster. An operator continuously reconciles the state of the components and keeps them healthy.
  • The Helm chart is very complex and hard to read; we prefer describing everything as code instead of YAML templates.
  • Several small convenience features can be implemented that otherwise need to be done manually (like labelling the namespaces where we want sidecar auto-injection).
  • Some complex Istio features - like multi-cluster or canary releases - can be simplified by higher level concepts.

Current features

This is only an alpha release, so the feature set is quite limited, but it can do a few handy things already:

1. Manage the Istio components - no Helm needed

The project follows the convention of a standard operator and has a corresponding custom resource definition that describes the desired state of the Istio deployment. It contains all the configuration values, and if one or more of these is changed, the operator automatically reconciles the state of the components to match the new desired state. The same reconciliation happens if the current state of the components changes for some reason - be it an accidental deletion of a necessary resource, or a change in the cluster infrastructure.
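The reconciliation idea described above can be illustrated with a small, self-contained Python sketch. It is not the operator's actual Go code: the `Cluster` class, the `reconcile` function and the component names are hypothetical stand-ins, and a plain dictionary plays the role of the cluster state.

```python
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """Toy stand-in for a cluster: component name -> deployed config."""
    components: dict = field(default_factory=dict)


def reconcile(desired: dict, cluster: Cluster) -> list:
    """Make the cluster match `desired` and return the actions taken."""
    actions = []
    for name, config in desired.items():
        if name not in cluster.components:
            cluster.components[name] = dict(config)  # missing -> create
            actions.append(f"create {name}")
        elif cluster.components[name] != config:
            cluster.components[name] = dict(config)  # drifted -> update
            actions.append(f"update {name}")
    for name in list(cluster.components):
        if name not in desired:
            del cluster.components[name]             # unwanted -> delete
            actions.append(f"delete {name}")
    return actions


desired = {"pilot": {"mtls": False}, "citadel": {"mtls": False}}
cluster = Cluster()
print(reconcile(desired, cluster))   # first run creates everything
del cluster.components["citadel"]    # simulate an accidental deletion
print(reconcile(desired, cluster))   # the missing component is recreated
desired["pilot"]["mtls"] = True      # change the desired state
print(reconcile(desired, cluster))   # the changed component is updated
```

Running the same function after every change, whether the change came from the configuration or from the cluster, is what makes the loop converge: a second call with nothing changed returns an empty action list.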

2. Multi-cluster federation

Multi-cluster federation functions by enabling Kubernetes control planes running a remote configuration to connect to one Istio control plane. Once one or more remote Kubernetes clusters are connected to the Istio control plane, Envoy can then communicate with the single Istio control plane and form a mesh network across multiple Kubernetes clusters.

The operator takes care of deploying Istio components to the remote clusters and also runs a continuous sync mechanism that keeps Istio’s central components reachable from the remote clusters. The DNS entries are automatically updated if a pod restart or some other failure happens on the Istio control plane.

The main requirements for multi-cluster federation are that the pod CIDRs of all clusters must be unique and routable to each other, and that the API servers must also be routable to each other. We want to further enhance this feature in future releases, and we’ll also write a detailed blog post soon. Until then, if you’re interested and want to try it out, you can find an example here.
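The uniqueness part of the pod CIDR requirement can be checked before joining clusters. The following is a minimal Python sketch using only the standard library; the function name, the cluster names and the CIDR values are made up for illustration, and the check says nothing about whether the networks are actually routable to each other.

```python
from ipaddress import ip_network
from itertools import combinations


def overlapping_pod_cidrs(cidrs_by_cluster: dict) -> list:
    """Return the pairs of clusters whose pod CIDRs overlap."""
    networks = {name: ip_network(cidr) for name, cidr in cidrs_by_cluster.items()}
    return [
        (a, b)
        for (a, net_a), (b, net_b) in combinations(networks.items(), 2)
        if net_a.overlaps(net_b)
    ]


# Hypothetical clusters: "remote-2" sits inside the range used by "central".
clusters = {
    "central": "10.4.0.0/14",
    "remote-1": "10.8.0.0/14",
    "remote-2": "10.6.0.0/16",
}
print(overlapping_pod_cidrs(clusters))  # [('central', 'remote-2')]
```

An empty result means the pod ranges are at least disjoint; routability between the clusters still has to be verified on the network itself.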

3. Manage automatic sidecar-injection

The Istio configuration keeps track of namespaces where auto sidecar-injection is enabled, so you’ll be able to manage it from one central place instead of labelling namespaces one-by-one. The injection label is also reconciled, so if you take a namespace out of the list, auto-injection will be turned off.
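How a central namespace list can drive the injection label is sketched below in Python. `istio-injection=enabled` is the label Istio's sidecar injector looks for; everything else (the function name, the plan format, and the choice to remove the label rather than set it to another value) is an assumption made for the example, not the operator's actual behaviour.

```python
INJECTION_LABEL = "istio-injection"


def plan_injection_labels(configured: set, namespace_labels: dict) -> dict:
    """Return {namespace: "add" | "remove"} for namespaces that need a change.

    configured:       namespaces listed for auto-injection in the config
    namespace_labels: current labels of every namespace in the cluster
    """
    plan = {}
    for namespace, labels in namespace_labels.items():
        enabled = labels.get(INJECTION_LABEL) == "enabled"
        if namespace in configured and not enabled:
            plan[namespace] = "add"
        elif namespace not in configured and enabled:
            plan[namespace] = "remove"
    return plan


current = {
    "default": {},
    "legacy": {"istio-injection": "enabled"},
    "kube-system": {},
}
# Only "default" is listed, so "legacy" loses its label.
print(plan_injection_labels({"default"}, current))
```

Because the plan is computed from the difference between the list and the live labels, taking a namespace out of the list is enough to have its label cleaned up on the next pass.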

4. mTLS and control plane security

You can simply turn these on/off in the configuration while your mesh is already in place. This type of configuration change is one of the best features of an operator - there’s no need to reinstall the charts and/or manually delete and reconfigure internal Istio custom resources like MeshPolicies or default DestinationRules. Just rewrite the config and the operator will take care of the rest.

Try it out!

The operator installs Istio 1.0.5. It requires kubectl 1.13.0 and runs on Minikube v0.33.1+ and Kubernetes 1.10.0+.

Of course you’ll need a Kubernetes cluster first - you can create one using Pipeline in your datacenter or in one of the 6 cloud providers we support.

To try out the operator, point KUBECONFIG to your cluster and simply run this make goal from the project root.

git clone git@github.com:banzaicloud/istio-operator.git
cd istio-operator
make deploy

This command will install the custom resource definition in the cluster and will deploy the operator in the istio-system namespace. Following the pattern of operators, you can specify your Istio configurations in a Kubernetes custom resource. Once you apply it to your cluster, the operator will start reconciling all the Istio components.

cat <<EOF | kubectl apply -n istio-system -f -
apiVersion: istio.banzaicloud.io/v1beta1
kind: Istio
metadata:
  labels:
    controller-tools.k8s.io: "1.0"
  name: istio
spec:
  mtls: false
  includeIPRanges: "*"
  excludeIPRanges: ""
  autoInjectionNamespaces:
  - "default"
  controlPlaneSecurityEnabled: false
EOF

After some time you should see that the Istio pods are running:

kubectl get pods -n istio-system --watch
NAME                                      READY   STATUS    RESTARTS   AGE
istio-citadel-7fcc8fddbb-2jwjk            1/1     Running   0          4h
istio-egressgateway-77b7457955-pqzt8      1/1     Running   0          4h
istio-galley-94fc98cd9-wcl92              1/1     Running   0          4h
istio-ingressgateway-794976d866-mjqb7     1/1     Running   0          4h
istio-operator-controller-manager-0       2/2     Running   0          4h
istio-pilot-6f988ff756-4r6tg              2/2     Running   0          4h
istio-policy-6f947595c6-bz5zj             2/2     Running   0          4h
istio-sidecar-injector-68fdf88c87-zd5hq   1/1     Running   0          4h
istio-telemetry-7b774d4669-jrj68          2/2     Running   0          4h

And that the Istio custom resource is showing Available in its status field:

$ kubectl describe istio -n istio-system istio
Name:         istio
Namespace:    istio-system
Labels:       controller-tools.k8s.io=1.0
Annotations:  <none>
API Version:  istio.banzaicloud.io/v1beta1
Kind:         Istio
Metadata:
  Creation Timestamp:  2019-02-26T12:23:54Z
  Finalizers:
    istio-operator.finializer.banzaicloud.io
  Generation:        2
  Resource Version:  103778
  Self Link:         /apis/istio.banzaicloud.io/v1beta1/namespaces/istio-system/istios/istio
  UID:               62042df2-39c1-11e9-9464-42010a9a012e
Spec:
  Auto Injection Namespaces:
    default
  Control Plane Security Enabled:  true
  Include IP Ranges:               *
  Mtls:                            true
Status:
  Error Message:
  Status:         Available
Events:           <none>

Contributing

If you’re interested in the project we are very happy to accept contributions.

  • You can open an issue describing a feature request or a bug
  • You can send a pull request - we’ll do our best to review and accept it as quickly as possible
  • You can help new users with issues they may encounter
  • Or just star the repo to support the development of this project

Read on if you’re interested in contributing to the code, otherwise feel free to skip to the Roadmap section.

Building and testing the project

The operator is built using the Kubebuilder project. It follows the standards described in Kubebuilder’s documentation (the Kubebuilder book), but some default make goals and build commands have been changed.

To build the operator and run the tests:

make vendor # runs `dep` to fetch the dependencies in the `vendor` folder
make # generates code if needed, builds the project and runs the tests

If you’d like to try your changes in a cluster, create your own image, push it to Docker Hub and deploy it to your cluster:

make docker-build IMG={YOUR_USERNAME}/istio-operator:v0.0.1
make docker-push IMG={YOUR_USERNAME}/istio-operator:v0.0.1
make deploy IMG={YOUR_USERNAME}/istio-operator:v0.0.1

To watch the operator’s logs, use:

kubectl logs -f -n istio-system $(kubectl get pod -l control-plane=controller-manager -n istio-system -o jsonpath='{.items..metadata.name}') manager

To let the operator set up Istio in your cluster, you should create the custom resource describing the desired configuration. There is a sample custom resource under config/samples:

kubectl create -n istio-system -f config/samples/istio_v1beta1_istio.yaml

Roadmap

Please note that the Istio operator is under heavy development and new releases might introduce breaking changes. We are striving to keep backward compatibility as much as possible while adding new features at a fast pace. Issues, new features or bugs are tracked on the project’s GitHub page - feel free to add yours!

Some of the significant features and future items from the roadmap:

1. Integration with Prometheus, Grafana and Jaeger

The Istio Helm chart can install and configure these components, but we believe that managing them should not be Istio’s responsibility; instead, Istio should provide easy ways to integrate with external deployments. In a real-world scenario, developers usually already have their own way of managing these components, like the Prometheus operator configured with other rules and targets. In our case, Pipeline installs and automates the configuration of all these components during cluster creation or deployments.

2. Manage missing components like Servicegraph or Kiali

The alpha version only installs the core components of Istio, but we want to support add-ons like Servicegraph and Kiali. Are there any other components you would like to see supported?

3. Support for Istio 1.1 and easy upgrades

Istio 1.1 is coming soon and will contain some major changes. We want to support the new version as soon as possible and make it easy to upgrade from the current 1.0.5 and 1.0.6 versions. Upgrading to a new Istio version currently involves manual steps, like replacing all the old sidecars by re-injecting them. We are working to add support for automatic and seamless Istio version upgrades.

4. Adding a “Canary release” feature

Canary releases are one of the most used features of Istio, but doing one is quite complex and repetitive. Handling traffic routing in Istio custom resources and tracking errors through monitoring is something that can be automated. We’d like to create a higher-level Canary resource where the configuration can be described, so that the operator can manage the whole canary lifecycle based on it.

5. Providing enhanced multi-cluster federation

The current multi-cluster support in the operator already contains a lot of simplifications compared to the “Helm-way” of doing it, but has some hard requirements like having a flat network where pod IPs are routable from one cluster to another. We are working to add gateway federation to the Istio operator.

6. Security improvements

Currently the operator needs full admin permissions in a Kubernetes cluster to work properly; the corresponding role is generated by make deploy. The operator creates a bunch of additional roles for Istio, so because of privilege escalation prevention, the operator needs to hold all the permissions contained in those roles or, from Kubernetes 1.12, it can be granted the escalate permission for roles and clusterroles.
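The escalation rule mentioned above boils down to a subset check, which the following Python sketch illustrates. It is a deliberate simplification and not how the Kubernetes API server is implemented: permissions are modelled as plain `(api_group, resource, verb)` tuples, wildcards, resource names and namespace scoping are ignored, and the function name is hypothetical.

```python
def may_create_role(held: set, role_rules: set, has_escalate: bool = False) -> bool:
    """Simplified RBAC escalation check.

    A subject may create a role if it already holds every permission in
    that role, or if it has been granted the `escalate` verb on roles.
    """
    return has_escalate or role_rules <= held


# Permissions as (api_group, resource, verb); "" is the core API group.
operator_permissions = {("", "pods", "get"), ("", "pods", "list")}
istio_role = {("", "pods", "get"), ("", "secrets", "get")}

print(may_create_role(operator_permissions, istio_role))                     # False
print(may_create_role(operator_permissions, istio_role, has_escalate=True))  # True
```

This is why the operator either has to carry the union of all permissions in the roles it creates, or rely on the escalate permission instead.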

Another security problem is that Istio proxies are running in privileged mode. We’ll put effort into getting rid of this requirement soon.

Banzai Cloud’s Pipeline provides a platform which allows enterprises to develop, deploy and scale container-based applications. It leverages best-of-breed cloud components, such as Kubernetes, to create a highly productive, yet flexible environment for developers and operations teams alike. Strong security measures—multiple authentication backends, fine-grained authorization, dynamic secret management, automated secure communications between components using TLS, vulnerability scans, static code analysis, CI/CD, etc.—are a tier zero feature of the Pipeline platform, which we strive to automate and enable for all enterprises.