How To Adjust Size Of A Kubernetes Cluster Using Cluster Autoscaler


Spawning an AWS EKS cluster has never been easier, and the options are many: CloudFormation, Terraform or CDK. For the lazy, you can even use the great CLI utility eksctl from Weaveworks.

But once your cluster is set up, how do you ensure the number of nodes scales properly? In this post, we will set up a new cluster from scratch using CDK and take a look at the Cluster Autoscaler component.

⚙️ AWS Setup

First you will need an AWS account. You can create a new account on the sign-in page or reuse an existing account. Be aware that:

  • EKS is not included in the AWS free tier, meaning you will be billed for the managed Kubernetes control plane.
  • You can use t3.micro instances for free, with limitations.

Choosing the node instance type

For an EKS setup, you certainly don’t want to use these free t3.micro instances:

  • t2/t3 instances are capped in terms of CPU use
  • the default AWS CNI plugin assigns one IP from the subnet to each pod, meaning the number of pods running on a node is limited by the networking capacity of the node.

For example, a t3.micro instance supports up to 4 pods (2 ENIs * (2 IPs - 1) + 2), which is simply too limited. You can find the number of supported pods by instance type here.

To keep it simple and cheap, we will use t3.medium instances, which support up to 17 pods per node. They are not free but quite inexpensive.
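
The pod limit arithmetic above can be sketched in a few lines of TypeScript. This is only an illustration of the formula quoted in the text. The `EniLimits` type and `maxPods` function are hypothetical names, and the t3.medium limits used here (3 ENIs with 6 IPs each) are not stated in this article; they are the values that reproduce the 17-pod figure.

```typescript
// Max pods per node with the default AWS CNI plugin, using the formula
// from the text: ENIs * (IPs per ENI - 1) + 2.
interface EniLimits {
  enis: number;      // network interfaces supported by the instance type
  ipsPerEni: number; // IPv4 addresses per network interface
}

function maxPods({ enis, ipsPerEni }: EniLimits): number {
  // The primary IP of each ENI is not handed out to pods, hence the "- 1".
  return enis * (ipsPerEni - 1) + 2;
}

// Values for t3.micro come from the text; t3.medium values are assumed.
const t3Micro: EniLimits = { enis: 2, ipsPerEni: 2 };
const t3Medium: EniLimits = { enis: 3, ipsPerEni: 6 };

console.log(`t3.micro: ${maxPods(t3Micro)} pods`);
console.log(`t3.medium: ${maxPods(t3Medium)} pods`);
```

Plugging in the limits of any other instance type gives a quick sanity check before choosing it for a node group.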

By the way, if you want to bypass the pod networking limitation, you need to opt out of the AWS CNI plugin and use another plugin, for example Calico.

Create an admin user

Operating infrastructure with CDK requires extensive rights. To keep it simple, create a new user and grant it the managed AdministratorAccess policy. In real life you may want to create a custom IAM policy tailored to your needs.

Install CDK

Follow the official documentation to get started with CDK.

📝 Write the CDK stack

Let’s begin by creating a dedicated VPC and a basic cluster:

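// CDK v1 module paths; the code below goes inside your Stack constructor
import * as ec2 from "@aws-cdk/aws-ec2";
import * as eks from "@aws-cdk/aws-eks";

const vpc = new ec2.Vpc(this, "eks-vpc");
const cluster = new eks.Cluster(this, "Cluster", {
  vpc: vpc,
  defaultCapacity: 0, // we want to manage capacity ourselves
  version: eks.KubernetesVersion.V1_17,
});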

There are still no compute resources allocated. Let’s add a default managed node group for this cluster using between 1 and 3 instances:

const ng = cluster.addNodegroup("nodegroup", {
  instanceType: new ec2.InstanceType("t3.medium"),
  minSize: 1,
  maxSize: 3,
});

Managed node groups are automatically kept up to date by AWS and are created in an Autoscaling Group.

🎬 Run the stack

First you will need to bootstrap the CDK tool (cdk bootstrap), which creates a bucket for the CloudFormation manifests, then deploy the stack (cdk deploy). Grab a ☕️ as it will take more than 10 minutes.

🔑 Get your Kubernetes credentials

You should get back some outputs, including the AWS CLI command to generate your kubectl configuration:

CdkStack.testclusterClusterConfigCommand1851F735 = aws eks update-kubeconfig --name testclusterCluster00507BD3-639846f8ec5241a69f54eabd38c730a0 --region us-east-1 --role-arn arn:aws:iam::xxx:role/CdkStack-testclusterClusterMastersRoleAAD0ED84-DR14A5TYS195

Note that a master role has been created for you. Run this command to generate your kubectl configuration.

Then validate you can access your new cluster with a simple command:

kubectl get no
NAME STATUS ROLES AGE VERSION
ip-10-0-165-141.ec2.internal Ready <none> 48m v1.17.9-eks-4c6976

We now have a working cluster with one node, as expected.

⚖️ Autoscale the cluster

Our nodes are part of an Autoscaling Group, which allows us to easily change the number of running nodes between 1 and 3.

Of course, we want to automate this scaling depending on the load. One classic way of doing this is to add some automatic scaling rules based on CPU, RAM or other custom metrics.

But let’s consider a basic use case:

  1. You want to run a pod with high memory or CPU requirements
  2. Your current nodes are underutilized so there’s no memory or CPU pressure to trigger your Autoscaling Group rules
  3. You submit the pod manifest
  4. The scheduler cannot find a node large enough to accommodate the pod, even though there is plenty of capacity overall
  5. Your pod is left unscheduled!
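
The scenario above can be illustrated with a toy model. The sketch below is not how the Kubernetes scheduler is implemented; it only checks free CPU and memory per node. The `NodeFree`, `PodRequest` and `findNode` names and all the numbers are made up for the example.

```typescript
// Toy model: a pod must fit entirely on ONE node, so spare capacity spread
// across several nodes does not help a large pod.
interface NodeFree {
  name: string;
  freeCpuMillis: number;
  freeMemMiB: number;
}

interface PodRequest {
  cpuMillis: number;
  memMiB: number;
}

// Returns the first node with enough free CPU and memory, if any.
function findNode(nodes: NodeFree[], pod: PodRequest): NodeFree | undefined {
  return nodes.find(
    (n) => n.freeCpuMillis >= pod.cpuMillis && n.freeMemMiB >= pod.memMiB
  );
}

// Two half-empty nodes (hypothetical values).
const nodes: NodeFree[] = [
  { name: "node-a", freeCpuMillis: 1200, freeMemMiB: 2048 },
  { name: "node-b", freeCpuMillis: 1200, freeMemMiB: 2048 },
];

// A pod asking for more CPU than any single node has left.
const bigPod: PodRequest = { cpuMillis: 1500, memMiB: 1024 };

const totalFreeCpu = nodes.reduce((sum, n) => sum + n.freeCpuMillis, 0);
const target = findNode(nodes, bigPod);

console.log(`Total free CPU: ${totalFreeCpu}m, requested: ${bigPod.cpuMillis}m`);
console.log(target ? `Scheduled on ${target.name}` : "Pod is left unscheduled");
```

The nodes are far from saturated, so a CPU-based scaling rule would never fire, yet the pod still has nowhere to go.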

To solve such a use case, we need a more Kubernetes-specific approach to autoscaling that can detect unscheduled pods. Enter the Cluster Autoscaler.

The Cluster Autoscaler detects pods that cannot be scheduled because of resource constraints and increases the node count accordingly. Conversely, when nodes are underutilized, pods are rescheduled on other nodes and the node count is decreased.
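
To make the behavior described above concrete, here is a deliberately simplified decision function. It is a sketch, not the Cluster Autoscaler's real algorithm, which also simulates scheduling, checks that pods can actually move elsewhere and waits before removing nodes. The `GroupState` type, the `decide` function and the 0.5 utilization threshold are assumptions made for this example.

```typescript
type Decision = "scale-up" | "scale-down" | "no-change";

interface GroupState {
  nodeCount: number;
  minSize: number;
  maxSize: number;
  pendingPods: number;       // pods unschedulable for lack of resources
  nodeUtilization: number[]; // requested / allocatable per node, from 0 to 1
}

// Simplified rule set: grow when pods are pending, shrink when a node is
// mostly idle, always staying within the node group bounds.
function decide(state: GroupState, scaleDownThreshold = 0.5): Decision {
  if (state.pendingPods > 0 && state.nodeCount < state.maxSize) {
    return "scale-up";
  }
  const hasIdleNode = state.nodeUtilization.some((u) => u < scaleDownThreshold);
  if (state.pendingPods === 0 && hasIdleNode && state.nodeCount > state.minSize) {
    return "scale-down";
  }
  return "no-change";
}

// One busy node and pending pods in a 1..3 node group.
console.log(
  decide({ nodeCount: 1, minSize: 1, maxSize: 3, pendingPods: 2, nodeUtilization: [0.9] })
);
```

Note that the trigger for scaling up is the presence of pending pods, not a CPU or memory metric, which is exactly what the Autoscaling Group rules alone cannot see.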

🛠 Installing the Cluster Autoscaler

Because we want the Cluster Autoscaler to be a fundamental part of our cluster, we can add it to our CDK stack.

Let’s add some code. The main steps are:

  1. Create a policy allowing the Cluster Autoscaler to do its job
  2. Attach this policy to the managed node group role
  3. Tag the nodes properly to allow the autoscaler to auto-discover them
  4. Install the Cluster Autoscaler manifest

Another option would have been to use Helm to install the autoscaler manifests, but it’s simpler this way as everything is in the same place.

enableAutoscaling(cluster: eks.Cluster, ng: eks.Nodegroup, version: string = "v1.17.3") {
  // 1. Policy allowing the Cluster Autoscaler to inspect and resize the Autoscaling Group
  const autoscalerStmt = new iam.PolicyStatement();
  autoscalerStmt.addResources("*");
  autoscalerStmt.addActions(
    "autoscaling:DescribeAutoScalingGroups",
    "autoscaling:DescribeAutoScalingInstances",
    "autoscaling:DescribeLaunchConfigurations",
    "autoscaling:DescribeTags",
    "autoscaling:SetDesiredCapacity",
    "autoscaling:TerminateInstanceInAutoScalingGroup",
    "ec2:DescribeLaunchTemplateVersions"
  );
  const autoscalerPolicy = new iam.Policy(this, "cluster-autoscaler-policy", {
    policyName: "ClusterAutoscalerPolicy",
    statements: [autoscalerStmt],
  });

  // 2. Attach the policy to the managed node group role
  autoscalerPolicy.attachToRole(ng.role);

  // 3. Tag the node group so the autoscaler can auto-discover it
  const clusterName = new CfnJson(this, "clusterName", {
    value: cluster.clusterName,
  });
  cdk.Tag.add(ng, `k8s.io/cluster-autoscaler/${clusterName}`, "owned", { applyToLaunchedInstances: true });
  cdk.Tag.add(ng, "k8s.io/cluster-autoscaler/enabled", "true", { applyToLaunchedInstances: true });

  // 4. Install the Cluster Autoscaler manifest
  new eks.KubernetesManifest(this, "cluster-autoscaler", {
    cluster,
    manifest: [
      { apiVersion: "v1", kind: "ServiceAccount", ... }
    ], // full code is available here: https://gist.github.com/esys/bb7bbeb44565f85f48b3112a8d73a092
  });
}

Note: we use the CfnJson object to wrap the clusterName because a CDK token (a value that is not yet resolved) cannot be used as a key in tags. By doing this, the tag operation is delayed until the clusterName is resolved.

Update the CDK stack again with cdk deploy and check everything is installed correctly:

kubectl -n kube-system get all -l app=cluster-autoscaler
NAME READY STATUS RESTARTS AGE
pod/cluster-autoscaler-57dfd566f9-27j29 1/1 Running 0 70m
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/cluster-autoscaler 1/1 1 1 103m
NAME DESIRED CURRENT READY AGE
replicaset.apps/cluster-autoscaler-57dfd566f9 1 1 1 103m

🏋🏼‍♂️ Testing the autoscaling capability

First create a Deployment manifest for the nginx image. Note that we ask Kubernetes to guarantee a comfortable amount of CPU and RAM for each pod (see the limits/requests section).

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-scale
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        service: nginx
        app: nginx
    spec:
      containers:
        - image: nginx
          name: nginx-scale
          resources:
            limits:
              cpu: 500m
              memory: 512Mi
            requests:
              cpu: 500m
              memory: 512Mi

Apply it to your cluster. You should have exactly one replica of nginx running:

kubectl apply -f nginx-deploy.yaml
kubectl get po -l app=nginx
NAME READY STATUS RESTARTS AGE
nginx-scale-69644568d9-t995l 1/1 Running 0 68s

Let’s scale up the number of replicas and watch what happens:

kubectl scale deploy nginx-scale --replicas=5
kubectl get po -l app=nginx -w
NAME READY STATUS RESTARTS AGE
nginx-scale-69644568d9-ng7nj 1/1 Running 0 27s
nginx-scale-69644568d9-pv62h 0/1 Pending 0 27s
nginx-scale-69644568d9-sng2v 0/1 Pending 0 27s
nginx-scale-69644568d9-t995l 1/1 Running 0 6m42s
nginx-scale-69644568d9-xfwkf 1/1 Running 0 27s

Of the 5 requested replicas, 2 stay pending. If you describe the pending pods, you should see there is not enough CPU:

kubectl describe po nginx-scale-69644568d9-pv62h
...
0/1 nodes are available: 1 Insufficient cpu.
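
A back-of-the-envelope calculation shows why only three replicas fit on the single node. The figures below are assumptions for illustration, not values taken from this cluster: roughly 1930m of allocatable CPU on a t3.medium (2 vCPUs) and about 300m already requested by system pods. The `replicasThatFit` helper is a hypothetical name.

```typescript
// How many pods with a given CPU request fit in the CPU left on a node.
function replicasThatFit(
  allocatableCpuMillis: number,
  systemCpuMillis: number,
  requestCpuMillis: number
): number {
  const freeCpu = Math.max(0, allocatableCpuMillis - systemCpuMillis);
  return Math.floor(freeCpu / requestCpuMillis);
}

const requestedReplicas = 5;
const cpuRequestMillis = 500; // from the Deployment manifest

// Assumed node figures (see the lead-in); adjust to your own cluster.
const fitOnNode = replicasThatFit(1930, 300, cpuRequestMillis);
const pendingReplicas = Math.max(0, requestedReplicas - fitOnNode);

console.log(`${fitOnNode} replicas fit, ${pendingReplicas} stay pending`);
```

With these assumed numbers the result matches the kubectl output: three pods running and two pending until a second node joins.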

If you look in the Cluster Autoscaler logs, you can observe that scaling up to 2 instances should be in progress:

kubectl -n kube-system logs deployment/cluster-autoscaler | grep Scale-up
I0904 13:43:55.584100 1 scale_up.go:700] Scale-up: setting group eks-4eba2c80-0b01-9821-207d-572d47bedd1a size to 2
...

Pending pods should soon be running and the node count is appropriately scaled to 2 🎉

kubectl get no
NAME STATUS ROLES AGE VERSION
ip-10-0-165-141.ec2.internal Ready <none> 107m v1.17.9-eks-4c6976
ip-10-0-228-170.ec2.internal Ready <none> 7m19s v1.17.9-eks-4c6976

The reverse should also be true: if you scale down the number of replicas, your cluster should shrink after a few minutes.

🧹 Cleaning up

Simply destroy the CDK stack by running the cdk destroy command. Note that the CDK bootstrapping resources have to be cleaned up by hand!

Thank you for reading! 🙇‍♂️ You can find the complete CDK code example here.

Also published at https://medium.com/@emmanuel.sys/spawning-an-autoscaling-eks-cluster-52977aa8b467
