OYO grew from an $850 million company in September 2017 to a $5 billion company in September 2018. At the same time, OYO Tech grew fourfold, from about 80 engineers to over 300. With all this amazing brainpower at our disposal, we started to look at things differently and made quite a few changes to our tech stack.
At OYO Tech, we rely heavily on AWS’ Elastic Beanstalk service to quickly deploy scalable, manageable services, monoliths and microservices alike. While Elastic Beanstalk served us well in production, where changes land only after thorough integration testing, the picture was very different in our staging environment, where each service had a separate Beanstalk environment for every developer on the owning team.
Slowly we found ourselves with over 350 servers in our staging environment alone. That’s more servers than we have developers! With such low resource utilization, Elastic Beanstalk stopped serving the cloud’s primary purpose: we were paying for resources we weren’t using.
General Rule of Cloud Computing: Don’t pay for resources you aren’t going to use!
- If two (or more!) developers working on the same code base share a single EB environment, one developer’s deployment disrupts the other’s work.
- Deployments on Elastic Beanstalk are slow and can take a while to complete. This disrupts productivity.
- Provisioning infrastructure for a new service took time, again disrupting productivity.
The only disruptions we like are us disrupting industry standards with innovative solutions
We deployed a Kubernetes cluster, with kOps doing most of the heavy lifting for us. A master node controls an autoscaling group of large r4.4xlarge Spot Instances spread across 3 Availability Zones. Spot Instances are AWS’ offering that lets us use spare EC2 capacity at a considerable discount. The discount has been so steep that we have been able to get r4.4xlarge instances for less than the price of a t2.large!
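In kOps, a node fleet like this is described as an InstanceGroup. A minimal sketch might look like the following; the cluster name, region, subnets, sizes, and bid price are illustrative assumptions, not our exact configuration:

```yaml
# Hypothetical kOps InstanceGroup: r4.4xlarge Spot nodes across 3 AZs.
apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  name: nodes
  labels:
    kops.k8s.io/cluster: staging.example.com  # illustrative cluster name
spec:
  role: Node
  machineType: r4.4xlarge
  maxPrice: "0.30"    # Spot bid cap in USD/hr -- illustrative value
  minSize: 3
  maxSize: 30
  subnets:            # spread the autoscaling group across three AZs
    - ap-south-1a
    - ap-south-1b
    - ap-south-1c
```

Setting `maxPrice` is what tells kOps to request Spot capacity rather than on-demand instances.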
With custom-made Helm charts and Jenkins jobs, we created a one-click solution for our developers to easily deploy their diverse applications on Kubernetes without having to dig deep into its inner workings.
The Jenkins pipeline downloads the project from GitHub, builds a Docker image from the Dockerfile in the project, deploys the project on the Kubernetes cluster, runs any automated tests, and hands the developer an endpoint where they can test their code further. This integrates well when one project depends on another, since the endpoint is generated from the options the developer chooses at deployment time: the team name and a deployment number combine to form a unique endpoint for the service.
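The unique-endpoint scheme can be sketched as a small function. The hostname pattern and domain below are assumptions for illustration, not OYO’s actual naming convention:

```python
def staging_endpoint(team: str, service: str, deployment_num: int) -> str:
    """Build a unique staging endpoint from the developer's deploy options.

    The hostname pattern and the staging.example.com domain are
    illustrative assumptions, not OYO's real scheme.
    """
    host = f"{team}-{service}-{deployment_num}".lower()
    return f"https://{host}.staging.example.com"
```

Because the team name and deployment number are baked into the hostname, two developers on different teams (or two deployments by the same developer) never collide.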
In keeping with our philosophy of making judicious use of all resources, we also attach an expiry time to each deployment, so the cluster automatically removes a deployment after a specific number of hours (minimum 1, maximum 72).
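The expiry policy amounts to clamping the requested lifetime into the allowed window and scheduling removal. A minimal sketch of that rule (our illustration, not OYO’s actual code):

```python
from datetime import datetime, timedelta

MIN_TTL_HOURS = 1
MAX_TTL_HOURS = 72

def expiry_time(requested_hours: int, now: datetime) -> datetime:
    """Clamp the requested TTL into the 1-72 hour window and return the
    timestamp after which the deployment is removed from the cluster.

    A sketch of the policy described above, not the real implementation.
    """
    hours = max(MIN_TTL_HOURS, min(MAX_TTL_HOURS, requested_hours))
    return now + timedelta(hours=hours)
```

A request for 0 hours is bumped up to 1, and a request for 100 hours is capped at 72.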
Shifting to Kubernetes was a multi-step process. We first migrated our common monolith application to Kubernetes and gave developers the power to deploy hundreds of pods of the monolith for their own use. This let us cut our Elastic Beanstalk environment count significantly, leaving far fewer resources underutilized or unutilized.
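Under Kubernetes, giving a developer their own copies of the monolith is just a replica count on a Deployment rather than a whole EB environment. A minimal sketch (names, labels, and the image are illustrative assumptions):

```yaml
# Hypothetical Deployment for the shared monolith. Each developer's
# deployment gets its own name; replicas scale pods up or down freely.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: monolith-growth-3   # team name + deployment number, illustrative
spec:
  replicas: 4               # bump this instead of provisioning servers
  selector:
    matchLabels:
      app: monolith-growth-3
  template:
    metadata:
      labels:
        app: monolith-growth-3
    spec:
      containers:
        - name: monolith
          image: registry.example.com/monolith:latest  # illustrative image
```

Hundreds of pods can share the same underlying Spot nodes, which is where the utilization gain comes from.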
We then started the migration of all the microservices to Kubernetes as well. After a while, we saw the number of servers on our staging environment reduce to just 40% of their original count!
This also gave our developers more flexibility in application deployments and in the number of applications our staging environment can host. There is now no upper limit on the number of applications that can be deployed to staging.
The biggest impact was on our staging environment’s cost: we were able to decrease the overall server count to just over 100, running essential services at high utilization.