Grafana is a popular tool for visualizing time series data, but as with most open source tools, it is not highly available right out of the box (errr… container). With a few tweaks, though, it is not hard to create a load-balanced Grafana cluster that takes advantage of the high availability and scalability of Amazon Aurora. All without even having to run an EC2 instance.
Our team recently migrated Sportsline.com from a datacenter VM infrastructure into AWS using Fargate, ElastiCache, and Aurora. When it came to monitoring this new infrastructure, we wanted to maintain the no-server approach that we were using for the web application. Having zero EC2 instances in an account that you manage makes life much easier: the operations folks can focus on making the development team more efficient instead of fooling with patch management, SSH key rotation, EBS volume management, etc. I believe that containers as a service is the future, and that there will be two camps: those that run large Kubernetes clusters, and those that let the big public cloud providers run their containers for them.
We were already running an instance of Grafana on ECS for some of our monitoring of 247Sports.com, but we had to mount a volume on the ECS host to maintain the state of the SQLite database that Grafana uses by default. Another issue with the local database is that you can only run a single instance of Grafana behind a load balancer, because each additional container would have its own separate copy of the state. Most of the time a single instance is sufficient, but when there is an issue going on and everyone on the team is trying to view the graphs, the container can get overwhelmed.
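To make the idea concrete, here is a minimal sketch of how a Grafana container can be pointed at a shared database instead of the local SQLite file. Grafana reads settings from `GF_<SECTION>_<KEY>` environment variables, so the `[database]` section can be overridden without touching `grafana.ini`. The variable names (`aurora_endpoint`, `db_password`), the local name `grafana_db_env`, the `grafana` database and user, and the choice of a MySQL-compatible Aurora cluster are all assumptions for illustration; the module itself may wire this up differently.

```hcl
# Hypothetical inputs: these names are illustrative, not taken from the module.
variable "aurora_endpoint" {
  type = string
}

variable "db_password" {
  type      = string
  sensitive = true # requires Terraform 0.14 or later
}

locals {
  # Environment entries in the name/value shape that ECS container
  # definitions expect. Each one overrides a key in Grafana's
  # [database] config section.
  grafana_db_env = [
    { name = "GF_DATABASE_TYPE", value = "mysql" }, # assumes Aurora MySQL
    { name = "GF_DATABASE_HOST", value = "${var.aurora_endpoint}:3306" },
    { name = "GF_DATABASE_NAME", value = "grafana" },
    { name = "GF_DATABASE_USER", value = "grafana" },
    { name = "GF_DATABASE_PASSWORD", value = var.db_password },
  ]
}
```

With every container reading from the same database, any number of Grafana tasks can sit behind the load balancer and serve the same dashboards and users.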
In this post, we will walk through this Terraform Module, which sets up a fully operational Grafana that is load balanced and backed by Aurora (default disclaimer: this module will not run on the free tier, so it will cost you a small amount). If you don’t really care how it is put together and you just want to run it, follow the README.
Here is a basic example of the module configuration: