People following me have occasionally seen me post graphs like this:
Usually people leave this type of instrumentation and graphing to New Relic and Skylight. However, at our scale we find it extremely beneficial to keep instrumentation, graphing and monitoring local, because we are in the business of hosting and this is a central part of our job.
Over the past few years Prometheus has emerged as one of the leading options for gathering metrics and alerting. However, sadly, people using Rails have had a very hard time extracting metrics.
Issue #9 on the official prometheus client for Ruby has been open 3 years now, and there is very little chance it will be “solved” any time soon.
Prometheus pulls (scrapes) metrics over HTTP rather than having applications push them. This means you must provide a single HTTP endpoint that collects all the metrics you want exposed. This is particularly complicated with Unicorn, Puma and Passenger, which usually run multiple forks of a process. If you simply implement a secured /metrics endpoint in your app, you have no guarantee over which forked process will handle the request. Without “cross fork” aggregation you would just report metrics for a single, random process, which is less than useful.
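To make the forking problem concrete, here is a minimal sketch using only the Ruby standard library. It is an illustration, not how any particular web server or gem works: three forked "workers" each bump a counter a different number of times and report their own view over a pipe.

```ruby
# Illustrative only: three forked "workers" each count their own requests.
counter = 0
reader, writer = IO.pipe

pids = [1, 2, 3].map do |handled|
  fork do
    reader.close
    handled.times { counter += 1 } # mutates this child's private copy only
    writer.puts(counter)           # report what this one worker saw
    writer.close
  end
end

writer.close
pids.each { |pid| Process.wait(pid) }

# Collect every worker's view; order of arrival is not guaranteed, so sort.
per_worker = reader.read.split.map(&:to_i).sort
reader.close

puts "parent sees: #{counter}"
puts "workers saw: #{per_worker.inspect}"
puts "true total:  #{per_worker.sum}"
```

The parent's counter never changes and each worker only knows its own share, so an endpoint answered by any single process cannot report the true total without some form of aggregation.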
Additionally, knowing what to collect and how to collect it is a bit of an art. It can easily take multiple weeks just to figure out what you want.
Having solved this big problem for Discourse I spent some time extracting the patterns.
The prometheus_exporter gem is a toolkit that provides all the facilities you need.
It has an extensible collector that allows you to run a single process to aggregate metrics for multiple processes on one machine.
It implements gauge, counter and summary metrics.
It has default instrumentation that you can easily add to your app.
It has a very efficient and robust transport channel between the forked processes and the master collector. The master collector gathers metrics via HTTP but reduces overhead by using chunked encoding, so a single session can gather a very large amount of metrics.
It exposes metrics to Prometheus over a dedicated port, and the HTTP endpoint is compressed.
It is completely extensible: you can pick as much or as little as you want.
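The idea of a single collector that aggregates observations from many processes can be sketched in plain Ruby. Everything here is hypothetical (the `SketchCollector` class, its method names and the metric names are made up for illustration) and it is not the gem's implementation. Summaries are left out to keep it short. The output follows the Prometheus text exposition format.

```ruby
# Hypothetical sketch; none of these names come from prometheus_exporter.
class SketchCollector
  def initialize
    @counters = Hash.new(0) # [name, labels] => running total
    @gauges = {}            # [name, labels] => last value seen
  end

  # Called once per observation forwarded by a worker process.
  def observe(type:, name:, value:, labels: {})
    key = [name, labels]
    case type
    when :counter then @counters[key] += value
    when :gauge   then @gauges[key] = value
    else raise ArgumentError, "unknown metric type: #{type}"
    end
  end

  # Render everything in the Prometheus text exposition format.
  def to_prometheus_text
    lines = []
    render(lines, "counter", @counters)
    render(lines, "gauge", @gauges)
    lines.join("\n") + "\n"
  end

  private

  def render(lines, type, store)
    store.group_by { |(name, _), _| name }.each do |name, entries|
      lines << "# TYPE #{name} #{type}"
      entries.each do |(_, labels), value|
        lines << "#{name}#{format_labels(labels)} #{value}"
      end
    end
  end

  def format_labels(labels)
    return "" if labels.empty?
    "{" + labels.map { |k, v| "#{k}=\"#{v}\"" }.join(",") + "}"
  end
end

collector = SketchCollector.new
# Two workers each report one finished request; the counter sums them.
collector.observe(type: :counter, name: "http_requests_total", value: 1, labels: { status: 200 })
collector.observe(type: :counter, name: "http_requests_total", value: 1, labels: { status: 200 })
# Gauges keep the latest value, so per-process gauges carry a pid label.
collector.observe(type: :gauge, name: "rss_bytes", value: 100, labels: { pid: 101 })
collector.observe(type: :gauge, name: "rss_bytes", value: 120, labels: { pid: 101 })
collector.observe(type: :gauge, name: "rss_bytes", value: 90, labels: { pid: 102 })

text = collector.to_prometheus_text
puts text
```

Counters from different workers that share the same labels add up, while gauges overwrite, which is why per-process values need a label (here `pid`) to stay distinct.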
A minimal example implementing metrics for your Rails app
In your Gemfile:
    gem 'prometheus_exporter'
    # in config/initializers/prometheus.rb
    if Rails.env != "test"
      require 'prometheus_exporter/middleware'

      # This reports stats per request like HTTP status and timings
      Rails.application.middleware.unshift PrometheusExporter::Middleware
    end
At this point your web app is instrumented: every request keeps track of SQL, Redis and total time (provided you are using PG).
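To give a feel for what a per-request middleware does, here is a minimal Rack-style sketch that times each request and hands the result to a reporter. The class name `TimingSketchMiddleware` and the `reporter` argument are hypothetical. The real `PrometheusExporter::Middleware` does considerably more (SQL and Redis timings, for instance), so treat this as the general shape only.

```ruby
# Hypothetical sketch of a timing middleware; not the gem's implementation.
class TimingSketchMiddleware
  def initialize(app, reporter:)
    @app = app
    @reporter = reporter # any object responding to #call(hash)
  end

  def call(env)
    started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
    status, headers, body = @app.call(env)
    [status, headers, body]
  ensure
    # Runs even if the app raised, in which case status is nil.
    duration = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
    @reporter.call({ status: status, duration: duration })
  end
end

# Usage with a trivial Rack-compatible app and an in-memory reporter.
reports = []
app = ->(env) { [200, { "content-type" => "text/plain" }, ["ok"]] }
stack = TimingSketchMiddleware.new(app, reporter: ->(report) { reports << report })

response = stack.call({ "PATH_INFO" => "/" })
puts response.first
puts reports.first.inspect
```

In a real setup the reporter would forward each observation to the collector process instead of appending to an array.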
You may also be interested in per-process stats, like:
    # in config/initializers/prometheus.rb
    if Rails.env != "test"
      require 'prometheus_exporter/instrumentation'

      # this reports basic process stats like RSS and GC info, type master
      # means it is instrumenting the master process
      PrometheusExporter::Instrumentation::Process.start(type: "master")
    end

    # in unicorn/puma/passenger be sure to run a new process instrumenter after fork
    after_fork do
      require 'prometheus_exporter/instrumentation'
      PrometheusExporter::Instrumentation::Process.start(type: "web")
    end
Also you may be interested in some Sidekiq stats:
    Sidekiq.configure_server do |config|
      config.server_middleware do |chain|
        require 'prometheus_exporter/instrumentation'
        chain.add PrometheusExporter::Instrumentation::Sidekiq
      end
    end
Finally, you may want to collect some global stats across all processes, like:
To do so we can introduce a “type collector”:
    # lib/global_type_collector.rb
    unless defined? Rails
      require File.expand_path("../../config/environment", __FILE__)
    end

    require 'raindrops'

    class GlobalPrometheusCollector < PrometheusExporter::Server::TypeCollector
      include PrometheusExporter::Metric

      def initialize
        @web_queued = Gauge.new("web_queued", "Number of queued web requests")
        @web_active = Gauge.new("web_active", "Number of active web requests")
      end

      def type
        "app_global"
      end

      def observe(obj)
        # do nothing, we would only use this if metrics are transported from apps
      end

      def metrics
        path = "/var/www/my_app/tmp/sockets/unicorn.sock"
        info = Raindrops::Linux.unix_listener_stats([path])[path]
        @web_active.observe(info.active)
        @web_queued.observe(info.queued)
        [
          @web_queued,
          @web_active
        ]
      end
    end
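The `type` and `observe` methods matter once app processes start sending data to the collector. As a rough sketch, assuming each payload carries a `type` field that selects which collector receives it, the dispatch could look like the following. `DeploySketchCollector`, `DispatchSketch` and the `deploy` payloads are all hypothetical and exist only so the example runs without the gem.

```ruby
require "json"

# Hypothetical collector with the same three methods as the class above,
# but no dependency on the gem, so it can run on its own.
class DeploySketchCollector
  def initialize
    @deploys = 0
  end

  def type
    "deploy"
  end

  # Called for each payload whose "type" matches #type.
  def observe(obj)
    @deploys += obj.fetch("count", 1)
  end

  def metrics
    { "deploys_total" => @deploys }
  end
end

# Hypothetical dispatcher: picks a collector by the payload's "type" field.
class DispatchSketch
  def initialize(collectors)
    @by_type = collectors.to_h { |collector| [collector.type, collector] }
  end

  def process(json)
    obj = JSON.parse(json)
    collector = @by_type[obj["type"]]
    if collector
      collector.observe(obj)
    else
      warn "no collector registered for #{obj['type'].inspect}"
    end
  end

  def metrics
    @by_type.values.map(&:metrics).reduce({}, :merge)
  end
end

dispatch = DispatchSketch.new([DeploySketchCollector.new])
dispatch.process({ type: "deploy", count: 2 }.to_json)
dispatch.process({ type: "deploy" }.to_json)

totals = dispatch.metrics
puts totals.inspect
```

A collector like `GlobalPrometheusCollector` never receives payloads, which is why its `observe` can stay empty: it computes its values on demand inside `metrics`.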
After all of this is done you need to run the collector (in a monitored process in production) using runit, supervisord, systemd or whatever your poison is (mine is runit).
    bundle exec prometheus_exporter -t /var/www/my_app/lib/global_type_collector.rb
Then you follow the various guides online to set up Prometheus and the excellent Grafana, and you too can have wonderful graphs.
For those curious, here is a partial example of how the raw metric feed looks for an internal app we use that I instrumented yesterday: metrics · GitHub
I hope you find this helpful, good luck instrumenting all things!