Acceptance testing Go services using aceptadora

By Oleg Zaytsev

Today I would like to share with you our experience (and some code) on writing acceptance tests at Cabify. Each time we deliver some broken code to production, thousands of drivers lose their ability to work, and since the ride-hailing business usually doesn’t wait for your rollback, the opportunity to serve the lost journeys is gone forever.

Sometimes unit tests are just not enough, as they rely on a human correctly understanding the contract being tested. There’s no value in asserting that your code correctly sends a query to MySQL if that query doesn’t match the deployed database schema.

Despite the risks, we do deploy our code multiple times per day and we often deploy code powering features that we don’t even know how to reproduce manually in a testing environment. Some of our teams don’t do manual testing at all. Instead, we do acceptance testing.

We think that extensive acceptance testing is one of the keys to keeping our deployments reliable and agile at the same time. Today, I want to tell you how we do it for Go services.

Illustration by Maria Letta

Where we came from

When I joined Cabify back in 2017 I found myself being one of the only two Golang developers in the company, with the exception of our former CTO who didn’t have time to touch the codebase anymore with the company growing by tens of developers every week. At that point my teammate had been working for the company for 6 months, so at the company’s scale he could already be considered a veteran.

We inherited a codebase written by a few engineers who were no longer at the company or no longer contributing. The code was originally ported from Node.js, so I guess the gain in robustness of the compiled code somehow justified the lack of tests, as the scarce coding resources available could now be used to deliver new features instead of debugging JavaScript.

The code wasn’t just poorly tested, it also wasn’t testable, and in order to make it testable we had to refactor it, but in order to refactor it, we had to know what it did, and in order to know what it did, we needed to test it first.

Acceptance tests as documentation

So, we decided to write some acceptance tests for the services that would document their behaviour. Our first decision was to write them in a different language, so there would be no temptation to import/reuse the code we’re refactoring to test it. We chose Ruby since we already had some Ruby code in the company and we could use Cucumber to write tests in Gherkin language. We sent requests to our services, and we wrote the responses as the expectations.

This was our first acceptance test:

Our first acceptance test sent a JSON message over NSQ and expected to receive a message on a different topic.

Around this, we loaded our fixtures into the databases and started the dependencies (NSQ, Redis, memcached, some other services…) once per test suite. We tried to use for the orchestration in the first place, but it didn’t ensure that a dependency was started, just that it was starting, and that caused race conditions when starting up services. Then we switched to a set of plain shell scripts that checked that each container was ready to handle requests before the next one started.

So what we had at this point was a bunch of containers in a Docker network and the container running Cucumber joining that network and interacting with the services.

We lived with that solution for a few years: we split our monorepo and replicated the same approach across multiple services. However, we faced some issues:

  • We had to write Ruby to write acceptance tests. We’re Go developers and for some of us this was our first contact with Ruby, so you can imagine the quality of the code defining the Gherkin steps.
  • Our tests ran after the test subject and its dependencies had started, so we had no control over how they behaved.
  • This lack of control meant that the dependencies, and sometimes the test subjects, retained their state between tests (we do have some stateful services). That made some tests flaky and made it impossible to run a filtered subset of tests, as many of them depended on the previous test.
  • Debugging tests meant filling them with . We still have a lot of those.
  • The only way to run tests was to run the entire suite from the terminal.
  • When we started to switch to gRPC, it wasn’t as easy as “send or assert this JSON” anymore. We had to carefully craft each mock and each Cucumber step definition for each gRPC method. We didn’t have the time or the desire to carefully craft Ruby code inside of our Go codebase, so we started to write fewer acceptance tests.

Acceptance tests written in Go

The last point forced us to switch to Go as the language of the acceptance tests: we just changed the to be a image running but the rest of the picture remained the same: that container joined a network with already running services and dependencies, so it still had no control over them. With this change our tests stopped being human-readable Gherkin script, and instead became pure Go tests.

However, we still faced the lack of control, and when we started mocking gRPC services that our test subject intends to call, that caused a major issue: as the joined the network after the service had already started, the service had already failed in dialling the gRPC connection and some of the requests would automatically fail instead of reaching the mocked gRPC server on the . We solved this by building a small library that interacted with the Docker daemon and started the service from the test itself: this is how aceptadora was born.

Once we moved the service startup to the test itself, we were also able to move the dependencies to a standard as we were now able to check their healthiness before starting the service.

Aceptadora

At this point we decided to refactor aceptadora with a few ideas in mind:

  • It should remove the need for : we want a single tool that handles everything, not a combination of two tools.
  • It should be easy to understand, that’s why the syntax of is based on
  • It should run in all the environments seamlessly, so it should adapt to the environment it’s running in.
  • It should allow running individual tests.
  • Debugging tests with standard tools would be great, and that implies not running in a container (you can debug tests in a container, but it requires extra configuration).

And since we managed to achieve all those requirements, we decided to share it with the community as an open-source library: github.com/cabify/aceptadora.

What does aceptadora do?

Aceptadora pulls Docker images and handles the containers’ lifecycles. It doesn’t do much more: we decided to leave out extra code such as health checks.

Defining services with aceptadora

We use that is a-like file to define our services:

Notice some details about this:

  • We need to refer to the filesystem elements, env config files and binds, relative to the in order to allow running tests from different folders.
  • Instead of the attribute we use the attribute to create filesystem binds. We’ve reserved the attribute to allow us to define actual volumes in the future.
  • The images have to be referenced using the full canonical path. That is, referring to the image as for example.
  • In the example we expect our service container to be tagged as . This isn’t something standard, but you still need a canonical format.

Loading environment-dependent configuration

Once in the test, the first thing we should probably do is load the env config specific to our environment. We can do that using :

This is necessary as different environments can have different Docker setups. For instance, in GitLab with , the services will be bound on the host instead of the usual .

Of course you could load that config before running the test, but loading it from the test lets us run tests with no extra configuration, which is especially valuable when you’ve just cloned the repo and you’re running the tests from your favorite IDE.

Notice also that we provide , which is an instance of , to our test. In order to keep your tests to the point, we don’t return errors from aceptadora: we just make the tests immediately fail if something goes wrong.
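To illustrate the idea of picking an env config per environment, here is a minimal sketch that uses only the standard library. It is not aceptadora's API: the helper names (envFileFor, parseEnv, loadEnvFile), the EXAMPLE_CI marker variable and the file paths are all made up for this example, and the choice not to override variables that are already set is an assumption of the sketch.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// envFileFor returns ciPath when markerVar is present in the environment
// (for example, a variable that only the CI runner sets) and defaultPath
// otherwise. Hypothetical helper, not part of aceptadora.
func envFileFor(markerVar, ciPath, defaultPath string) string {
	if _, ok := os.LookupEnv(markerVar); ok {
		return ciPath
	}
	return defaultPath
}

// parseEnv parses KEY=VALUE lines, skipping blank lines, # comments and
// lines without an equals sign.
func parseEnv(content string) map[string]string {
	vars := make(map[string]string)
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		line := strings.TrimSpace(scanner.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		key, value, found := strings.Cut(line, "=")
		if !found {
			continue
		}
		vars[strings.TrimSpace(key)] = strings.TrimSpace(value)
	}
	return vars
}

// loadEnvFile reads path and exports every variable it defines, without
// overriding variables that are already set in the process environment.
func loadEnvFile(path string) error {
	content, err := os.ReadFile(path)
	if err != nil {
		return err
	}
	for key, value := range parseEnv(string(content)) {
		if _, alreadySet := os.LookupEnv(key); alreadySet {
			continue
		}
		if err := os.Setenv(key, value); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	// Marker variable and file names are illustrative assumptions.
	path := envFileFor("EXAMPLE_CI", "config/ci.env", "config/default.env")
	if err := loadEnvFile(path); err != nil {
		fmt.Println("could not load env config:", err)
		return
	}
	fmt.Println("loaded env config from", path)
}
```

In a real test the error branch would fail the test immediately instead of printing, which matches the "no returned errors" approach described above.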

Instantiating aceptadora

Instantiating aceptadora is as easy as:

Actually, it’s even easier if you set your config in the we’ve loaded earlier, and use to load it:

The instantiation will have these two side-effects:

  • It will set the env var to the one provided through config.
  • It will set the to the first non-local IPv4 address, unless it already comes set.

Both can then be used in the loaded YAML file and in every env config that the services load later.

The is useful if we’re going to mock dependencies of our services in the test itself.
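As a rough illustration of the second side effect, the following sketch finds the first IPv4 address of the host that is not a loopback address, unless a value is already provided through an environment variable. It assumes that "non-local" means "non-loopback", it is not the library's actual implementation, and both the function names and the EXAMPLE_TESTER_ADDR variable are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"net"
	"os"
)

// firstNonLoopbackIPv4 returns the first IPv4 address assigned to this host
// that is not a loopback address.
func firstNonLoopbackIPv4() (string, error) {
	addrs, err := net.InterfaceAddrs()
	if err != nil {
		return "", err
	}
	for _, addr := range addrs {
		ipNet, ok := addr.(*net.IPNet)
		if !ok || ipNet.IP.IsLoopback() {
			continue
		}
		// To4 returns nil for addresses that are not IPv4.
		if ip4 := ipNet.IP.To4(); ip4 != nil {
			return ip4.String(), nil
		}
	}
	return "", errors.New("no non-loopback IPv4 address found")
}

// resolveAddress prefers the value of envVar when it is already set and
// falls back to detecting the address otherwise.
func resolveAddress(envVar string) (string, error) {
	if value := os.Getenv(envVar); value != "" {
		return value, nil
	}
	return firstNonLoopbackIPv4()
}

func main() {
	// EXAMPLE_TESTER_ADDR is a made-up variable name for this sketch.
	addr, err := resolveAddress("EXAMPLE_TESTER_ADDR")
	if err != nil {
		fmt.Println("could not resolve an address:", err)
		return
	}
	fmt.Println("mocks started by the test could listen on", addr)
}
```

An address like this matters because a mock started inside the test process is not reachable from a container through the container's own loopback interface.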

Managing the services’ lifecycles with aceptadora

This is the easiest part, and we’ll use the syntax of Testify’s to describe the most common use case:

As mentioned earlier, the health check functions are not included in aceptadora as they really depend on each use case: Redis is ready to serve as soon as it accepts TCP connections, while MySQL is not, as it has to respond to a ping request before it can be considered healthy.
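Since the health checks are left to the test author, here is one possible shape for them, sketched with the standard library only. The waitUntilHealthy and tcpCheck names are hypothetical and not part of aceptadora. A TCP check like this would fit the Redis case described above, while a MySQL check would plug a ping call into the same check function instead.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// waitUntilHealthy calls check every interval until it returns nil or the
// timeout expires, in which case the last check error is returned wrapped.
func waitUntilHealthy(timeout, interval time.Duration, check func() error) error {
	deadline := time.Now().Add(timeout)
	for {
		err := check()
		if err == nil {
			return nil
		}
		if time.Now().After(deadline) {
			return fmt.Errorf("not healthy after %s: %w", timeout, err)
		}
		time.Sleep(interval)
	}
}

// tcpCheck builds a check that succeeds as soon as addr accepts a TCP
// connection.
func tcpCheck(addr string) func() error {
	return func() error {
		conn, err := net.DialTimeout("tcp", addr, time.Second)
		if err != nil {
			return err
		}
		return conn.Close()
	}
}

func main() {
	// Stand-in for a dependency: a local listener on a random free port.
	ln, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		panic(err)
	}
	defer ln.Close()

	err = waitUntilHealthy(5*time.Second, 100*time.Millisecond, tcpCheck(ln.Addr().String()))
	fmt.Println("healthy:", err == nil)
}
```

Keeping the check as a plain func() error means each dependency can define what "ready" means for it, while the retry loop stays the same.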

Extra considerations

Some services, like MySQL, may be too slow to start for every test. Consider having two instances of aceptadora: one for your test suite, which keeps the slow services running, and one for each of your tests. Truncating the MySQL tables between tests should be enough to reset the state for most use cases.

Although aceptadora will fail if it can’t bind a port for a dependency, e.g. a Redis, disasters and misconfigurations can happen: imagine that you forgot to bind a port for your Redis, your test starts, and decides to run a . In the best case there’s nothing running on your local 6379 port and it will just fail. In the worst case, there’s a tunnel to a production instance on that port, which for some reason didn’t ask for authentication. Consider checking in some way that you’re interacting with a tested instance (a for Redis, a in MySQL).
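One way to implement such a guard is sketched below: before doing anything destructive, the test reads a marker that only the instance started by the test is expected to contain. This is an assumption about how the check could look, not something aceptadora provides. The keyReader interface, the in-memory mapStore stand-in and the marker key and value are all hypothetical, and the marker would have to be seeded when the test container starts.

```go
package main

import (
	"errors"
	"fmt"
)

// keyReader is the minimal read access this sketch needs from a store.
// A real Redis or MySQL client would be wrapped to satisfy it.
type keyReader interface {
	Get(key string) (string, error)
}

var errKeyNotFound = errors.New("key not found")

// mapStore is an in-memory stand-in for a real client.
type mapStore map[string]string

func (m mapStore) Get(key string) (string, error) {
	value, ok := m[key]
	if !ok {
		return "", errKeyNotFound
	}
	return value, nil
}

// requireTestInstance returns an error unless the store holds the expected
// marker, so callers can refuse to run destructive commands against it.
func requireTestInstance(store keyReader, markerKey, want string) error {
	got, err := store.Get(markerKey)
	if err != nil {
		return fmt.Errorf("refusing to continue: reading marker %q: %w", markerKey, err)
	}
	if got != want {
		return fmt.Errorf("refusing to continue: marker %q is %q, want %q", markerKey, got, want)
	}
	return nil
}

func main() {
	seeded := mapStore{"acceptance-marker": "run-1"} // instance started by the test
	unknown := mapStore{}                            // whatever else listens on that port

	fmt.Println(requireTestInstance(seeded, "acceptance-marker", "run-1") == nil)  // prints true
	fmt.Println(requireTestInstance(unknown, "acceptance-marker", "run-1") == nil) // prints false
}
```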

Conclusions

Aceptadora is a set of boilerplate code that will make your acceptance tests go straight to the point, instead of distracting you with container interactions.

We extracted it as a library to make acceptance testing uniform across multiple services, and we hope it will help people improve the reliability of their deployments.