So, what is OpenTracing? It’s a vendor-agnostic API to help developers easily instrument tracing into their code base. It’s open because no one company owns it. In fact, many tracing tooling companies are getting behind OpenTracing as a standardized way to instrument distributed tracing.
OpenTracing wants to form a common language around what a trace is and how to instrument them in our applications. In OpenTracing, a trace is a directed acyclic graph of Spans with References that may look like this (from their website):
[Span A] ←←←(the root span) | +------+------+ | | [Span B] [Span C] ←←←(Span C is a `ChildOf` Span A) | | [Span D] +---+-------+ | | [Span E] [Span F] >>> [Span G] >>> [Span H] ↑ ↑ ↑ (Span G `FollowsFrom` Span F)
This allows us to model how our application calls out to other applications, internal functions, asynchronous jobs, etc. All of these can be modeled as Spans, as we’ll see below.
For example, if I have a consumer website where a customer places orders, I make a call to my payment system and my inventory system before asynchronously acknowledging the order. I can trace the entire order process through every system with an OpenTracing library and can render it like this:
––|–––––––|–––––––|–––––––|–––––––|–––––––|–––––––|–––––––|–> time [Place Order···················································] [Receive Payment·····] [Fulfill Order··] [Email Order...]
Each one of these bracketed blocks is a Span representing a separate software system communicating over messaging or HTTP. You can find a deeper overview on OpenTracing with great visuals here.
Why Should I Care About OpenTracing?
Why, as software team members, should we even care about OpenTracing? For that matter, why should we even care about tracing in general? A previous post answers the latter question in depth. In short, tracing is becoming more and more important because software systems are becoming more and more distributed and complex. We need ways to correlate them so that we can understand what is happening inside them. When we know what’s happening inside them, we can quickly hunt down defects and other incidents. Good distributed tracing tooling can save you hours or days of frustration.
Now, why is OpenTracing good for this? Let’s put it this way: there are numerous tracing tools out there, but many software teams don’t use one or don’t know about them. Tracing is reaching a tipping point, and that means these tools will be constantly evolving. Because of this, some tools will be more useful to your team than others, and this may change over time. Do you want to be locked into a specific tool because you’ve coupled numerous pieces of code to their libraries? I’d hope not!
OpenTracing provides a specification that these tools can adopt so that development teams can remain free to use the best tool for their current state. It’s also good for these tools because it lets developers dive headfirst into trying them out with little risk. Otherwise, a developer may steer clear or take months to decide which tool to use. This is similar to HTTP. Because HTTP isn’t owned by any web server company, developers have the flexibility to change how they host their services, knowing that they can still communicate with the rest of the world with ease. Without this, the World Wide Web may never have been born.
Who Owns It?
OpenTracing is owned and managed by a council of individuals from various companies and organizations. They intentionally maintain a diversified set of interests so that they can ensure the specification meets all appropriate needs. This council frequently meets with an advisory board of tracing experts to review the specification and gather input.
Let’s talk a bit about the components of the OpenTracing API. It’s fairly straightforward, but I’ll also link to places where you can explore each concept in depth. At the time this post was written, the most recent accepted version of the OpenTracing specification was 1.1. All example code will be in C#.
This tracer is the entry point into the tracing API. It gives us the ability to create Spans. It also lets us extract tracing information from external sources and inject information to external destinations. You can find more information here.
//create spans var span = tracer.BuildSpan("noop").Start(); //extract external tracing information var spanContext = tracer.Extract(BuiltinFormats.TextMap, new TextMapExtractAdapter(carrier)); //inject current tracing information tracer.Inject(span.Context, BuiltinFormats.TextMap, new TextMapInjectAdapter(carrier));
This represents a unit of work in the Trace. For example, a web request that initiates a new Trace is called the root Span. If it calls out to another web service, that HTTP request would be wrapped within a new child Span. Spans carry around a set of tags of information pertinent to the request being carried out. You can also log events within the context of a Span. They can support more complex workflows than web requests, such as asynchronous messaging. They have timestamps attached to them so we can easily construct a timeline of events for the Trace. More information is available here.
//example span IScope scope = tracer.BuildSpan("send") .WithTag(Tags.SpanKind.Key, Tags.SpanKindClient) .WithTag(Tags.Component.Key, "example-client") .StartActive(finishSpanOnDispose:true)
The SpanContext is the serializable form of a Span. It lets Span information transfer easily across the wire to other systems.
So far, Spans can connect to each other via two types of relationship: ChildOf and FollowsFrom. ChildOf Spans are spans like in our previous example, where our ordering website sent child requests to both our payment system and inventory system. FollowsFrom Spans are just a chain of sequential Spans. So, a FollowsFrom Span is just saying, “I started after this other Span.”
How Do I Use OpenTracing in My Application?
One wonderful thing about OpenTracing is that once you learn it, you can apply it across a variety of tools and libraries. The OpenTracing site has many guides, depending on your language of choice. It also has an excellent set of best practices that cover a wide variety of communication patterns. It’s almost guaranteed that they list the patterns your application follows.
I’ll also give my two cents on instrumenting tracing here:
- Use dependency injection where possible. This will make things easily testable and configurable.
- Follow the idioms of your language and frameworks as much as possible. This will let your team members easily onboard into OpenTracing and tracing in general.
- Many frameworks provide extensibility points around units of work. For example, Spring Boot has pre- and post-request handlers for web requests. Leverage these as much as possible to save you effort when instrumenting tracing.
- If you don’t have a framework or can’t use its extensibility points, keep most of the tracing instrumentation as isolated from the business logic as possible. You can use patterns like Decorator and Chain of Responsibility for this, or even aspect-oriented programming. This will make your code’s intent clearer and easier to read. Exceptions to this include when you need to add tags or log an event; these are commonly specific to the business logic your code is executing.
What OpenTracing Is Not
We covered quite a bit about OpenTracing, and I hope you find it valuable. But I want to be clear what OpenTracing is not. It isn’t:
- A specific program or implementation of tracing to download and run.
- One specific tool to use.
- Specification on how to serialize this information across the wire. For that sort of effort, you can look into things like W3C’s proposed trace context specification.
Be Open to Tracing
Tracing is going to continue to be important for our jobs to keep our systems running and to hunt down nefarious bugs. Serverless infrastructure, microservices, and other such distributed patterns will keep us operationally busy for years to come. OpenTracing strives for us to agree on a way to tackle our operations head on. It lets us agree on a way to add information to our system that we can share with others. We can ultimately use this to see clearly what’s happening in our entire software ecosystem.
If you’re struggling with finding the cause of incidents in your system, consider using a tool or library that supports OpenTracing. You now have the knowledge you need to easily learn and harness it.
This post was written by Mark Henke. Mark has spent over 10 years architecting systems that talk to other systems, doing DevOps before it was cool, and matching software to its business function. Every developer is a leader of something on their team, and he wants to help them see that.