When you visit a secure website, it offers you a TLS certificate that asserts its identity. Every certificate has an expiration date, and when it’s passed due, it is no longer valid. The idea is almost as old as the web itself: limiting the lifetime of certificates is meant to reduce the risk in case a TLS server’s secret key is compromised.
Certificates aren’t the only cryptographic artifacts that expire. When you visit a site protected by Cloudflare, we also tell you whether its certificate has been revoked (see our blog post on OCSP stapling) — for example, due to the secret key being compromised — and this value (a so-called OCSP staple) has an expiration date, too.
Thus, to determine if a certificate is valid and hasn’t been revoked, your system needs to know the current time. Indeed, time is crucial for the security of TLS and myriad other protocols. To help keep clocks in sync, we are announcing a free, high-availability, and low-latency authenticated time service called Roughtime, available at roughtime.cloudlare.com on port 2002.
Time is tricky
It may surprise you to learn that, in practice, clients’ clocks are heavily skewed. A recent study of Chrome users showed that a significant fraction of reported TLS-certificate errors are caused by client-clock skew. During the period in which error reports were collected, 6.7% of client-reported times were behind by more than 24 hours. (0.05% were ahead by more than 24 hours.) This skew was a causal factor for at least 33.5% of the sampled reports from Windows users, 8.71% from Mac OS, 8.46% from Android, and 1.72% from Chrome OS. These errors are usually presented to users as warnings that the user can click through to get to where they’re going. However, showing too many warnings makes users grow accustomed to clicking through them; this is risky, since these warnings are meant to keep users away from malicious websites.
Clock skew also holds us back from improving the security of certificates themselves. We’d like to issue certificates with shorter lifetimes because the less time the certificate is valid, the lower the risk of the secret key being exposed. (This is why Let’s Encrypt issues certificates valid for just 90 days by default.) But the long tail of skewed clocks limits the effective lifetime of certificates; shortening the lifetime too much would only lead to more warnings.
Endpoints on the Internet often synchronize their clocks using a protocol like the Network Time Protocol (NTP). NTP aims for precise synchronization, and even accounts for network latency. However, it is usually deployed without security features, as the added overhead on high-load servers degrades precision significantly. As a result, a man-in-the-middle attacker between the client and server can easily influence the client’s clock. By moving the client back in time, the attacker can force it to accept expired (and possibly compromised) certificates; by moving forward in time, it can force the client to accept a certificate that is not yet valid.
Fortunately, for settings in which both security and precision are paramount, workable solutions are on the horizon. But for many applications, precise network time isn’t essential; it suffices to be accurate, say, within 10 seconds of real time. This observation is the primary motivation of Google’s Roughtime protocol, a simple protocol by which clients can synchronize their clocks with one or more authenticated servers. Roughtime lacks the precision of NTP, but aims to be accurate enough for cryptographic applications, and since the responses are authenticated, man-in-the-middle attacks aren’t possible.
The protocol is designed to be simple and flexible. A client can get Roughtime from just one server it trusts, or it may contact many servers to make its calculation more robust. But its most distinctive feature is that it adds accountability to time servers. If a server misbehaves by providing the wrong time, then the protocol allows clients to produce publicly verifiable, cryptographic proof of this misbehavior. Making servers auditable in this manner makes them accountable to provide accurate time.
We are deploying a Roughtime service for two reasons.
First, the clock we use for this service is the same as the clock we use to determine whether our customers’ certificates are valid and haven’t been revoked; as a result, exposing this service makes us accountable for the validity of TLS artifacts we serve to clients on behalf of our customers.
Second, Roughtime is a great idea whose time has come. But it is only useful if several independent organizations participate; the more Roughtime servers there are, the more robust the ecosystem becomes. Our hope is that putting our weight behind it will help the Roughtime ecosystem grow.
The Roughtime protocol
At its most basic level, Roughtime is a one-round protocol in which the client requests the current time and the server sends a signed response. The response is comprised of a timestamp (the number of microseconds since the Unix epoch) and a radius (in microseconds) used to indicate the server’s certainty about the reported time. For example, a radius of 1,000,000μs means the server is reasonably sure that the true time is within one second of the reported time.
The server proves freshness of its response as follows. The request consists of a short, random string commonly called a nonce (pronounced /nän(t)s/, or sometimes /ˈen wən(t)s/). The server incorporates the nonce into its signed response so that it’s needed to verify the signature. If the nonce is sufficiently long (say, 16 bytes), then the number of possible nonces is so large that it’s extremely unlikely the server has encountered (or will ever encounter) a request with the same nonce. Thus, a valid signature serves as cryptographic proof that the response is fresh.
The client uses the server’s root public key to verify the signature. (The key is obtained out-of-band; you can get our key here.) When the server starts, it generates an online public/secret key pair; the root secret key is used to create a delegation for the online public key, and the online secret key is used to sign the response. The delegation serves the same function as a traditional X.509 certificate on the web: as illustrated in the figure below, the client first uses the root public key to verify the delegation, then uses the online public key to verify the response. This allows for operational separation of the delegator and the server and limits exposure of the root secret key.
Roughtime offers two features designed to make it scalable. First, when the volume of requests is high, the server may batch-sign a number of clients’ requests by constructing a Merkle tree from the nonces. The server signs the root of the tree and sends in its response the information needed to prove to the client that its request is in the tree. (The data structure is a binary tree, so the amount of information is proportional to the base-2 logarithm of the number of requests in the batch; see the figure below) Second, the protocol is executed over UDP. In order to prevent the Roughtime server from being an amplifier for DDoS attacks, the request is padded to 1KB; if the UDP packet is too short, then it’s dropped without further processing. Check out this blog post for a more in-depth discussion.
The protocol is flexible enough to support a variety of use cases. A web browser could use a Roughtime server to proactively synchronize its clock when validating TLS certificates. It could also be used retroactively to avoid showing the user too many warnings: when a certificate validation error occurs — in particular, when the browser believes it’s expired or not yet valid — Roughtime could be used to determine if the clock skew was the root cause. Instead of telling the user the certificate is invalid, it could tell the user that their clock is incorrect.
Using just one server is sufficient if that server is trustworthy, but a security-conscious user could make requests to many servers; the delta might be computed by eliminating outliers and averaging the responses, or by some more sophisticated method. This makes the calculation robust to one or more of the servers misbehaving.
Making servers accountable
The real power of Roughtime is that it’s auditable. Consider the following mode of operation. The client has a list of servers it will query in a particular order. The client generates a random string — called a blind in the parlance of Roughtime — hashes it, and uses the output as the nonce for its request to the server. For subsequent requests, it computes the nonce as follows: generate a blind, compute the hash of this string and the response from the previous server (including the timestamp and signature), and use this hash as the nonce for the next request.
Creating a chain of timestamps in this way binds each response to the response that precedes it. Thus, the sequence of blinds and signatures constitute a publicly verifiable, cryptographic proof that the timestamps were requested in order (a “clockchain” if you will 😉). If the servers are roughly synchronized, then we expect that the sequence to monotonically increase, at least roughly. If one of the servers were consistently behind or ahead of the others, then this would be evident in the sequence. Suppose you get the following sequence of timestamps, each from different servers:
|ServerA-Roughtime||2018-08-29 14:51:50 -0700 PDT|
|ServerB-Roughtime||2018-08-29 14:51:51 -0700 PDT +0:00:01|
|Cloudflare-Roughtime||2018-08-29 12:51:52 -0700 PDT -1:59:59|
|ServerC-Roughtime||2018-08-29 14:51:53 -0700 PDT +2:00:01|
Servers B and C corroborate the time given by server A, but — oh no! Cloudflare is two hours behind! Unless servers A, B, and C are in cahoots, it’s likely that the time offered by Cloudflare is incorrect. Moreover, you have verifiable, cryptographic proof. In this way, the Roughtime protocol makes our server (and all Roughtime servers) accountable to provide accurate time, or, at least, to be in sync with the others.
The Roughtime ecosystem
The infrastructure for monitoring and auditing the Roughtime ecosystem hasn’t been built yet. Right now there’s only a handful of servers: in addition to Cloudflare’s and Google’s, there’s also a really nice Rust implementation. The more diversity there is, the healthier the ecosystem becomes. We hope to see more organizations adopt this protocol.
Cloudflare’s Roughtime service
For the initial deployment of this service, our primary goals are to ensure high availability and minimal maintenance overhead. Each machine at each Cloudflare location executes an instance of the service and responds to queries using its system clock. The server signs each request individually rather than batch-signing them as described above; we rely on our load balancer to ensure no machine is overwhelmed. There are three ways in which we envision this service could be used:
- TLS authentication. When a TLS application (a web browser for example) starts, it could make a request to roughtime.cloudflare.com and compute the difference between the reported time and its system time. Whenever it authenticates a TLS server, it would add this difference to the system time to get the current time.
- Roughtime daemon. One could implement an OS daemon that periodically requests the current time. If the reported time differs from the system time by more than a second, it might issue an alert.
- Server auditing. As the Roughtime ecosystem grows, it will be important to ensure that all of the servers are in sync. Individuals or organizations may take it upon themselves to monitor the ecosystem and ensure that the servers are in sync with one another.
The service is reachable wherever you are via our anycast network. This is important for a service like Roughtime, because minimizing network latency helps improve accuracy. For information about how to configure a client to use Cloudflare-Roughtime, check out the developer documentation. Note that our initial release is somewhat experimental. As such, our root public key may change in the future. See the developer docs for information on obtaining the current public key.
If you want to see what time our Roughtime server returns, click the button below!
Subscribe to the blog for daily updates on our announcements.
- [BSZ16] B. Dowling, D. Stebila, and G. Zaverucha. “Authenticated Network Time Synchronization.” USENIX Security 2016.