🔩 Build Structured Web Socket Backends in Node with Hexnut

By Francis Stokes

http://github.com/francisrstokes/hexnut

Picture this: You’ve been building Node backends with Express or Koa, and you’ve found your groove in code organisation, structure and abstractions. Adding routes is always clean and you can do it at warp speed. Then, all of a sudden, you have a new requirement involving some realtime functionality. No problem, you think: Express and Koa both have web socket extensions. So you set it up, and that’s where the nightmare begins.

  • Your beautiful code structure is now lying in pieces on the ground because sockets are event driven and persistent, unlike your existing routes
  • The middleware patterns you have set up are gone, and you’re now processing, validating, and reacting to raw data that comes in through the socket
  • The whole thing is a sort of tumour on the side of your once graceful app

If any of that sounds familiar to you, then rest assured: You’re not the only one. The truth is realtime apps are really tricky to get right, especially for developers who are used to building stateless REST APIs.

The real problem is that it’s the wrong abstraction level.

What do I mean by that? Sockets are the lowest level persistent communication mechanism you could conceivably work with on the web. You’re essentially doing two things: Pushing data down the socket, or pulling it out. Compare that with what we’re used to with even the most basic frameworks: Routing, request/response (or context) objects, built in async flow, etc. That’s a far higher abstraction to work with, because you can ignore most of the details of what really goes into dealing with HTTP requests.

Can we bring Web Sockets up to the same level? That’s exactly what I’ve been working on with Hexnut.

Hexnut is a middleware-based, super lightweight framework. It can work hand in hand with frameworks like Express and Koa, even sharing the same underlying server object! It only really has two important concepts:

  1. When the client connects to the server, a special Context object is created which lives for the lifetime of the connection. The ctx object (as it’s typically named) is used to communicate with the client, and to build up state as messages go back and forth.
  2. Connections, messages, and closing events are handled in a middleware chain. When these events come in, certain properties on the ctx are set to indicate what kind of event it was, and the same ctx object is passed to middleware functions. The middleware can be used to set properties and to send messages to the client, and it can also choose whether the next middleware in the chain should be triggered.
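To make those two concepts concrete, here is a tiny plain-JavaScript model of them. It is a sketch, not Hexnut's source: `createApp`, `connect`, `emit`, `outbox` and the `event` property are names made up for this illustration, and the real ctx exposes its own properties.

```javascript
// Hypothetical model of the two concepts above. This is NOT Hexnut's source.
function createApp() {
  const chain = [];

  // Walk the chain for one event; a middleware continues it by calling next().
  function run(ctx) {
    const dispatch = async (i) => {
      const fn = chain[i];
      if (fn) await fn(ctx, () => dispatch(i + 1));
    };
    return dispatch(0);
  }

  return {
    use(fn) { chain.push(fn); },

    // Stand-in for a client connecting: creates the one ctx for this connection.
    connect() {
      const outbox = []; // stand-in for the real socket
      const ctx = {
        outbox,
        send: (data) => outbox.push(data),
        event: null, // 'connection' | 'message' | 'closing' (assumed names)
        message: null,
      };
      // Every later event reuses the same ctx, so state persists between events.
      const emit = (event, message = null) => {
        ctx.event = event;
        ctx.message = message;
        return run(ctx);
      };
      return { ctx, emit };
    },
  };
}

const app = createApp();

// First middleware: initialise per-connection state, otherwise pass control on.
app.use(async (ctx, next) => {
  if (ctx.event === 'connection') {
    ctx.count = 0;
    return; // the chain stops here for connection events
  }
  await next();
});

// Second middleware: only reached when the first one called next().
app.use((ctx) => {
  if (ctx.event === 'message') {
    ctx.count += 1;
    ctx.send(`#${ctx.count}: ${ctx.message}`);
  }
});
```

Each call to `connect()` gets its own ctx, so the counter of one client never leaks into another, while every event for the same client sees the state left behind by the previous one.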

For those who just want to see some code, let’s see how a basic server would look in Hexnut. First let’s install some dependencies from npm:

And then the server code:

Using wsc, a command line web socket tool, we can see how the messages are exchanged:

Even as basic as this is, it should already feel like an improvement over the tangle of event listeners and global state you’d typically find. But to really begin to see the benefits, we should add some more middleware.

The next snippet is a little longer, but don’t worry — we’re gonna break it down.

Before we go into the code, I just want to point out what you’re probably already wondering, which is: Why is there a hexnut package and a hexnut-handle package?

Well, hexnut-handle is just a very simple helper package that you could easily write on your own (no seriously, check the code: it’s like 30 lines). Hexnut is designed to be a super simple core, with pluggable middleware allowing you to tailor the application to your specific needs. That’s exactly what hexnut-handle is: a reusable middleware helper.

Back to the code above! The first couple of middlewares should be familiar if you’ve worked with web frameworks before. We add some functionality to easily send JSON messages to the client (lines 7–12), and also something that will automatically try to parse JSON messages from the client (lines 15–25).
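As a rough idea of what two such utility middlewares can look like in isolation, here is a small sketch. The names `addSendJson`, `parseJson`, `sendJson` and `makeFakeCtx` are illustrative choices for this example, and the fake ctx only imitates the two things the middlewares touch (`message` and `send`).

```javascript
// Sketch only: the names below are illustrative, not part of any package.

// Adds a helper for sending objects as JSON strings.
const addSendJson = (ctx, next) => {
  ctx.sendJson = (data) => ctx.send(JSON.stringify(data));
  return next();
};

// Tries to parse incoming messages as JSON; keeps the raw message on failure.
const parseJson = (ctx, next) => {
  if (typeof ctx.message === 'string') {
    try {
      ctx.message = JSON.parse(ctx.message);
    } catch (err) {
      // Not JSON: leave the original string for later middleware.
    }
  }
  return next();
};

// Tiny stand-in for a real ctx so the two functions can be exercised directly.
const makeFakeCtx = (message) => {
  const sent = [];
  return { message, sent, send: (data) => sent.push(data) };
};

const demoCtx = makeFakeCtx('{"type":"inc"}');
addSendJson(demoCtx, () =>
  parseJson(demoCtx, () =>
    demoCtx.sendJson({ ok: true, got: demoCtx.message.type })));
console.log(demoCtx.sent);
```

Because both functions only add to the ctx and then call next(), they can sit at the top of any chain without knowing what comes after them.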

The signature of every middleware is always the same.

Notice that middleware can be an async function, so you can easily await if needed! When you use the functions in hexnut-handle, they are also just transformed into this same signature.
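Here is a minimal sketch of that shape with an async body. `lookupUser` and the `userId` property are hypothetical stand-ins for whatever asynchronous work your app needs (a database query, an API call), not something Hexnut provides.

```javascript
// Hypothetical async lookup standing in for a database or API call.
const lookupUser = (id) =>
  new Promise((resolve) => {
    setTimeout(() => resolve({ id, name: `user-${id}` }), 10);
  });

// Same (ctx, next) shape as any other middleware, just declared async.
const attachUser = async (ctx, next) => {
  if (!ctx.user) {
    // `userId` is an assumed property, set by some earlier middleware.
    ctx.user = await lookupUser(ctx.userId);
  }
  return next(); // hand over to the next middleware once the lookup is done
};
```

Since the ctx lives as long as the connection, the lookup only has to happen once; later messages find `ctx.user` already in place.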

From line 35 onwards, we first set up a counter at connection time, and then handle messages that increase or decrease the counter. The matchMessage() function from hexnut-handle should feel a little bit like setting up routing in Express/Koa, because that is basically what it is: we are defining when a message should be handled by a particular middleware, and if it doesn’t match, Hexnut just tries again with the next middleware instead.
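To show how little magic is involved, here is one way a matching helper like this could be written by hand. It is a sketch under assumptions, not the actual hexnut-handle code: the `isMessage` flag on the ctx, the `run` helper and the fake ctx are all part of the illustration.

```javascript
// Assumed shape of a matching helper; the real hexnut-handle code may differ.
const matchMessage = (test, handler) => (ctx, next) => {
  if (ctx.isMessage && test(ctx.message)) {
    return handler(ctx, next);
  }
  return next(); // no match: let the next middleware try
};

// Minimal chain runner so the example is self-contained.
const run = (chain, ctx, i = 0) =>
  (i < chain.length ? chain[i](ctx, () => run(chain, ctx, i + 1)) : undefined);

const counterChain = [
  matchMessage((msg) => msg === 'inc', (ctx) => ctx.send(++ctx.count)),
  matchMessage((msg) => msg === 'dec', (ctx) => ctx.send(--ctx.count)),
  // Catch-all: anything that reaches this point was not handled above.
  (ctx) => ctx.send(`Unknown message: ${ctx.message}`),
];

// Fake ctx for one connection; `sent` stands in for the socket.
const sent = [];
const counterCtx = { isMessage: true, count: 0, message: null, send: (d) => sent.push(d) };

for (const message of ['inc', 'inc', 'dec', 'reset']) {
  counterCtx.message = message;
  run(counterChain, counterCtx);
}
console.log(sent);
```

The helper returns a function with the usual (ctx, next) signature, which is why it can be mixed freely with hand-written middleware in the same chain.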

Lastly, starting on line 61, there is one middleware that catches any message that wasn’t already handled. In our little app that’s considered to be an “error”, so we inform the user.

Simple as it is, there are four major benefits we are getting here.

  1. Structure
  2. Separation of concerns
  3. Predictable information flow
  4. Reusability

Although everything is now in one file, we can easily separate the controllers into their own files and folders. This structure helps us to make sense of code by allowing pieces to be decomposed into chunks.

Just like you have different routes in an HTTP web framework, for sockets we can create a similar idea by matching messages. This simple idea can be built upon further to create more sophisticated systems, but even on its own it separates different functionalities into their own space. This is of course important because there’s no better way to create hard-to-debug code than by mixing a bunch of unrelated logic in the same place!

Predictable information flow is basically a combination of structure and separation of concerns. We can predict how the data will flow through our app by following where a message will be processed and what calls will be made from there.

A middleware architecture naturally leads to code reuse, especially with utility middlewares like those above, which add general methods to the ctx object, or middlewares that pre-process messages.

Contrast this with the approach of something like socket.io, one of the original and most popular solutions for Web Sockets. Socket.io puts an emphasis on creating rooms and namespaces where users connect and are able to send messages to the server and to each other (ostensibly). Communication is event driven, which means that beyond this idea of rooms the developer must add their own structure. This makes it very difficult to adhere to the principles above, when the real problem you’re trying to solve is your application logic!

None of this is to say socket.io is bad. Another large draw of socket.io is that it can progressively enhance your Socket, so if your user’s browser cannot use real Web Sockets, they can fall back to long polling in the background. This is no small feat, and the way it works is very impressive! But the web has moved on, and this kind of polyfill solution is not applicable for many users now.

So how can we go even further? Well, there is something unique about web sockets which is very hard to do in a normal HTTP server: creating specific protocols of message exchanges that must happen in the right order, have the right content, and perhaps build up state in between. For that, we can use a purpose-built Hexnut library called hexnut-sequence.

Let’s take a really contrived example, and imagine that we have defined an exchange that can take place between a client and a server.

  • The client sends “Hello, I am <name>”
  • The server sends back “Hello <name>, how is the weather?”
  • The client sends “The weather is <weather type>”
  • The server responds “OK <name>, enjoy this <weather type> day”

This is pretty simple, with only 4 messages exchanged, and no complex loops or logic. But perhaps you can see that if we want to write this in a way that adheres to the four principles we described above, it might get tricky.

When you have to send and receive related but independent messages back and forth, especially when the exchange is stateful, how can you separate the concerns and preserve a predictable information flow?

Hexnut Sequence helps solve this problem. Let’s implement the example and see how it keeps the code clean:

sequence.uninterruptible(…) will create a middleware for Hexnut, just like handle.message(…). What is different, however, is that the functions you pass are generator functions instead of normal ones. We will see the reason why shortly, but notice that all the code inside the generator function (except for the yield keyword) is just normal code.

So let’s take a look at the yield keyword on line 7 — in hexnut-sequence, the yield keyword is used for a couple of things, but mainly to signal that we need to wait for a message to come in from the client. In that way, it’s a little bit like await, but it’s different because we are only going to continue running this function if the message that comes in passes the test we provided to .matchMessage().

Then we extract the name from the message and send the response to the client. After that, we match another message for the weather and send the final response.

You can do more than .matchMessage() with yield though:

  • .getMessage() will simply wait for the next message with no validation
  • .assert(condition) will cause the sequence to break if the condition is not true
  • .await(promiseReturningFunction) will await a promise, and only continue when it has resolved.
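If you are curious how yield can "wait" for a message at all, the following toy driver models the idea in plain JavaScript. It is emphatically not how hexnut-sequence is implemented: `wait`, `createSequence` and the descriptor objects are invented for this sketch, it keeps state for a single connection only, and it leaves out the promise-awaiting operation to stay short.

```javascript
// Toy model of "yield waits for a message". Not hexnut-sequence's code.
const wait = {
  matchMessage: (test) => ({ kind: 'message', test }),
  getMessage: () => ({ kind: 'message', test: () => true }),
  assert: (check) => ({ kind: 'assert', check }),
};

function createSequence(genFn) {
  let gen = null;     // the running generator, or null when idle
  let pending = null; // the message operation the generator is paused on

  const reset = () => { gen = null; pending = null; };

  // Resume the generator until it pauses on a message operation or finishes.
  const resume = () => {
    for (;;) {
      const { value: op, done } = gen.next();
      if (done) return reset();
      if (op.kind === 'message') { pending = op; return undefined; }
      if (!op.check()) return reset(); // a failed assert breaks the sequence
    }
  };

  // Called once per incoming message, with ctx.message already set.
  return (ctx) => {
    if (!gen) { gen = genFn(ctx); resume(); }
    if (!gen) return; // finished or broke before waiting for anything
    if (pending.test(ctx.message)) resume();
    else reset(); // unexpected message: start over next time
  };
}

const greeting = createSequence(function* (ctx) {
  yield wait.matchMessage((msg) => msg.startsWith('Hello, I am '));
  const name = ctx.message.slice('Hello, I am '.length);
  yield wait.assert(() => name.length > 0);
  ctx.send(`Hello ${name}, how is the weather?`);

  yield wait.matchMessage((msg) => msg.startsWith('The weather is '));
  const weather = ctx.message.slice('The weather is '.length);
  ctx.send(`OK ${name}, enjoy this ${weather} day`);
});

// Helper for trying it out with a fake ctx.
const feed = (ctx, message) => { ctx.message = message; greeting(ctx); };
```

The generator is simply paused at each yield until the driver decides the incoming message is the one it was waiting for, which is why the code between the yields can stay ordinary, linear code with local variables like `name`.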

If you’re wondering what the uninterruptible thing is all about, it turns out there are 2 kinds of back and forth exchanges you can have:

  1. A sequence that starts, tolerates unrelated messages arriving in between, and continues as soon as the next relevant message appears (interruptible)
  2. A sequence that starts, but immediately resets if an unrelated message comes in before it has finished (uninterruptible)

So in the example above, you have to play out the sequence exactly. If you send something unexpected, you’ll have to start the whole thing again. In the next example, we will change this to be interruptible instead to show the difference.

If you were a gamer growing up in the 90s you might recognise the new sequence as the legendary Konami code. In order to see the easter egg message, the client must send all of those messages exactly, in order, with no interruptions. However, the greeting sequence we saw before is now interruptible, so we can start by sending the name, but then send the whole Konami code. When we finally go back and send the weather, the sequence will pick up exactly where it left off!
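The difference between the two flavours can be modelled with a single flag. The driver below is a toy written for this explanation, with made-up names (`sequenceModel`, `is`, `handleMessage`) and made-up message strings for the key presses; hexnut-sequence itself works differently under the hood.

```javascript
// Toy model: each `yield` hands the driver a predicate for the next message.
function sequenceModel(genFn, { interruptible }) {
  let gen = null;
  let expect = null; // predicate for the message the sequence is waiting on

  const step = () => {
    const { value, done } = gen.next();
    if (done) { gen = null; expect = null; } else { expect = value; }
  };

  return (ctx, next) => {
    if (!gen) { gen = genFn(ctx); step(); }
    if (gen && expect(ctx.message)) {
      step(); // relevant message: consume it and move the sequence forward
      return undefined;
    }
    if (!interruptible) { gen = null; expect = null; } // start over next time
    return next(); // unrelated message: let other middleware see it
  };
}

const is = (expected) => (msg) => msg === expected;

const greetingSeq = sequenceModel(function* (ctx) {
  yield (msg) => msg.startsWith('Hello, I am ');
  const name = ctx.message.slice('Hello, I am '.length);
  ctx.send(`Hello ${name}, how is the weather?`);
  yield (msg) => msg.startsWith('The weather is ');
  const weather = ctx.message.slice('The weather is '.length);
  ctx.send(`OK ${name}, enjoy this ${weather} day`);
}, { interruptible: true });

const konamiKeys = ['up', 'up', 'down', 'down', 'left', 'right', 'left', 'right', 'b', 'a'];

const konamiSeq = sequenceModel(function* (ctx) {
  for (const key of konamiKeys) yield is(key);
  ctx.send('Easter egg unlocked!');
}, { interruptible: false });

// Both sequences in one chain: greeting first, then the Konami code.
const handleMessage = (ctx, message) => {
  ctx.message = message;
  greetingSeq(ctx, () => konamiSeq(ctx, () => {}));
};
```

Feeding this a name, then the ten key messages, then the weather produces the greeting question, the easter egg, and finally the weather reply: the interruptible greeting keeps its place while the Konami messages pass through it. Break the key sequence halfway with anything else and the uninterruptible one forgets its progress.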

These are trivial examples, but the possibilities for real time web apps combining and utilising this kind of approach are endless!

It’s great that the backend can work on this abstraction level, but what about the frontend? Is it stuck in low-level hell? Of course not!

hexnut-client brings the same architecture to the frontend. All the same middleware will still work on the client side too (unless it’s deliberately backend specific!).

With Hexnut, the mechanics of realtime apps are no longer a concern: you can focus entirely on making an awesome app. And with a pluggable middleware approach, you can do that any way you want! Like RxJS? Install the hexnut-with-observable middleware helper and write all your middleware with the full might of observable streams! Only need simple message processing? hexnut-handle will do the trick. Structured conversations? hexnut-sequence!

And if these don’t suit your needs, write one that does and share it with the community!