High-performance input handling on the web


There is a class of UI performance problems that arise from the following situation: An input event is firing faster than the browser can paint frames.

Several events can fit this description:

  • scroll
  • wheel
  • mousemove
  • touchmove
  • pointermove
  • etc.

Intuitively, it makes sense why this would happen. A user can jiggle their mouse and deliver precise x/y updates faster than the browser can paint frames, especially if the UI thread is busy and thus the framerate is being throttled (also known as “jank”).

Screenshot of Chrome Dev Tools showing that a long frame of 546ms can contain as many as four pointermove events

In the above screenshot, pointermove events are firing faster than the framerate can keep up.[1] This can also happen for scroll events, touch events, etc.

The performance problem occurs when the developer naïvely chooses to handle the input directly:

element.addEventListener('pointermove', () => { doExpensiveOperation()
})

In a previous post, I discussed Lodash’s debounce and throttle functions, which I find very useful for these kinds of situations. Recently however, I found a pattern I like even better, so I want to discuss that here.

Understanding the event loop

Let’s take a step back. What exactly are we trying to achieve here? Well, we want the browser to do only the work necessary to paint the frames that it’s able to paint. For instance, in the case of a pointermove event, we may want to update the x/y coordinates of an element rendered to the DOM.

The problem with Lodash’s throttle()/debounce() is that we would have to choose an arbitrary delay (e.g. 20 milliseconds or 50 milliseconds), which may end up being faster or slower than the browser is actually able to paint, depending on the device and browser. So really, we want to throttle to requestAnimationFrame():

element.addEventListener('pointermove', () => { requestAnimationFrame(doExpensiveOperation)
})

With the above code, we are at least aligning our work with the browser’s event loop, i.e. firing right before style and layout are calculated.

However, even this is not really ideal. Imagine that a pointermove event fires three times for every frame. In that case, we will essentially do three times the necessary work on every frame:

Chrome Dev Tools screenshot showing an 82 millisecond frame where there are three pointermove events queued by requestAnimationFrame inside of the frame

This may be harmless if the code is fast enough, or if it’s only writing to the DOM. However, if it’s both writing to and reading from the DOM, then we will end up with the classic layout thrashing scenario,[2] and our rAF-based solution is actually no better than handling the input directly, because we recalculate the style and layout for every pointermove event.

Chrome Dev Tools screenshot of layout thrashing, showing two pointermove events with large Layout blocks and the text "Forced reflow is a likely performance bottleneck"

Note the style and layout recalculations in the purple blocks, which Chrome marks with a red triangle and a warning about “forced reflow.”

Throttling based on framerate

Again, let’s take a step back and figure out what we’re trying to do. If the user is dragging their finger across the screen, and pointermove fires 3 times for every frame, then we actually don’t care about the first and second events. We only care about the third one, because that’s the one we need to paint.

So let’s only run the final callback before each requestAnimationFrame. This pattern will work nicely:

function throttleRAF () { return () => { let queuedCallback return callback => { if (!queuedCallback) { requestAnimationFrame(() => { const cb = queuedCallback queuedCallback = null cb() }) } queuedCallback = callback } }
}

We could also use cancelAnimationFrame for this, but I prefer the above solution because it’s calling fewer DOM APIs. (It only calls requestAnimationFrame() once per frame.)

This is nice, but at this point we can still optimize it further. Recall that we want to avoid layout thrashing, which means we want to batch all of our reads and writes to avoid unnecessary recalculations.

In “Accurately measuring layout on the web”, I explore some patterns for queuing a timer to fire after style and layout are calculated. Since writing that post, a new web standard called requestPostAnimationFrame has been proposed, and it fits the bill nicely. There is also a good polyfill called afterframe.

To best align our DOM updates with the browser’s event loop, we want to follow these simple rules:

  1. DOM writes go in requestAnimationFrame().
  2. DOM reads go in requestPostAnimationFrame().

The reason this works is because we write to the DOM right before the browser will need to calculate style and layout (in rAF), and then we read from the DOM once the calculations have been made and the DOM is “clean” (in rPAF).

If we do this correctly, then we shouldn’t see any warnings in the Chrome Dev Tools about “forced reflow” (i.e. a forced style/layout outside of the browser’s normal event loop). Instead, all layout calculations should happen during the regular event loop cycle.

Chrome Dev Tools screenshot showing one pointermove per frame and large layout blocks with no "forced reflow" warning

In the Chrome Dev Tools, you can tell the difference between a forced layout (or “reflow”) and a normal one because of the red triangle (and warning) on the purple style/layout blocks. Note that above, there are no warnings.

To accomplish this, let’s make our throttler more generic, and create one that can handle requestPostAnimationFrame as well:

function throttle (timer) { return () => { let queuedCallback return callback => { if (!queuedCallback) { timer(() => { const cb = queuedCallback queuedCallback = null cb() }) } queuedCallback = callback } }
}

Then we can create multiple throttlers based on whether we’re doing DOM reads or writes:[3]

const throttledWrite = throttle(requestAnimationFrame)
const throttledRead = throttle(requestPostAnimationFrame) element.addEventListener('pointermove', e => { throttledWrite(() => { doWrite(e) }) throttledRead(() => { doRead(e) })
})

Effectively, we have implemented something like fastdom, but using only requestAnimationFrame and requestPostAnimationFrame!

Pointer event pitfalls

The last piece of the puzzle (at least for me, while implementing a UI like this), was to avoid the pointer events polyfill. I found that, even after implementing all the above performance improvements, my UI was still janky in Firefox for Android.

After some digging with WebIDE, I found that Firefox for Android currently does not support Pointer Events, and instead only supports Touch Events. (This is similar to the current version of iOS Safari.) After profiling, I found that the polyfill itself was taking up a lot of my frame budget.

Screenshot of Firefox WebIDE showing a lot of time spent in pointer-events polyfill

So instead, I switched to handling pointer/mouse/touch events myself. Hopefully in the near future this won’t be necessary, and all browsers will support Pointer Events! We’re already close.

Here is the before-and-after of my UI, using Firefox on a Nexus 5:

When handling very performance-sensitive scenarios, like a UI that should respond to every pointermove event, it’s important to reduce the amount of work done on each frame. I’m sure that this polyfill is useful in other situations, but in my case, it was just adding too much overhead.

One other optimization I made was to delay updates to the store (which trigger some extra JavaScript computations) until the user’s drag had completed, instead of on every drag event. The end result is that, even on a resource-constrained device like the Nexus 5, the UI can actually keep up with the user’s finger!

Conclusion

I hope this blog post was helpful for anyone handling scroll, touchmove, pointermove, or similar input events. Thinking in terms of how I’d like to align my work with the browser’s event loop (using requestAnimationFrame and requestPostAnimationFrame) was useful for me.

Note that I’m not saying to never use Lodash’s throttle or debounce. I use them all the time! Sometimes it makes sense to just let a timer fire every n milliseconds – e.g. when debouncing window resize events. In other cases, I like using requestIdleCallback – for instance, when updating a non-critical part of the UI based on user input, like a “number of characters remaining” counter when typing into a text box.

In general, though, I hope that once requestPostAnimationFrame makes its way into browsers, web developers will start to think more purposefully about how they do UI updates, leading to fewer instances of layout thrashing. fastdom was written in 2013, and yet its lessons still apply today. Hopefully when rPAF lands, it will be much easier to use this pattern and reduce the impact of layout thrashing on web performance.

Footnotes

1. In the Pointer Events Level 2 spec, it says that pointermove events “may be coalesced or aligned to animation frame callbacks based on UA decision.” So hypothetically, a browser could throttle pointermove to fire only once per rAF (and if you need precise x/y events, e.g. for a drawing app, you can use getCoalescedEvents()). It’s not clear to me, though, that any browser actually does this. In any case, throttling the events to rAF in JavaScript accomplishes the same thing, regardless of UA behavior.

2. Technically, the only DOM reads that matter in the case of layout thrashing are DOM APIs that force style/layout, e.g. getBoundingClientRect() and offsetLeft. If you’re just calling getAttribute() or classList.contains(), then you’re not going to trigger style/layout recalculations.

3. Note that if you have different parts of the code that are doing separate reads/writes, then each one will need its own throttler function. Otherwise one throttler could cancel the other one out. This can be a bit tricky to get right, although to be fair the same footgun exists with Lodash’s debounce/throttle.