Pseudo Localization @ Netflix

By Netflix Technology Blog

by Tim Brandall

In the past 8 years Netflix has transformed from an English-only product, to now supporting 26 languages and growing. As we add language support for our members residing in 190 different countries, scaling globalization at Netflix has never been more important. Along the way we’ve built out countless solutions to help us achieve globalization at scale. In this article we’ll focus on one my team has been working on, Pseudo Localization.

The problem we set out to solve was simple: Expansion of text due to translation causes most of the UI layout issues we detect during localization testing.

When translating into other languages, the translated text could be up to 40% longer than the English. This is more prevalent in German, Hebrew, Polish, Finnish, and Portuguese. Take this real-world example from our German UI :

Don’t miss out.

Translated into German, this becomes much longer! :

Lassen Sie sich nichts entgehen!

Put in the context of the UI, we see problems:

This is one example among many. The source of the problem is that our UIs are designed in English, and assume English string lengths, line heights, and glyphs. More often than not, when those strings are translated we will see expansion that causes layout issues. When the product is translated into 26 languages, like ours is, you potentially end up with 26 defects that need to be logged, managed, and resolved; all the while we had the opportunity to fix the issue at the English design phase, before translation ever started.

Enter Pseudo Localization. Pseudo Localization is a way to simulate translation of English UI strings, without waiting for, or going to the effort of doing real translation. Think of it as a fake translation that remains readable to an English speaking developer, and allows them to test for translation related expansion, among other important things. Here’s an example of our Netflix Pseudo Localization in action on iOS:

It helps to break down what we’re doing here, using the following string as an example:

Find Help Online

After passing through our Pseudo Localization algorithm it becomes this:

[ƒîกี้ð Ĥéļþ Öกี้ļîกี้é one two]

Here are the various elements of the transform:

  • Start and end markers: All strings are encapsulated in [ ]. If a developer doesn’t see these characters they know the string has been clipped by an inflexible UI element.
  • Transformation of ASCII characters to extended character equivalents: Stresses the UI from a vertical line height perspective, tests font and encoding support, and weeds out strings that haven’t been externalized correctly (they will not have the Pseudo Localization applied to them).
  • Padding text: Simulates translation induced expansion. In our case we add “one two three four”…etc after each string, simulating 40% expansion. Note that we don’t apply expansion to areas of the UI where text length has already been limited by other systems prior to display on the UI, doing so would cause false positives ( e.g. synopsis text, titles, etc ).
Pseudo Localization on our TV platform

We are lucky enough to leverage our cloud based Global String Repository, for this effort. We apply the Pseudo Localization transformations on any string requested from the repository, before it is handed off to the client. The transformation logic resides in our Netflix Internationalization library, NFi18n, and is available by API to all other Netflix services. The beauty of this solution is that it can be applied to all of our supported UIs.

One of the biggest challenges we faced was text that displays on our UIs but doesn’t originate from our Global Strings Repository. Examples of this type of text would be movie/show titles, synopsis, cast names, maturity ratings, to name a few. This “movie metadata” lives in various systems across Netflix so we had some investigation to do to figure out where, and when was the best place and time to apply our Pseudo Localization transforms on this metadata. Having all this additional text pseudo localized was important because without the pseudo localized metadata, the experience felt incomplete, half pseudo localized, half English:

Going into this project, we knew that driving implementation across the various UI development teams at Netflix would be critical to success. It didn’t matter how good our solution was if nobody was using it. We put a lot of investment into the following areas:

  • Training and education: everyone had to know what it was, and why we wanted them to use it. We had to demonstrate impact.
  • Ease of use: we had to make it as easy as possible for development teams to integrate with our solution. Adding any type of friction wasn’t an option.
  • Opt-out: we wanted the Netflix Pseudo Localization service to be opt-out. What this meant was Pseudo Localization was the default development language for all UI developers, when they run their debug builds, they see Pseudo Localization. We were effectively asking every UI developer to fundamentally change the way they work.

We correctly assumed that architecting, and implementing the solution would not be half of the battle, I would argue it was even less. The real work starts while advocating, influencing, educating, and convincing development teams to fundamentally change the way they work. We did this by showing the impact that Pseudo Localization can have and the amounts of defects it can eradicate, that they fix a UI layout issue once, instead of 26 times. Already we are seeing UI engineers catching and fixing UI layout issues that previously we would have caught post-translation in multiple languages. Now they can simply find and fix it once themselves.

About 6 weeks after the initial rollout to development teams, we surveyed all our developers. Areas we touched on were:

  • What would you like to see improved?
  • If you turned it off, why?
  • Do you understand the value of Pseudo Localization?
  • How can we make it easier for you?
  • Have you caught layout issues using Pseudo Localization?

Based on the survey, the resounding theme was readability. While we had maintained readability, we had introduced additional overhead in parsing the text on screen while reading. Because of this we will look to simplify our transforms to something less egregious, while still retaining the useful elements of Pseudo Localization. We also heard feedback about the expansion text we add, “one two three four…”, it feels unnatural and can cause confusion around if the text is expansion text, or if it’s a placeholder / variable in the UI. As a result we will investigate other ways of simulating expansion, one option is to multiply vowels to achieve the same result, e.g.:

Before: [ƒîกี้ð Ĥéļþ Öกี้ļîกี้é one two]

After: [ƒîîîกี้ð Ĥéééļþ ÖÖกี้ļîîîกี้ééé]

If you’re interested in working on high impact projects like the one we’ve talked about here, I have openings on my Internationalization team. Check out the role we have posted, and feel free to connect with me on LinkedIn!