How We Build a Multimillion Dollar AI for Indonesia

By Ibrahim Arief

September 2018 marks the two-year anniversary of our first AI product at Bukalapak: a recommendation AI that learns from the behavioral patterns of our visitors and provides suggestions to help them find the right seller and product for their purchase.

Any potterhead here?

Over the course of two years, our Reco AI has helped our sellers generate millions of additional transactions that would not have been possible otherwise. That said, when talking about the AI's positive impact, our proudest metric is that it has helped hundreds of thousands of Indonesians earn an additional 1 trillion IDR (US$70 million) in income. If there is one thing we love to do at Bukalapak, it is to improve the lives of Indonesians with our products.

Every hero requires an origin story, and our AI is no exception. Ours starts on the day I moved back to Indonesia. Having dropped out of my Ph.D. in computer vision a couple of years before, I was eager to start a culture of AI research within Bukalapak and see what kind of AI we could build from our mountains of data.

Wait, did I say “mountains of data”?

It turns out that one of the core principles at Bukalapak since day one has been "store everything; we'll be able to build something useful with those data someday." We had more than a petabyte of data in our data warehouse by the time I joined, ranging from transactions to product descriptions to user-generated traffic. The last one turned out to be our gold mine.

After talking with several engineers, data scientists, and product managers, we formulated a strategy to quickly experiment and validate the idea of using our trove of traffic data to build an AI that could learn from the traffic and provide product recommendations to potential buyers.

We scoped a small project to build a working prototype within a couple of months and assigned it to one of our product squads, a small team of five engineers and two data scientists. The short cycle was intentional: we did not want to spend more effort than that if the endeavor turned out to be fruitless.

The team behind our first reco AI

What we built back then was quite rudimentary compared to the complex models we have right now. We started with one of the most common types of recommendation AI: collaborative filtering.

The idea is deceptively simple: if a user viewed product X and then viewed product Y, another user viewing product X might also be interested in product Y. We are effectively using the traffic data to form associations between our millions of products. Multiply this approach by billions of "viewed together" events, and what started as a simple thought exercise becomes a powerful training dataset for the AI that we want to build.

Moreover, this approach is highly parallel and highly scalable, making it a perfect use case for MapReduce computation. We used Spark running on a 32-core machine for our initial prototype, crunching a 4 GB compressed traffic snapshot in several hours.
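The pair-counting idea can be sketched locally in plain Python. Our real pipeline ran on Spark, but the logic is the same: a "map" step emits one event per pair of products viewed in the same session, and a "reduce" step sums the counts per pair. The session data and product names below are purely illustrative, not our actual traffic format.

```python
from collections import Counter
from itertools import combinations

# Each session is the list of products one visitor viewed.
# Illustrative stand-ins for real traffic data.
sessions = [
    ["phone_case", "screen_protector", "charger"],
    ["phone_case", "charger"],
    ["screen_protector", "charger", "earphones"],
]

# "Map" step: emit one event per unordered pair of co-viewed products.
pair_counts = Counter()
for views in sessions:
    for x, y in combinations(sorted(set(views)), 2):
        # "Reduce" step: sum the counts for each pair.
        pair_counts[(x, y)] += 1

def recommend(product, k=2):
    """Top-k products most often co-viewed with `product`."""
    scores = Counter()
    for (x, y), n in pair_counts.items():
        if x == product:
            scores[y] += n
        elif y == product:
            scores[x] += n
    return [p for p, _ in scores.most_common(k)]

# "charger" is co-viewed with "phone_case" in two sessions,
# "screen_protector" in one, so it ranks first.
print(recommend("phone_case"))
```

Because each session's pairs can be emitted independently and the counts are a simple sum, the same computation distributes cleanly across a cluster, which is what made Spark such a natural fit.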

One crucial piece of context: prior to the AI, we used Elastic's "more like this" feature to give our visitors product-to-product recommendations. We took a couple of samples and visually compared the output from Elastic and our AI:

The samples showed good-quality recommendations. Instead of only showing similar products, the AI presents our visitors with various product alternatives for their shopping journey. So far so good; the next step was to validate the AI prototype in production.

We decided to put the recommendation at the bottom slot of every product detail page, as shown below.

We used an approach called A/B testing to validate whether the AI-based recommendations in that slot delivered more impact than the Elastic-based similar products. The idea is to have both the old and new versions running at the same time, but to randomly choose which version to present when a user visits our site. We then track the user's journey within our site, taking careful note of the relevant metrics (e.g., whether the user made a purchase).

In our case, when a user visits a product page, the recommendations at the bottom of the page are populated from either the reco AI or Elastic. We then use our internally built A/B testing framework to track multiple metrics related to purchases, gathering real-world, objective data on the performance of the AI.
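The details of our internal framework are out of scope here, but one common way to implement the random-yet-consistent assignment is to hash the user ID together with the experiment name, so a user sees the same variant on every visit. The function below is a generic sketch of that technique, not our actual framework's code; the bucket names and experiment key are illustrative.

```python
import hashlib

def ab_bucket(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically assign a user to 'control' (the old variant) or
    'treatment' (the new variant).

    Hashing user_id together with the experiment name keeps a user in the
    same bucket on every page view, while different experiments still get
    independent splits of the same population.
    """
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    # Map the first 32 bits of the hash onto [0, 1) and compare
    # against the desired treatment share.
    fraction = int(digest[:8], 16) / 0x100000000
    return "treatment" if fraction < treatment_share else "control"

# The same user always lands in the same bucket for a given experiment.
print(ab_bucket("user-42", "reco-ai-v1"))
print(ab_bucket("user-42", "reco-ai-v1"))
```

Deterministic hashing also means no assignment table has to be stored or looked up at serving time, which keeps the recommendation slot fast.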

Sample of the results from our internally built A/B test service, sorry for the huge chunk of censorship!

We ran the A/B test for two weeks before analyzing the outcome of the experiment. The result was astounding: we observed a significant increase in purchases when visitors were presented with recommendations coming from the new AI.
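One standard way to check whether such an increase is statistically significant, rather than noise, is a two-proportion z-test on the conversion rates of the two variants. The sketch below uses made-up counts, since the real experiment's figures are censored in the screenshot above.

```python
from math import sqrt, erf

def two_proportion_z(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Two-sided two-proportion z-test: did variant B convert at a
    different rate than variant A? Returns (z_score, p_value)."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    # Pooled conversion rate under the null hypothesis (no difference).
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Illustrative counts only: 2.0% vs 2.3% conversion over 50k visitors each.
z, p = two_proportion_z(conv_a=1000, n_a=50_000, conv_b=1150, n_b=50_000)
print(round(z, 2), p < 0.05)
```

With these illustrative numbers the lift clears the conventional p < 0.05 bar; running the test for a full two weeks, as we did, also helps average out day-of-week effects in shopping behavior.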

Convinced of the positive impact of the AI, we decided to spend another month building automated data pipelines and processes to incrementally train the AI and build fresh recommendations every day for our visitors. At the end of the first month alone, the AI had generated an additional US$1 million of revenue for our sellers.

Fast forward two years: we have built a dedicated team that focuses on increasing the quality of our recommendation AI. We have expanded the computing cluster dedicated to building our recommendations to a thousand cores, and we can now generate the reco set in minutes instead of hours, enabling more rapid experimentation with our models.

We have made several iterative improvements to our AI models, always validating each idea with further A/B tests, and we continue to experiment with more ideas as we go. Several of the past "big wins" include expanding our event models and adding personalization context to our recommendations. Thanks to these improvements, our reco AI now brings home 75 billion IDR (US$5 million) per month in additional income for sellers all over Indonesia.

For future improvements, we are currently experimenting with modeling the recommendations in a dense, multimillion-node knowledge graph, to see whether we can coax more insightful recommendations for our visitors from the connections within the graph.

Reco AI was the first AI product we built, but having seen over these two years the kind of impact an AI can have on our millions of users, we wanted to establish an AI division to expand our focus and build expertise in other areas where AIs could assist our visitors, sellers, and operational teams.

Building up an AI division is a lengthy subject that warrants a separate article, but a great deal of inspiration for the way we organize our AI efforts came from a talk by Dani Yogatama, who shared with us his experience of being part of large research organizations at Google DeepMind and Carnegie Mellon University.

What we have established at Bukalapak so far consists of two streams of research efforts. The first stream builds regression- or neural-net-based AI models from millions of rows of structured, labeled data. We use an AutoML platform to rapidly build basic models and test the feasibility of an AI approach to a given problem. This platform enables our data scientists to iterate on and validate their models within a couple of weeks instead of months.

The second stream builds more complex AI models. We have established several AI R&D teams to build and improve various AI services, ranging from using computer vision to recognize objects and watermarks within product images, to using natural language processing to better understand search queries and help our users find the right product for their shopping journey.

Sorry for the pun. We have been asked about this numerous times by others in the industry ever since word got around that we are building a high-impact AI to help our visitors and sellers. Here are three recommendations from us if you want to build AIs for your particular industry or business:

  1. Collect, store, and verify that you have the data. Starting the project with an already-organized, petabyte-scale data warehouse saved us months; otherwise, we would have needed to spend several months just gathering and cleaning the dataset necessary to train our AI.
  2. Initiate, validate, iterate. Don't get stuck trying to build the perfect AI or strictly following some research paper in the field. Starting with a simple model is often enough, as long as you can quickly validate it against real-world traffic or data. You can then iterate and build progressively more complex, higher-quality models.
  3. A/B test, A/B test, A/B test. Oh, and did we say you should A/B test your AI? Running manual analysis on a small number of samples is fine for quick, early validation, but don't forget to validate your model using real-world data and A/B tests; doing so will help you avoid possible selection bias and gather objective data to guide your AI development.

We have built one of the best tech workplaces in Indonesia, combining a strong learning and research culture, mutual respect, work-life balance, and the capacity to improve and bring positive impact to millions of Indonesians.

We have more than 850 tech talents as of September 2018 working together to build exciting tech products that help improve the livelihoods of Indonesians, and we are hiring many more to join our growing tech family. Sound interesting? Check out our career site; we have tons of interesting roles!