In 2018, every organization has a data strategy. But what makes a great one?
We all know what failure looks like. Resources are invested, teams are formed, time goes by — but nothing comes of it. No one can necessarily say why; it’s always Someone Else’s Fault.
It’s harder to tell the difference between a modest success and excellence. Indeed, in data science they can look very similar for perhaps a year. After several years, though, an excellent strategy will yield orders of magnitude more valuable results.
Both mediocre and excellent strategies begin with a series of experiments and investments leading to data projects. After a few years, some of these projects work out and are on their way to production.
In the mediocre strategy, one or two of these projects may even have a clear ROI for the business. Typically, these projects will be some kind of automation for cost savings, or applying machine learning to an existing process to improve its efficiency or performance. This looks a lot like success, and it may suffice, but it’s missing out on the unique advantages of an excellent data strategy.
In an excellent strategy, more data projects have worked out, and they were surprisingly cost-effective to develop. Further, the process of building the first few projects inspires new project ideas. The projects will include automation and efficiency and performance improvements, but they will also include projects and ideas for new revenue generation and entirely new businesses driven by your unique data assets. The data teams work well together, build on each other’s work, and collaborate smoothly with their business partners. There’s a clear vision of what the machine-learning-driven future of the business can look like, and everyone is working together to achieve it.
Building an Excellent Data Strategy
Crafting a data strategy requires many parties at the table, including data experts, technology leadership, and business and subject-matter experts. It also requires leadership support that goes beyond just wanting to check off a “machine learning” box.
Here’s how most companies decide which data projects to pursue. Used alone, this approach is a recipe for a mediocre data strategy. Management identifies a set of projects it would like to see built and creates the ubiquitous prioritization scatterplot: one axis represents a given project’s value to the business and the other axis represents its estimated complexity or cost of development. Each project is given a spot on the chart, and management allocates the company’s limited resources to the projects that it believes will cost the least and have the highest business value.
This is not wrong, but it is also not optimal. An excellent data strategy moves beyond a straightforward evaluation of each project in isolation to consider a few additional dimensions.
First, an excellent data strategy includes a well-coordinated organizational core. It’s built on a centralized technology investment and well-chosen, coordinated defaults for the architecture of data applications. Centralizing these defaults lets each application make different decisions when necessary while maintaining maximum compatibility across the organization and flexibility over time.
For example, one global media company I worked with had grown dramatically through acquisitions. Each business line had a different technology stack and independent IT group, leading to challenges integrating data that already existed, and different architectures for all future investments. Centralizing this practice was key to their ongoing success.
Second, an excellent data strategy is specific in the short term and flexible in the long term. We know quite a lot about what the machine learning capabilities of tomorrow look like, but less about what the capabilities of next year will look like. We can only guess what will be possible in five years. Similarly, the business landscape is transforming, leading to new competition and new opportunities. Organizations that engage in five-year planning cycles will miss the opportunities that emerge in the meantime. An excellent strategy is adaptable and is treated as a living document.
The best strategies are strong in directional conviction, but flexible in the details. You want to know where you want to end up, but you do not necessarily need to pre-define each step you will take to get there.
Finally, an excellent data strategy takes into account one key insight: data science projects are not independent from one another. With each completed project, successful or not, you create a foundation to build later projects more easily and at lower cost.
Choosing Between Data Science Projects
Here’s what project selection looks like in a firm with an excellent data strategy: First, the company collects ideas. This effort should be spread as broadly as possible across the organization, at all levels. If you only see good and obvious ideas on your list, worry — that’s a sign that you are missing out on creative thinking. Once you have a large list, filter by the technical plausibility of an idea. Then, create the scatterplot described above, which evaluates each project on its relative cost/complexity and value to the business.
Now it gets interesting. On your scatterplot, draw lines between potentially related projects. These connections exist where projects share data resources; or where one project may enable data collection helpful to another project; or where foundational work on one project is also foundational work on another. This approach acknowledges the realities of working on such projects, like the fact that building a precursor project makes successor projects faster and easier (even if the precursor fails). The costs of gathering data and building shared components are amortized across projects.
This approach makes higher-value projects — those that would perhaps have seemed too ambitious — look less like an aggressive, expensive push forward. Instead, it reveals that such projects may indeed be more efficient and safer to proceed with than other lower-value projects that looked attractive in a naive analysis.
Put differently, an excellent data strategy acknowledges that projects play off of one another, and that the costs of projects change over time in light of other projects undertaken (and new technology, as well). This allows more accurate planning and may expand the organization’s capabilities more than expected. You can revisit this planning process quarterly, which is in line with how quickly machine learning technologies are changing.
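To make the linked-scatterplot idea concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not a prescribed method: the project names, values, costs, and shared components are all hypothetical, and value divided by cost is just one simple way to rank points on the scatterplot. Each project has its own cost plus the cost of any shared components it needs. Once a component has been built for one project, later projects no longer pay for it.

```python
from dataclasses import dataclass

# All names and numbers below are hypothetical, chosen only for illustration.
# Shared foundational work (data pipelines, common tooling) and its cost.
COMPONENT_COST = {"customer_data_pipeline": 6.0, "feature_store": 4.0}


@dataclass(frozen=True)
class Project:
    name: str
    value: float                         # estimated business value
    own_cost: float                      # work specific to this project
    components: frozenset = frozenset()  # shared components it depends on


def standalone_cost(project):
    """Cost when the project is evaluated in isolation (the naive scatterplot)."""
    return project.own_cost + sum(COMPONENT_COST[c] for c in project.components)


def marginal_cost(project, built):
    """Cost once the shared components in `built` already exist."""
    missing = project.components - built
    return project.own_cost + sum(COMPONENT_COST[c] for c in missing)


def rank(projects, built=frozenset()):
    """Order project names by value per unit of marginal cost, best first."""
    ordered = sorted(
        projects,
        key=lambda p: p.value / marginal_cost(p, built),
        reverse=True,
    )
    return [p.name for p in ordered]


PROJECTS = [
    Project("report_automation", value=6.0, own_cost=3.0),
    Project("churn_model", value=12.0, own_cost=2.0,
            components=frozenset({"customer_data_pipeline"})),
    Project("new_data_product", value=22.0, own_cost=6.0,
            components=frozenset({"customer_data_pipeline", "feature_store"})),
]

# Naive view: every project pays for everything it needs.
print(rank(PROJECTS))

# Linked view: the pipeline exists because churn_model was attempted.
# The component is reusable whether or not churn_model itself succeeded.
print(rank(PROJECTS, built=frozenset({"customer_data_pipeline"})))
```

In this toy example the ambitious `new_data_product` ranks last when judged in isolation, but it moves ahead of `report_automation` once the shared pipeline has been built. That is the effect the lines on the scatterplot are meant to reveal. Rerunning the ranking with updated estimates each quarter fits the review cadence suggested above.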
We’re currently at a moment in the development of machine learning, AI, and data where the technology isn’t commoditized and it’s not entirely obvious where to invest. Companies with excellent data strategies will be more likely to choose well.