The best product managers make the right decisions about what features to build in order to launch a successful product. These decisions are difficult to make, in part because it’s hard to predict what features users want. It’s impossible to know with 100% certainty if users will adopt a given feature once it’s released, let alone if they’ll continue using it 5 years from now.
Product managers gain insights and data about what users have done in the past to help them predict what users will do in the future. This practice commonly takes the form of surveys, interviews, and in-product analytics. If you survey 50 users and find that 40 users have a problem finding restaurants, you’ve reduced the risk that building a restaurant discovery app will be a waste of resources.
But as the old investing adage goes, “past performance is no guarantee of future results.” In product terms, the fact that a group of users responded positively to a survey last month doesn’t mean a larger group of users will be delighted next year.
Therefore, product managers need to strike a balance between 2 methods for making product decisions: intuition and data. If you over-rely on data from the past, you hinder your ability to adapt to insights you gain in the future. If you over-rely on intuition, you miss out on critical data about your users that is likely to have some predictive power about the future.
The concept of “overfitting” provides a useful heuristic for striking a balance between intuition and data, and for deciding how to continuously iterate on your product.
What is overfitting?
I learned about the concept of overfitting from a book called Algorithms to Live By: The Computer Science of Human Decisions by Brian Christian and Tom Griffiths, which applies computer science concepts to everyday decision-making. Jon Vars, Chief Product Officer at Varo Money and former Chief Product Officer at TaskRabbit, recommended the book when I interviewed him on This is Product Management.
Overfitting is a statistical concept that’s often applied to machine learning and investing, but it’s also a useful decision-making model for product managers. Overfitting occurs when a model is excessively complex, or contains too many parameters relative to the amount of data. An overfitted model corresponds too closely to a particular set of data, and may therefore fail to fit additional data or predict the future reliably because it over-relies on the underlying data and overreacts to minor fluctuations in the data.
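To make the definition concrete, here is a minimal sketch in Python. The weekly signup numbers are invented for illustration, and the two "models" are my own choices rather than anything from the book: a straight line with 2 parameters, and a curve with as many parameters as data points that passes through every past observation exactly.

```python
# Hypothetical weekly signup counts, invented purely for illustration.
train_weeks = [1, 2, 3, 4, 5, 6]
train_signups = [12, 15, 13, 18, 17, 21]
future_weeks = [7, 8]          # data the models never saw
future_signups = [22, 24]


def fit_line(xs, ys):
    """Simple model: ordinary least-squares straight line (2 parameters)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sum(
        (x - mean_x) ** 2 for x in xs
    )
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def fit_interpolant(xs, ys):
    """Overly complex model: the Lagrange polynomial through every point."""

    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total

    return predict


def mse(model, xs, ys):
    """Mean squared error of a model's predictions."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


simple = fit_line(train_weeks, train_signups)
complex_model = fit_interpolant(train_weeks, train_signups)

for name, model in [("line", simple), ("interpolant", complex_model)]:
    print(
        name,
        "past error:", round(mse(model, train_weeks, train_signups), 2),
        "future error:", round(mse(model, future_weeks, future_signups), 2),
    )
```

The complex curve reproduces the past perfectly, yet it forecasts roughly 114 and 502 signups for weeks 7 and 8, when the made-up actuals are 22 and 24. The straight line fits the past less closely and misses the future by less than one signup per week.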
Overfitting is likely to be a problem under conditions that product managers are quite familiar with. According to Christian and Griffiths, overfitting poses a risk under 2 conditions:
- The odds of estimating correctly are low
- The weight that the model puts on the data is high
These conditions are almost always present in product management. There’s never enough data to make a decision and the chances of building a successful product are inherently low. Christian and Griffiths elaborate:
“When you’re truly in the dark, the best-laid plans will be the simplest. When our expectations are uncertain and the data are noisy, the best bet is to paint with a broad brush, to think in broad strokes.”
When you test product and feature ideas with a small sample size, such as surveying 50 users per the example above, it’s likely that the results won’t perfectly represent a larger population of users and won’t necessarily predict users’ behavior into the distant future. The product needs to fit the data you have, while still adapting to the data you haven’t seen yet. So, you don’t want to build and launch a robust product or high-fidelity prototype based on a limited amount of data.
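One way to see how much uncertainty a 50-person survey carries is to put a confidence interval around the result. The sketch below uses the Wilson score interval, which is my choice of method and not something the book prescribes. The 400-of-500 comparison is a hypothetical larger survey with the same 80% result.

```python
from math import sqrt
from statistics import NormalDist


def wilson_interval(successes, n, confidence=0.95):
    """Wilson score interval for a proportion observed in a sample of size n."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# 40 of 50 surveyed users report the problem (the example from this article).
small_low, small_high = wilson_interval(40, 50)

# Hypothetical larger survey with the same 80% result, for comparison.
large_low, large_high = wilson_interval(400, 500)

print(f"50 users:  {small_low:.1%} to {small_high:.1%}")
print(f"500 users: {large_low:.1%} to {large_high:.1%}")
```

For 40 of 50, the interval runs from roughly 67% to 89%. That range is wide enough to argue against betting a multi-year roadmap on the exact figure, while still making it fairly clear that most users have the problem.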
Counterintuitively, however, the solution to the problem of overfitting is not simply to use less data to make decisions. The fact that a limited sample cannot perfectly predict the future doesn’t mean it’s useless. If you over-rely on intuition instead of data, you run the risk of “underfitting.”
What is underfitting?
Underfitting occurs when a model cannot capture the underlying trend of the data. In product terms, this means over-relying on intuition instead of running experiments to determine what users want and making decisions accordingly. For example, if your data shows that 80% of users prefer feature set A over feature set B, but you build feature set B anyway, you are underfitting. While your data may not be statistically significant, it’s still informative.
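A quick calculation shows why a result can be informative even when it falls short of statistical significance. The sample sizes here are assumptions for illustration, since the example above doesn’t specify one. The sketch asks how likely it would be to see at least 80% of users prefer feature set A if users were in fact evenly split.

```python
from math import comb


def prob_at_least(k, n, p=0.5):
    """Probability of k or more successes in n trials with success chance p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Assumed sample: 8 of 10 users prefer feature set A.
p_small = prob_at_least(8, 10)

# Assumed larger sample with the same 80% preference: 80 of 100 users.
p_large = prob_at_least(80, 100)

print(f"8 of 10:   {p_small:.4f}")
print(f"80 of 100: {p_large:.2e}")
```

With 10 users, the chance of seeing this result from an evenly split population is about 5.5%, just above the conventional 5% cutoff. That isn’t proof, but it’s hardly a reason to ignore the data and build feature set B.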
Prior to the proliferation of experiment-driven product management, led by Steve Blank, Bob Dorf, and Eric Ries, intuition was the primary way that product managers made decisions. They relied on large-scale, upfront studies that became outdated by the time they made decisions about products and features.
The challenge with this approach is that if your “model” (product) can’t accurately represent the past, there isn’t an indication that it will accurately represent the future. In order to accurately represent the past, you need to gain insights through surveys, interviews, and prototype testing throughout the product life cycle, and use the data from these experiments to inform your decisions.
Lessons for product managers
If you over-rely on data, you run the risk of overfitting, and failing to predict what users want. If you under-rely on data, you run the risk of underfitting, and building something without knowledge of what your users actually want.
How do you find the right balance? My company, Alpha, has an integrated experimentation platform that a quarter of the Fortune 100 use to run 600+ tests per month. These teams have found that continuous iteration is a powerful solution. More specifically, they’ve discovered the benefits of increasing the number of experiments they run instead of simply increasing the sample size of each experiment. Below are 6 action steps to find the right balance between overfitting and underfitting and incorporate iteration into your product development.
1. Start testing
Avoid underfitting by getting data about what your users want. If you don’t have any data, you’re flying blind. Surveys, split tests, and prototypes can help. Don’t make a big bet before getting some data to give you confidence it’s the right bet.
2. Limit your forecasting
If you over-rely on the data you get in the early stages of testing, you run the risk of overfitting. If you create a product roadmap for 5 years out based on a survey of 50 users, you’re probably overfitting. Instead, consider limiting your roadmap to a few months out, and continually iterating as you gain additional user insights.
3. Don’t do everything users say
When one user tells you they want a feature that will automatically call their mother on her birthday, but the other 49 users you talk to tell you they don’t want that feature, it’s likely that the single user is “noise.” It’s an idiosyncrasy that doesn’t represent the greater user base. There is almost always noise in the data when conducting user research. If you were to build this feature, you would be overfitting.
4. Live by data, don’t die by data
In an ideal world, you could survey 10,000 users and know with certainty what products and features will be successful. In reality, there is almost always some degree of error or noise within the data. Building a product that aligns too closely with any single experiment is risky. Maintain humility and have realistic expectations about what data can and can’t do for you.
5. Iterate continuously
It’s important to test a model against data which is outside of the sample used to develop it. In product terms, your goal is not simply to serve the 10 users who participated in your customer discovery survey. Your goal is to serve a large market. Therefore, it’s critical to continuously run experiments throughout the product life cycle. Increase the fidelity of your product and the scope of your experiments as your previous experiments give you a more accurate model. For example, start with a survey, then build a prototype, then launch a product. Identify patterns in the data and use your intuition to determine what to test next.
6. Think big
A large market research study will tell you the size of the market for in-store DVD rentals. It won’t tell you about the demand for subscriptions to unlimited online streaming. While the data might tell you to build a Blockbuster competitor, it won’t show you the future of entertainment. New technologies are reshaping almost every industry, and customer needs are constantly evolving.
Have a vision for the future and dig into meaningful customer pain points instead of simply focusing on what you can measure.