The humble A/B test is a widely used, important tool in a design team’s toolkit. But it isn’t the only one.
All too often we get comfortable with the familiar. But there are reasons why it pays to go beyond A/B testing, and not just for variety’s sake. Here’s the dirty little secret: Sometimes an A/B test might not even be the best way for you to figure out whether something works. It all depends on what you’re testing, and why.
A well-rounded design team should have an arsenal of testing methods. So let’s examine why you might not want to use A/B tests all the time—and explore some exciting alternatives.
Why relying on A/B tests can be bad
I get it. Using A/B tests to drive massive improvement is appealing, especially if you have stakeholders breathing down your neck. And the idea of building a culture of experimentation is important for business health.
But there are a couple of problems with exclusively using A/B tests, and it’s important to understand why they’re not always a good fit:
- They can take a long time. Depending on your traffic, you might not see results for weeks. That’s time you could spend trying other methods or alternatives, and it’s an opportunity cost you need to recognize.
- Most of the time, they’re based on your own beliefs. Think about what you’re testing and why. If you haven’t based your test on any sort of input—like user testing—then you’re just testing your own biases. There’s nothing necessarily wrong with that, but it’s important to recognize.
- Many people don’t know how to execute them properly. A good optimization manager is worth their weight in gold. If you have someone who’s unskilled running your program, they can make small mistakes with huge consequences. Executing the wrong types of tests, ending tests too early, and improper segmentation are huge misfires.
- You don’t test like for like. This is a huge one. You can’t provide two wildly different experiences and then pinpoint what caused the result. It could be a number of different things, and inexperienced teams running A/B tests make that mistake all too often. The result? It’s all guesswork.
- It stops you from getting stuff done. It can be way too tempting to just keep testing and never actually deploy anything. Speed is everything, and waiting until you’re absolutely sure about something—which is a trap UX researchers often fall into—can lead to lost gains.
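To put a rough number on the "long time" problem, here is a minimal sketch of a sample-size estimate. It assumes the common normal-approximation formula for comparing two conversion rates; the function names and the inputs (a 3% baseline, a 10% relative lift, 2,000 visitors a day) are made up for illustration, not taken from any real program.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant (two-sided test, normal approximation)."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the significance level
    z_beta = NormalDist().inv_cdf(power)           # critical value for the desired power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2)


def days_to_run(n_per_variant, daily_visitors, variants=2):
    """Days needed if daily traffic is split evenly across all variants."""
    return ceil(n_per_variant * variants / daily_visitors)


# Hypothetical inputs: 3% baseline conversion, hoping to detect a 10% relative lift.
n = sample_size_per_variant(baseline_rate=0.03, relative_lift=0.10)
print(n, "visitors per variant,", days_to_run(n, daily_visitors=2000), "days")
```

With inputs like these the estimate lands in the tens of thousands of visitors per variant, which is why a low-traffic site can wait weeks for an answer. Committing to the number up front also helps guard against ending a test too early.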
So what else is there? It depends on your circumstances, your designs, and your business needs, but there are effective alternatives you can put to good use.
Beta environment testing
One of the disadvantages with an A/B test is that you can’t really test wildly different experiences. The more you do, the more you risk mucking up your analysis later on. Beta environments provide a great opportunity to go a little wild.
For designers and copywriters, it’s a dream come true. You can go nuts creating the design and language you really want without worrying about a like-for-like test.
Users are also far more forgiving if you tell them up front that it’s a beta environment, and give them the chance to opt in. Plus, you can explicitly ask for feedback in a way you can’t with other tests: Fill out this form, let us know about any bugs, etc.
Sure, there are technical limitations with beta tests. They’re not for everyone. But if you can experiment with this method, you should.
Multi-armed bandit tests
If you haven’t heard of these before, they can sometimes take a minute to understand. Forget testing two or more different versions of a particular page and giving equal traffic to all of them. Instead, take a wide variety of challengers, then keep shifting more traffic to whichever one is currently performing best.
It’s a smarter version of an A/B test. You’re not losing time on initiatives that don’t work, and you’re prioritizing efforts on the highest-achieving challengers. You’re working faster and more efficiently, exploring and learning as you go. You don’t need to wait several weeks to declare a winner.
Sounds great, right? It is. Except for the one downside: technical complexity. Sure, there are ways you can run a bandit test through optimization services, but they’re generally pretty complex. You need someone who really knows what they’re doing.
Again, it’s important to understand what does and doesn’t work for these types of tests: Wildly different experiences are not the way to go. These are more beneficial for things like different article headlines and CTAs.
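If the idea still feels abstract, a small simulation can help. The sketch below uses epsilon-greedy, one of the simplest bandit strategies: most visitors see the current best performer, and a small share (`epsilon`) see a random challenger so the others still get a chance. The class, the headline names, and the conversion rates are all hypothetical; optimization services may use different strategies, such as Thompson sampling.

```python
import random


class EpsilonGreedyBandit:
    """Minimal epsilon-greedy bandit over a set of named variants."""

    def __init__(self, variants, epsilon=0.1, rng=None):
        self.epsilon = epsilon
        self.rng = rng or random.Random()
        self.shows = {v: 0 for v in variants}
        self.conversions = {v: 0 for v in variants}

    def rate(self, variant):
        """Observed conversion rate so far (0.0 if never shown)."""
        shows = self.shows[variant]
        return self.conversions[variant] / shows if shows else 0.0

    def choose(self):
        """Pick the variant to show to the next visitor."""
        untried = [v for v, n in self.shows.items() if n == 0]
        if untried:
            return untried[0]  # show every variant at least once
        if self.rng.random() < self.epsilon:
            return self.rng.choice(list(self.shows))  # explore
        return max(self.shows, key=self.rate)  # exploit the current leader

    def record(self, variant, converted):
        """Record one impression and whether it converted."""
        self.shows[variant] += 1
        if converted:
            self.conversions[variant] += 1


def simulate(true_rates, visitors, epsilon=0.1, seed=0):
    """Run simulated visitors against made-up 'true' conversion rates."""
    rng = random.Random(seed)
    bandit = EpsilonGreedyBandit(list(true_rates), epsilon=epsilon, rng=rng)
    for _ in range(visitors):
        variant = bandit.choose()
        bandit.record(variant, rng.random() < true_rates[variant])
    return bandit


# Hypothetical headlines and conversion rates, for illustration only.
result = simulate({"headline_a": 0.02, "headline_b": 0.05, "headline_c": 0.15}, 20000)
print(result.shows)
```

In a run like this, most of the traffic should end up on the strongest headline while the weaker ones keep receiving a trickle. That is the trade-off in a nutshell: you give up clean, equal-sized comparison groups in exchange for wasting fewer visitors on losers.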
Usability testing
Many people don’t think of usability testing as something you can do instead of A/B testing. But you can.
Instead of doing a bunch of testing, then crafting designs to put in an A/B test, simply extend your research. Recruit more users, then do the research in waves. You’re not going to find out which variation technically performs better, but you’re going to get more specific feedback on each one.
Is it a like-for-like replacement? No. And we all know designs that test well don’t necessarily perform well. But if you’re hamstrung or want more specific feedback, this is the way to go.
Polling
The lesser cousin of user testing, polling can still give you some useful information. When users leave your new experience, simply follow up. What did you like? What didn’t you like? How would you rate it on a scale of one to 10?
Obviously, there are problems with this: Selection bias might mean people who disliked the experience most are more likely to comment. It’s also not scientific.
But if you combine it with some other methods such as screen capture and heat maps, it can provide guidance for your next design pass.
When should you really use A/B testing?
After all this, it might seem like A/B testing has no place. Not true. It’s effective when you’re making significant changes to the design of a site; when you’re changing prices; or when you want to see how small, specific changes can alter your conversion rates.
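For the conversion-rate case, the analysis itself can be fairly small. Here is a minimal sketch of a two-proportion z-test, one common way to check whether the gap between two variants is bigger than chance would explain. The function name and the visitor counts are invented for illustration, and a real program would also fix the sample size in advance rather than peeking at results.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)  # rate if both variants were really the same
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical counts: 300 of 10,000 visitors converted on A, 360 of 10,000 on B.
z, p = two_proportion_z_test(300, 10_000, 360, 10_000)
print(f"z = {z:.2f}, p = {p:.3f}")
```

A small p-value suggests the difference is unlikely to be noise, but it says nothing about why B did better. That part still comes down to testing like for like.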
You just need to be insightful, decisive, and deliberate. Understand what you’re testing and why, and what results you’re after. Once you have those in mind, you can pick the testing method that will most accurately measure the results you want.