A/B testing is great when it works. But it doesn’t always work. Maybe your site gets 300 visitors a month and you’d need three years to reach a confident result. Maybe you’re trying to figure out why people leave, not just which button they click. Or maybe you’re spending money on ads and need to know if the whole campaign matters. Not just which headline converts 0.3% better.
Good news: there are solid alternatives to A/B testing for every one of those situations. Some need fewer visitors. Some answer different questions entirely. And you’re probably already doing a few of them without realizing it.
Here are seven methods worth knowing, when to use each one, and a simple framework for picking the right approach.
When you actually need an alternative to A/B testing
A/B testing has one big requirement: traffic. Lots of it.
A 2025 survey of 402 marketers by Ascend2 found that 51% say limited traffic is their number-one A/B testing challenge. That’s not a small-company problem. That’s the most common problem across the board.
Mohit Agrawal, who built Wealthfront’s growth engineering team, did the math. A startup with around 500 monthly visitors and a 2% conversion rate would need 1,254 days to detect a 5% improvement. That’s nearly three and a half years for one test.
And even when companies do have the traffic, A/B testing results are humbling. An analysis of 1,001 real A/B tests found that only 33.5% produced a statistically significant positive result. At Microsoft, one in three experiments actually improved the metric it was designed to improve. At Google, the number was closer to one in ten.
None of this means A/B testing is bad. When you can run it properly, following A/B testing best practices produces compounding results. For product teams, our product A/B testing guide covers the PM-specific workflow. It just means A/B testing is one tool. And sometimes you need a different one.
You probably need an alternative when:
- Your site gets fewer than 1,000 visitors a month. You’ll wait forever for a confident result. Check your sample size requirements to see exactly how long.
- You need the “why,” not just the “what.” A/B tests tell you which version won. They don’t tell you why visitors preferred it.
- You’re evaluating whole campaigns or channels. Should you be spending on Facebook ads at all? A/B testing a headline won’t answer that.
- You’re in the early design stage. Testing two live pages assumes you already have two solid options. Sometimes you need to figure out what to build first.
- You want to deploy safely. Releasing new code without breaking things is a deployment problem, not a testing problem.
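To make the traffic point in the list above concrete, here’s a rough sketch of the standard normal-approximation formula for a two-sided, two-proportion test. The function names, the 1,000 monthly visitors, the 2% baseline, and the 20% relative lift are all illustrative assumptions, and real sample size calculators may use slightly different formulas. It is not meant to reproduce any figure quoted in this article, since those depend on assumptions that aren’t stated here.

```python
from math import ceil
from statistics import NormalDist


def visitors_per_variant(baseline_rate, relative_lift, alpha=0.05, power=0.80):
    """Approximate visitors needed per variant to detect a relative lift."""
    p1 = baseline_rate
    p2 = baseline_rate * (1 + relative_lift)
    # z-scores for the significance level (two-sided) and the desired power
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_power) ** 2 * variance / (p2 - p1) ** 2)


def months_to_finish(monthly_visitors, per_variant, variants=2):
    """How long the test runs if all traffic is split across the variants."""
    return per_variant * variants / monthly_visitors


# Hypothetical site: 1,000 visitors a month, 2% baseline, hoping for a 20% relative lift
needed = visitors_per_variant(baseline_rate=0.02, relative_lift=0.20)
print(needed, "visitors per variant")
print(round(months_to_finish(1000, needed), 1), "months at 1,000 visitors/month")
```

With these made-up inputs the answer lands at roughly 21,000 visitors per variant, which is years of traffic for a small site. Smaller lifts push that number up fast, because the required sample grows with the inverse square of the difference you’re trying to detect.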
Our take: Most small businesses don’t need more testing. They need the right kind of testing. If your site gets under 1,000 monthly visitors, skip A/B testing for now. Start with usability testing and customer interviews. You’ll learn more in a week than a year of waiting for statistical confidence.
Usability testing: find out why people leave
Usability testing means watching real people try to do something on your site (buy a product, sign up, find a page) and seeing where they get stuck. Any task works. It’s the opposite of A/B testing in almost every way.
A/B testing answers: “Which version converts better?” Usability testing answers: “Why are people struggling?”
The numbers are surprisingly forgiving. Research by Nielsen and Landauer (1993) found that five usability test participants uncover about 85% of all usability problems. Five people. Not five thousand.
Picture this: you ask five friends to order something from your website. You watch over their shoulder. Within an hour, you’d spot every major friction point. That’s usability testing.
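The “five people” figure comes from a simple model: if each participant independently runs into a given problem with probability L, then n participants find a share of 1 - (1 - L)^n of the problems. Here’s a small sketch of that curve. The value L = 0.31 is the average Nielsen reported across the projects he studied. Treat it as an assumption, since your own site could be higher or lower, and the function name is just for illustration.

```python
def share_of_problems_found(participants, discovery_rate=0.31):
    """Share of usability problems found by n participants.

    discovery_rate is the chance that a single participant hits a given
    problem (0.31 is an assumed average, not a constant of nature).
    """
    return 1 - (1 - discovery_rate) ** participants


for n in (1, 3, 5, 10, 15):
    print(n, "participants:", f"{share_of_problems_found(n):.0%}")
```

The curve flattens quickly. That’s the argument for running several small rounds of five people instead of one big round of fifteen.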
When to choose usability testing over A/B testing
- You need to understand why people are bouncing, not just which page bounces less
- You’re in the early design phase and don’t have a live page yet
- Your traffic is too low for reliable A/B test results
- You want to generate hypotheses (ideas for what to A/B test later)
When to stick with A/B testing
- You have two polished design options and enough traffic to compare them
- You need hard numbers to present to stakeholders
- You’re optimizing something specific, like your conversion rate, and want proof
- You’re validating UX changes (UX-focused A/B testing has its own considerations around what to measure and how to frame hypotheses)
The best approach combines both. Use usability testing to discover what’s broken. Then fix it and A/B test the fix to prove it works. That way you’re not guessing what to test, and you’re not waiting months for a test to finish.
Jakob Nielsen from NN/g has pointed out that A/B testing creates a “near-term view.” It misses bigger structural problems. A/B tests improve what exists. Usability tests question whether what exists is the right thing at all.
Preference testing: which design feels better?
Preference testing means showing people two or more design options and asking which one they prefer. (Some call it desirability testing.) It measures first impressions and gut reactions.
It sounds similar to A/B testing, but it’s solving a different problem at a different stage.
| | Preference testing | A/B testing |
|---|---|---|
| When | Before launch (wireframes, mockups) | After launch (live pages) |
| What it measures | Opinions and first impressions | Real behavior (clicks, signups, purchases) |
| Sample size | 20-30 people | Hundreds to thousands of visitors |
| Data type | Mostly qualitative (“I prefer this because…”) | Purely quantitative (conversion rates) |
The User Interviews field guide recommends 20 to 30 participants for a reliable preference test. You can analyze results with a chi-square test. It just tells you if the preference is real or random noise.
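For a two-option preference test, the chi-square check is small enough to do by hand. The sketch below compares the observed votes against an even 50/50 split (a goodness-of-fit test with one degree of freedom). The function name and the vote counts are made up for illustration, and the p-value shortcut used here only holds for the two-option case.

```python
import math


def preference_chi_square(votes_a, votes_b):
    """Chi-square goodness-of-fit test against a 50/50 split (two options only)."""
    total = votes_a + votes_b
    expected = total / 2
    statistic = ((votes_a - expected) ** 2 + (votes_b - expected) ** 2) / expected
    # With one degree of freedom, the chi-square tail equals erfc(sqrt(x / 2))
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical result: 30 participants, 21 prefer design A, 9 prefer design B
statistic, p_value = preference_chi_square(21, 9)
print(f"chi-square = {statistic:.2f}, p = {p_value:.3f}")
```

A small p-value suggests the preference probably isn’t random noise. With only 20 to 30 people, a close split like 17 to 13 won’t reach that bar, which is itself useful to know before you commit to a design.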
The catch
People saying they prefer something doesn’t always predict what they’ll actually do. A 2025 study in Tourism Management found that survey-based tests now account for over 40% of studies in some fields. But conclusions based on stated preferences can be systematically wrong. People say one thing and do another.
That’s why preference testing works best as a filter, not a final answer. Use it to narrow down your options. Then test the winner with real behavior data.
Our take: Preference testing is great for picking between two logo options or landing page layouts before you build them. It’s not great for deciding which checkout flow converts better. For that, you need real traffic and real behavior.
Incrementality testing: does this campaign even matter?
This one is different from the methods above. A/B testing compares Version A to Version B. Incrementality testing asks a bigger question: “Did this campaign actually cause new sales, or would those customers have bought anyway?”
That distinction matters more than you’d think.
Here’s an example. You’re spending $5,000 a month on Facebook ads. Your dashboard shows 200 conversions from that campaign. Great, right? But how many of those 200 people would have found you through Google, direct visits, or word of mouth anyway? If the answer is 180, your ads only drove 20 new conversions. Your real return on ad spend is much lower than it looks.
Incrementality testing uses holdout groups to measure the real lift. A holdout group is a set of people who deliberately don’t see the ad. It’s like a control group for your entire marketing channel.
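Here’s a minimal sketch of the holdout arithmetic, using the Facebook example above. The audience sizes (10,000 people who saw ads, 1,000 held out) and the function name are assumptions added for illustration. Only the $5,000 spend and the 200 / 180 / 20 split come from the example.

```python
def incremental_conversions(exposed_users, exposed_conversions,
                            holdout_users, holdout_conversions):
    """Conversions the ads caused, beyond what the holdout rate predicts."""
    # Scale the holdout's conversions up to the size of the exposed group
    expected_without_ads = holdout_conversions * exposed_users / holdout_users
    return exposed_conversions - expected_without_ads


spend = 5000
# Hypothetical audiences: 10,000 people saw ads, 1,000 were held out
extra = incremental_conversions(
    exposed_users=10_000, exposed_conversions=200,
    holdout_users=1_000, holdout_conversions=18,
)
print("Incremental conversions:", extra)
print("Dashboard cost per conversion:", spend / 200)
print("Cost per incremental conversion:", spend / extra)
```

A real lift test also needs random assignment to the holdout and a check that the difference isn’t noise. This sketch covers only the basic subtraction.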
The data backs it up
Braun and Schwartz analyzed 181,890 A/B tests on Meta’s ad platform for a 2025 study in the Journal of Marketing. The platform’s delivery algorithm doesn’t actually randomize users. It routes different ads to different types of people. So the “winning” ad might have won because it reached more receptive audiences, not because it was better.
Incrementality tests (Meta calls them “lift tests”) didn’t show this problem. When you hold back an entire group from seeing any ad, the delivery algorithm can’t bias the results.
This isn’t a niche concern. 71% of advertisers now call incrementality their most important metric for retail media investments, according to a 2024 ANA survey. And roughly 52% of US marketers now use some form of incrementality testing. A few years ago, almost nobody did.
When incrementality testing is the better choice
- Evaluating whether a whole channel (Facebook, Google, email) drives net-new revenue
- Proving marketing ROI to leadership
- Working in a privacy-first world where individual user tracking is getting harder
- Making budget allocation decisions (“Should we spend more on Search or Display?”)
When A/B testing is better
- Optimizing individual page elements (headlines, buttons, images)
- Comparing specific design variations
- You have enough traffic and want solid conversion comparisons
The two methods aren’t competitors. They answer different questions at different levels.
Blue-green deployment: safe releases, not testing
This one comes up in A/B testing conversations a lot, usually because the terms sound similar. But blue-green deployment and A/B testing solve completely different problems.
Blue-green deployment means running two identical versions of your website infrastructure (the “blue” environment and the “green” environment). One is live. The other holds the new version. When you’re ready, you switch traffic from blue to green. If something breaks, you switch back instantly.
It’s a safety net for code releases. Not a way to measure user behavior.
| | Blue-green deployment | A/B testing |
|---|---|---|
| Purpose | Deploy safely with zero downtime | Measure which version users prefer |
| What it tests | “Does this new code work?” | “Does this design convert better?” |
| Traffic split | All-or-nothing switch | Percentage-based split |
| Audience | Engineering/DevOps teams | Marketing/product teams |
| Rollback | Instant (flip the switch) | End the test, pick the winner |
A related approach is canary releases: rolling out to a small percentage of visitors first, then expanding. That’s closer to A/B testing in mechanics, but the goal is still risk reduction, not conversion measurement.
For the full breakdown on how deployment strategies relate to testing, see our guide on feature flags vs A/B testing.
Where the two worlds overlap: you might use blue-green deployment to safely release the winning version from an A/B test. But the deployment itself isn’t the test.
Hypothesis testing: the bigger picture
If someone asks you “what’s the difference between A/B testing and hypothesis testing?”, the short answer is simple. A/B testing is one specific kind of hypothesis testing. It’s like asking the difference between a golden retriever and a dog.
Hypothesis testing is the broad statistical idea of making a prediction, then collecting data to see if it’s true. There are many types:
- A/B test: show two versions to real users, see which performs better
- t-test, z-test: compare averages between two groups in existing data
- Chi-square test: check if there’s a meaningful relationship between categories
- ANOVA: compare averages across three or more groups
In everyday marketing, when people say “hypothesis testing,” they usually mean designing a marketing experiment with a clear prediction. “I think changing the headline from ‘Welcome’ to ‘Start your free trial’ will increase signups by 10%.” The A/B test is how you check it. The null hypothesis is the assumption that the change made no difference.
You don’t need to understand all these statistical methods. What matters is this: A/B testing is one way to test a hypothesis. Can’t run an A/B test? Not enough traffic, no live page, or not a comparison question? There are other ways to test your ideas. Customer interviews, usability sessions, surveys, analytics reviews. The hypothesis matters more than the method.
More methods worth knowing
The five methods above cover the most common alternatives. But there are a few more that come up in practice, and they’re worth a quick look.
Multi-armed bandits
Instead of splitting traffic 50/50 and waiting, a bandit algorithm automatically sends more traffic to whichever version is performing better. It “learns” during the test.
Spotify replaced standard A/B testing on their homepage with contextual bandits and saw 36.6% better efficiency (more clicks from the same views). A peer-reviewed study (Xiang & West, KDD 2022) found bandits reduce wasted traffic by 30-60% compared to traditional A/B tests.
Best for: continuous optimization where you care more about overall results than learning which version is definitively better. For a deeper look, see our guide to multi-armed bandit testing.
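To show what “learning during the test” can look like, here’s a minimal sketch of Thompson sampling, one common bandit strategy. It is a simple non-contextual simulation, not the contextual approach described for Spotify. The conversion rates, the number of rounds, and the function name are all made up for illustration.

```python
import random


def thompson_sampling(true_rates, rounds, seed=0):
    """Simulate a bandit and return how many visitors each version received."""
    rng = random.Random(seed)
    arms = range(len(true_rates))
    successes = [0] * len(true_rates)
    failures = [0] * len(true_rates)
    pulls = [0] * len(true_rates)

    for _ in range(rounds):
        # Draw a plausible conversion rate for each version from its Beta posterior
        guesses = [rng.betavariate(successes[i] + 1, failures[i] + 1) for i in arms]
        chosen = guesses.index(max(guesses))
        pulls[chosen] += 1
        # Simulate whether this visitor converts (true_rates is unknown in real life)
        if rng.random() < true_rates[chosen]:
            successes[chosen] += 1
        else:
            failures[chosen] += 1
    return pulls


# Hypothetical versions: A converts at 5%, B at 20%
pulls = thompson_sampling(true_rates=[0.05, 0.20], rounds=5000)
print("Visitors sent to A and B:", pulls)
```

In a run like this, most visitors end up on the stronger version well before the simulation ends. The trade-off is the one noted above: the weaker version gets little traffic, so you learn less about exactly how much worse it is.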
Causal inference methods
These are ways to estimate cause-and-effect from observational data (data you already have) without running a controlled test. The main ones:
- Synthetic control: builds a statistical “twin” of the group that got the change, using other groups that didn’t. Google’s CausalImpact tool makes this accessible.
- Difference-in-differences: compares how two groups change over time, before and after an intervention.
- Regression discontinuity: measures the effect of a change at a specific cutoff point (like a loyalty tier threshold). A 2016 study validated it against a randomized controlled trial and found comparable results.
These sound academic, but Microsoft, Google, and Amazon use them regularly when traditional A/B testing isn’t practical.
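Difference-in-differences is the easiest of the three to sketch. You subtract the control group’s change from the treated group’s change, and what’s left is the estimated effect. The regions, the conversion rates, and the function name below are hypothetical. The estimate only holds if both groups would have moved in parallel without the change, which is an assumption you’d need to check against earlier data.

```python
def diff_in_diff(treated_before, treated_after, control_before, control_after):
    """Estimated effect = change in the treated group minus change in the control group."""
    treated_change = treated_after - treated_before
    control_change = control_after - control_before
    return treated_change - control_change


# Hypothetical conversion rates (in %) for two regions, before and after
# a new checkout flow launched in the treated region only
effect = diff_in_diff(
    treated_before=2.0, treated_after=2.6,
    control_before=2.0, control_after=2.2,
)
print(f"Estimated effect: {effect:.1f} percentage points")
```

Here the treated region rose 0.6 points, but the control region rose 0.2 points with no change at all. So only about 0.4 points can reasonably be credited to the new flow.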
Heuristic evaluation (expert review)
A conversion rate optimization expert reviews your site against a checklist of best practices. No users, no traffic, no waiting. Resources like the Baymard Institute and GoodUI publish thousands of research-backed guidelines.
Best for: quick wins when you have almost no traffic and need to improve your site right now.
Customer surveys and interviews
Ask your customers directly: what almost stopped you from buying? What was confusing? What would make this easier? The Financial Times used 27 user interviews, 150 diary entries, and a 471-person survey to inform their app redesign. Zero A/B tests.
Session recordings and heatmaps
Watch what real visitors do on your site: where they click, how far they scroll, where they rage-click. Tools like Hotjar and FullStory show you patterns without needing a controlled test.
There’s also A/A testing as a validation method: you run two identical versions against each other to confirm your tool isn’t generating false winners before you start real experiments.
If you’re already using one of these, you’re already doing a form of conversion research. That’s not a lesser alternative to A/B testing. That’s the foundation A/B testing should be built on.
How to pick the right method for your situation
Here’s the decision framework. Find your situation in the table, get your method.
| Your situation | Best method | Why |
|---|---|---|
| Low traffic (under 1,000/month) | Usability testing, heuristic review, customer interviews | You’ll get actionable insights in days, not months |
| Need the “why” behind behavior | Usability testing, session recordings, surveys | Qualitative data explains the numbers |
| Evaluating channel ROI | Incrementality testing | A/B testing can’t answer “does this whole channel matter?” |
| Comparing early-stage designs | Preference testing | Test before you build |
| Deploying new features safely | Blue-green deployment, canary releases | Different problem entirely |
| Optimizing live pages with decent traffic | A/B testing | Still the gold standard for conversion decisions |
| Continuous optimization | Multi-armed bandits | Learn and optimize simultaneously |
Most teams don’t need just one method. Use usability testing to find problems. Use customer interviews to understand motivation. Then, when you have enough traffic, A/B test your best ideas with Kirro and let the numbers pick the winner. (Here’s how Kirro works if you’re curious.)
That’s not hedging. It’s how companies like Booking.com (which runs thousands of tests per year) and Spotify actually work. They use the right tool for each question.
For more on how A/B testing and multivariate testing compare, or how to handle the statistics behind testing, we’ve got deeper guides on those too.
FAQ
What are alternatives to A/B testing?
The main alternatives are: usability testing (watching people use your site), preference testing (asking which design they prefer), incrementality testing (measuring whether a campaign caused real results), and multi-armed bandits (shifting traffic toward winners automatically). There’s also heuristic evaluation (expert review), customer surveys, and session recordings. Each answers a different question. A/B testing asks “which version converts better?” These methods answer why people behave a certain way, whether a campaign matters, and what to build.
Is A/B testing necessary?
Not always. If your site gets fewer than about 1,000 monthly visitors, the math doesn’t work for traditional A/B testing. You’d wait months or years for a reliable result. Bayesian approaches can help with smaller samples. But usability testing and expert reviews often give you faster, more useful answers. Even at well-funded companies, only about one in three A/B tests produces a positive result. And researchers Amano and Joo argue in HBR that the p < 0.05 significance threshold can require 24 to 55 times more data than alternative decision methods. That’s a lot of wasted traffic.
What is the alternative name for A/B testing?
A/B testing is also called split testing, bucket testing, or just a controlled experiment. They all mean the same thing: showing two (or more) versions to different visitors and comparing the results.
What are the 4 types of tests?
The four main types are: A/B tests (comparing two versions), multivariate tests (testing multiple elements at once), split URL tests (sending traffic to entirely different page URLs), and multi-page tests (testing changes across a funnel). For a detailed comparison, see our guide to A/B testing vs multivariate testing.
Can you do A/B testing with low traffic?
Technically, yes. Bayesian A/B testing methods update confidence continuously instead of waiting for a fixed sample size. That helps with smaller traffic. But “works” and “works well” are different things. With very low traffic (under 500 monthly visitors), even Bayesian methods take a long time to reach useful confidence. Usability testing, heuristic review, and customer research will give you faster, clearer answers. Use those to make improvements, grow your traffic, and then start A/B testing.
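If you’re curious what “updating confidence continuously” looks like, here’s a rough sketch of one common Bayesian approach. It models each version’s conversion rate with a Beta distribution and estimates the chance that B beats A by simulation. The uniform prior, the visitor counts, and the function name are assumptions for illustration, not how any particular tool works.

```python
import random


def prob_b_beats_a(conv_a, visitors_a, conv_b, visitors_b, draws=20_000, seed=0):
    """Monte Carlo estimate of P(rate B > rate A) under uniform Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        rate_a = rng.betavariate(conv_a + 1, visitors_a - conv_a + 1)
        rate_b = rng.betavariate(conv_b + 1, visitors_b - conv_b + 1)
        if rate_b > rate_a:
            wins += 1
    return wins / draws


# Hypothetical low-traffic test after one month: 250 visitors per version
print("P(B beats A):", prob_b_beats_a(conv_a=5, visitors_a=250, conv_b=8, visitors_b=250))
```

You can rerun this whenever new data arrives. With only a handful of conversions per version, though, the probability tends to sit well short of certainty, which is the “works” versus “works well” gap described above.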
Randy Wattilete
CRO expert and founder with nearly a decade running conversion experiments for companies from early-stage startups to global brands. Built programs for Nestlé, felyx, and Storytel. Founder of Kirro (A/B testing).