How many visitors does my A/B test need?

The calculator takes three inputs:

  1. Your current conversion rate. The percentage that converts today on the page you're testing.
  2. Your monthly traffic. How many visitors hit this page in a typical month.
  3. The smallest change worth finding. How big a winner do you want to find? A +20% change means going from 2.5% to 3%.

With those example inputs (2.5% baseline, +20% lift), you'll need about 16,792 visitors per variation (33,584 total). That's ~68 days at ~500 visitors/day. Most small sites can't wait that long.

Below the result, a duration slider (2–12 weeks) lets you drag to change the test length; the smallest detectable lift adjusts to fit.

What to do next

Too long for most sites. Try a bigger swing, a higher-traffic page, or accept the longer wait.

Launch your A/B test for free
Working backwards

How long should you actually run this?

Pick a duration. See the conversion rate the new version would need to reach for the test to call it a win. When calendar time is your real constraint, this is the number that matters.

| If you run for | Your baseline | The new version would need to reach | That's a lift of | Visitors collected |
| --- | --- | --- | --- | --- |
| 3 days | 2.50% | 5.30% | +112% | 1,500 |
| 7 days | 2.50% | 4.20% | +68% | 3,500 |
| 14 days | 2.50% | 3.66% | +46% | 7,000 |
| 21 days | 2.50% | 3.43% | +37% | 10,500 |
| 28 days | 2.50% | 3.29% | +32% | 14,000 |
| 56 days | 2.50% | 3.05% | +22% | 28,000 |
| 68 days (your plan) | 2.50% | 3.00% | +20% | 34,000 |
Read it like this: at ~500 visitors/day and a 2.5% baseline, running 14 days only lets you confirm a winner if it reaches 3.66% or higher. Anything smaller, you can't tell if it's real or just random luck.

Three questions: what’s your baseline, how much traffic do you get, and how big a winner do you want to find? The calculator above turns those into visitors per version, an estimated number of calendar days, and a “working backwards” table so you can see what’s realistic at your traffic.

If you change anything, the URL updates. Copy the share link and the assumptions travel with the number.
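
A minimal sketch of what that looks like — the actual parameter names are whatever the calculator uses; these are illustrative:

```ts
// Hypothetical share-link encoding. Param names are illustrative,
// not the calculator's real ones.
const params = new URLSearchParams({
  baseline: "0.025", // current conversion rate
  traffic: "15000",  // monthly visitors to the page
  lift: "0.20",      // minimum detectable relative lift
});
const shareUrl = `${location.origin}${location.pathname}?${params}`;
// e.g. https://example.com/calculator?baseline=0.025&traffic=15000&lift=0.20
```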

How to use this calculator

  1. Enter your current conversion rate. Pull it from GA4 or whatever you use. Most sites land between 1% and 5%.
  2. Enter your monthly visitors to the page you’re testing. Both versions combined.
  3. Set the smallest change worth finding. If you don’t have a target in mind, use the benchmark picker. It suggests a lift based on where median and top-quartile competitors land.
  4. Read the green panel. The big number is visitors per version. The “~N days at your traffic” stat below it is the one that actually matters.
  5. Scroll to the Working backwards table to see the trade-off. At your traffic, a 7-day test could only call a winner if the new version reaches X%. 28 days lets you catch smaller wins.

How the math works

The formula

This is a two-proportion z-test, two-tailed, with equal allocation (50/50 split). Same test Optimizely, VWO, and Evan Miller’s calculator use under the hood. The formula:

n = ( z_{α/2}·√(2·p̄·(1−p̄)) + z_{β}·√(p₁(1−p₁) + p₂(1−p₂)) )² / (p₂ − p₁)²

Where p₁ is your baseline rate, p₂ is the target rate, and p̄ is their average. The result n is visitors per variation.
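
As a concrete sketch, here is that formula in TypeScript. The z-values are hard-coded for the defaults (95% confidence, 80% power); a production version would compute them from a normal quantile function:

```ts
// Two-proportion z-test sample size, per the formula above.
function visitorsPerVariation(
  p1: number,    // baseline conversion rate, e.g. 0.025
  p2: number,    // target conversion rate, e.g. 0.03
  zAlpha = 1.96, // z_{α/2}: two-tailed, 95% confidence
  zBeta = 0.8416 // z_{β}: 80% power
): number {
  const pBar = (p1 + p2) / 2;
  const a = zAlpha * Math.sqrt(2 * pBar * (1 - pBar));
  const b = zBeta * Math.sqrt(p1 * (1 - p1) + p2 * (1 - p2));
  return Math.ceil((a + b) ** 2 / (p2 - p1) ** 2);
}

visitorsPerVariation(0.025, 0.03); // → 16792, the headline example
```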

What “relative lift” means here

You enter the lift you’d like to detect as a relative percentage. Same way Optimizely and most marketers think about it. A +20% lift on a 2% baseline means the new version needs to hit 2.4%. A +50% lift means 3%.

This framing matters because it scales with your baseline. Asking to detect “a 1pp jump” from 2% to 3% is mathematically the same as a +50% relative lift, just stated differently. Relative numbers travel better across teams and funnels. For a deeper breakdown, see our guide to minimum detectable effect.
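
In code, the conversion is one line — an illustrative helper, not part of the calculator itself:

```ts
// Relative lift → target rate.
const targetRate = (baseline: number, lift: number) => baseline * (1 + lift);

targetRate(0.02, 0.5); // → 0.03: a +50% relative lift, or +1pp absolute
```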

The four levers

  • Baseline conversion rate. Lower baselines need more visitors. Less signal in each one.
  • Minimum detectable lift. The smaller the lift you want to catch, the more visitors. Sample size scales with 1/lift². Halve the lift and you 4× the visitors. Our sample size formula explainer walks through the math.
  • Confidence (1 − α). How sure you want to be that a “winner” isn’t noise. 95% is standard. Lower it for faster reads when the cost of a wrong call is low. See Type I vs Type II errors for the full picture.
  • Power (1 − β). How often you’ll catch a real win. 80% is standard. Push to 90% if missing a real winner is expensive. The statistical power explainer goes deeper, and the sketch after this list shows where both z-values plug in.
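
Confidence and power enter the formula only through the two z-values (standard normal quantiles). A quick sketch, reusing visitorsPerVariation from above:

```ts
// Standard normal quantiles for common choices, hard-coded.
const Z_ALPHA_2 = { "90%": 1.645, "95%": 1.96, "99%": 2.576 }; // two-tailed confidence
const Z_BETA = { "80%": 0.8416, "90%": 1.2816 };               // power

// Dropping confidence from 95% to 90% at 80% power:
visitorsPerVariation(0.025, 0.03, Z_ALPHA_2["90%"], Z_BETA["80%"]);
// → 13228 per variation instead of 16792
```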

How “working backwards” works

The duration table inverts the formula. Given a fixed visitor budget (your monthly traffic prorated to N days, split A/B), it solves for the smallest relative lift the test could detect, and shows what conversion rate the new version would need to reach. That’s the most actionable number when calendar time is the real constraint.
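
A sketch of that inversion, again reusing visitorsPerVariation from above — required sample falls as the target rate rises, so bisection on the target finds the smallest detectable one:

```ts
// Smallest target rate detectable on a fixed visitor budget.
function smallestDetectableTarget(p1: number, nPerVariation: number): number {
  let lo = p1 + 1e-6; // just above the baseline
  let hi = 0.999;     // a rate can't reach 100%
  for (let i = 0; i < 60; i++) {
    const mid = (lo + hi) / 2;
    // Needs more visitors than we have → this lift is too small to detect.
    if (visitorsPerVariation(p1, mid) > nPerVariation) lo = mid;
    else hi = mid;
  }
  return hi;
}

// 14 days at ~500 visitors/day, split 50/50 → 3,500 per variation:
smallestDetectableTarget(0.025, 3500); // → ≈0.0366, the table's +46% row
```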

Edge cases handled honestly

  • Very low baselines (< 1%). The z-test normal approximation is shakier here. Numbers are still in the right ballpark. Round up generously.
  • Lifts that push the target past 100%. We flag it. A +5000% lift on a 2% baseline implies a 102% target, which is mathematically impossible.
  • Negative lifts. Testing a change you expect to drop conversion (e.g. removing a CTA to test downstream impact) is the same math.
  • One-tailed tests. Cut sample by ~20% (worked out below). Don’t bother. You rarely know the direction in advance, and the cost of being wrong is high.
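
Where the ~20% comes from, to a first approximation: the two square-root terms in the formula are nearly equal, so n scales roughly with (z + z_β)². Switching from two-tailed z_{α/2} = 1.96 to one-tailed z_{α} = 1.645 at 80% power gives

( (1.645 + 0.842) / (1.960 + 0.842) )² ≈ 0.79

about 21% fewer visitors.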

What this calculator deliberately doesn’t do

  • Sequential / “peeking” math. Fixed-horizon p-values inflate false positives when you peek. Use sequential testing (mSPRT) or Bayesian instead. Kirro does sequential by default.
  • Multi-armed tests. Three or more variants need multiple-comparison correction (Bonferroni, Šidák, FDR).
  • Non-binary metrics. Revenue, time-on-page, clicks-per-user. Those need a t-test or rank test on the distribution, not a z-test on a proportion.

About the benchmarks

The industry / funnel benchmarks shipped in this calculator are illustrative placeholders, not sourced research. They give you a sensible starting point if you have no idea what to aim for, but you should replace them with numbers from your own analytics or industry reports before quoting them in a board deck.

FAQ

What’s a good baseline conversion rate?

Whatever yours is. Don’t aim for a benchmark when measuring your own baseline. Use your real number from analytics. Pull the last 30 days from GA4 or similar. If you don’t have 30 days of data, you don’t have enough to A/B test reliably. Come back later.

How do I pick a minimum detectable lift?

Start with the industry/funnel benchmark. It tells you the gap between your baseline and a realistic target for similar businesses. If you’re at 2% and the median is 4%, that’s a +100% lift. Aiming for that means you’d confirm any winner that gets you most of the way there. Sanity-check it against your roadmap: if a +10% lift wouldn’t change what you ship next, set the bar higher and test a bolder change.

Why does asking for a smaller lift balloon the sample size?

Sample size scales with 1/lift². It’s quadratic. Going from a +20% to a +10% lift roughly quadruples the visitors needed (at a 2.5% baseline, ~16,800 per variation becomes ~64,200). That’s why honest testing tools push you toward bigger, bolder swings. This is one of the most common A/B testing mistakes: setting an MDE so small that the test can never finish.

Can I stop the test early if I see significance?

Not with a fixed-horizon test, which is what this calculator powers. Peeking inflates your false-positive rate from 5% to as much as 30%. If you want to peek, use sequential testing (mSPRT) or a Bayesian tool. Kirro does sequential by default, so you can.

What if my traffic is too low for any reasonable lift?

Honest answer: you may not be a candidate for A/B testing on the metric you’ve picked. Try testing a higher-funnel metric (clicks instead of purchases), or commit to bigger swings. Redesigns, not button colors. The duration table on this page shows the trade-off directly. Lower-traffic sites need to aim for bigger lifts or accept longer test windows.

Does this work for revenue tests?

No. Revenue is continuous and skewed, so a proportion-based z-test doesn’t apply. You’d want a Welch’s t-test or Mann–Whitney U on per-visitor revenue. Most marketing teams test conversion rate (binary) and read revenue as a secondary metric.

Is the math the same as Evan Miller’s calculator?

Yes. Two-proportion z-test, two-tailed, equal allocation. Same z-tables, same formula. Same answer to within rounding.

Are the benchmark numbers reliable?

They’re illustrative placeholders to help you start. Treat them as “roughly the right shape” for each industry, not as research you can quote. Replace them with numbers from your own analytics or a recent industry report before using them in a deck.

Why share a calculator with URL params?

Because when someone says “you need 13,000 visitors per variation,” the next thing you ask is “based on what?” A shareable link encodes the inputs in the URL so the assumptions are right there in the receipts.

Ready to run this test? Set it up in Kirro in about three minutes. No code, no developer.

Launch your A/B test for free
