Statistical power answers one question: "If Version B really is better, will my test actually catch it?"
Low power means you might miss a real winner. Your test ends, the results look flat, and you conclude "nothing worked" when the truth is you just didn't have enough data to see the difference. This calculator tells you your test's power level so you can plan accordingly.
How to use this calculator
- Enter your baseline conversion rate. That's your current page's conversion rate before the test.
- Enter the sample size per version. That's how many visitors each version will see (or has already seen).
- Set the minimum detectable effect. That's the smallest improvement you want to catch.
- Pick your significance level. 95% is standard.
- Read the result. 80% power or above means your test is well-equipped to catch a real difference. Below that, you're rolling the dice.
Try the reverse mode too: enter your traffic and baseline rate, and the calculator tells you the smallest improvement your test can realistically detect. That's often a more useful question than "what's my power?"
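Under the hood, reverse mode amounts to searching for the smallest lift that still reaches your target power. Here's a minimal sketch of that idea, assuming a standard normal-approximation power formula; the function names and example numbers are illustrative, not this calculator's exact internals:

```python
from math import sqrt
from statistics import NormalDist

def power(p1: float, p2: float, n: int, alpha: float = 0.05) -> float:
    # Normal-approximation power of a two-sided z-test, n visitors per version.
    norm = NormalDist()
    p_bar = (p1 + p2) / 2                    # pooled rate under the alternative
    se = sqrt(2 * p_bar * (1 - p_bar) / n)   # standard error of the difference
    return 1 - norm.cdf(norm.inv_cdf(1 - alpha / 2) - abs(p2 - p1) / se)

def min_detectable_lift(baseline: float, n: int,
                        target_power: float = 0.8, alpha: float = 0.05) -> float:
    """Smallest relative lift detectable with the target power, by bisection."""
    lo, hi = 1e-6, 5.0                       # search between ~0% and 500% relative lift
    for _ in range(60):
        mid = (lo + hi) / 2
        if power(baseline, baseline * (1 + mid), n, alpha) < target_power:
            lo = mid                         # too small to detect: search higher
        else:
            hi = mid                         # detectable: try smaller
    return hi

# Example: 5% baseline, 10,000 visitors per version
print(f"{min_detectable_lift(0.05, 10_000):.1%}")
```

Under these assumptions, 10,000 visitors per arm at a 5% baseline puts the smallest reliably detectable improvement at roughly an 18% relative lift.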
How we calculate this
Power measures the probability that your test will correctly identify a winner when one actually exists. In statistics, it's written as 1 minus beta, where beta is the chance of a Type II error (missing a real winner).
Here's the analogy. Imagine you're trying to hear someone whisper in a loud room. Power is how good your hearing is. More visitors means better hearing: you catch smaller differences.
Four things determine your power:
Sample size is the biggest lever. More visitors means more data, which means smaller differences become visible. Doubling your sample size doesn't double your power (the relationship isn't linear), but it always helps.
Baseline conversion rate matters because differences are harder to detect at extreme rates. Spotting a 10% relative improvement is easier when your rate is 10% (detecting a 1 point change) than when it's 1% (detecting a 0.1 point change). Higher baselines give you more "signal" to work with.
Minimum detectable effect is the smallest improvement you're looking for. The smaller the difference you want to catch, the more power you need. This is the core tradeoff: you can either detect small changes (which needs a lot of visitors) or detect large changes (which is cheaper and faster). The MDE guide helps you decide what's right for your situation.
Significance level is your tolerance for false positives. Higher confidence (99% vs 95%) is stricter, which consumes some of your power. At the same sample size, a 99% confidence test has less power than a 95% confidence test.
The formula uses the normal approximation: it shifts the test statistic's distribution by a non-centrality term, then calculates the probability of rejecting the null hypothesis when the alternative hypothesis is true. In practical terms: given your sample size, baseline rate, and the effect you're trying to detect, how often would your test correctly pick up on it?
Power = P(Z > Z_α − δ·√(n/2) / √(p̄(1 − p̄)))
Where δ is the true difference between conversion rates, n is visitors per version, and p̄ is the pooled rate under the alternative hypothesis.
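This formula translates almost line-for-line into code. A minimal sketch using Python's standard library; the function name and example numbers are illustrative assumptions, not this calculator's internals:

```python
from math import sqrt
from statistics import NormalDist

def power_two_proportions(p1: float, p2: float, n: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided z-test comparing two conversion rates."""
    norm = NormalDist()
    delta = abs(p2 - p1)                       # δ: true difference in rates
    p_bar = (p1 + p2) / 2                      # p̄: pooled rate under the alternative
    se = sqrt(2 * p_bar * (1 - p_bar) / n)     # standard error of the difference
    z_alpha = norm.inv_cdf(1 - alpha / 2)      # two-sided critical value
    return 1 - norm.cdf(z_alpha - delta / se)  # P(Z > Z_α − δ/se)

# Example: 10% baseline, 12% variant (a 20% relative lift), 1,000 visitors per arm
print(round(power_two_proportions(0.10, 0.12, 1000), 3))
```

At 1,000 visitors per arm, this works out to roughly 30% power: a real 20% lift would be missed most of the time.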
The industry standard is 80% power. That means if Version B really is better by the amount you specified, your test has an 80% chance of catching it. Some practitioners aim for 90% when the stakes are high.
FAQ
What is statistical power in A/B testing?
Power is the probability that your test will detect a real difference when one exists. At 80% power, if Version B truly converts 20% better than Version A, your test has an 80% chance of correctly identifying B as the winner. The remaining 20% is the risk of a Type II error: concluding "no difference" when there actually is one. It's the false negative rate of your test.
What power level should I aim for?
80% is the standard for most A/B tests. It balances reliability with practical sample size requirements. Aim for 90% if the test results will drive a major decision (full site redesign, pricing change, major product launch). Below 80%, your test has a meaningful chance of missing real improvements. Below 50%, you're basically flipping a coin.
How does power relate to sample size?
They move together. More visitors equals more power. If your power is too low, the fix is almost always "get more visitors" (by running the test longer or on higher-traffic pages). The sample size calculator works this relationship in reverse: given your desired power level, it tells you exactly how many visitors you need.
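Inverting the power formula gives that reverse relationship directly. A minimal sketch under the usual normal-approximation assumptions; the function name is hypothetical, not the sample size calculator's actual code:

```python
from math import ceil, sqrt
from statistics import NormalDist

def required_n(p1: float, p2: float, power: float = 0.8, alpha: float = 0.05) -> int:
    """Visitors needed per version to detect p1 -> p2 (normal approximation)."""
    norm = NormalDist()
    z_alpha = norm.inv_cdf(1 - alpha / 2)      # two-sided critical value
    z_beta = norm.inv_cdf(power)               # quantile for the target power
    p_bar = (p1 + p2) / 2                      # pooled rate under the alternative
    delta = abs(p2 - p1)
    return ceil(2 * p_bar * (1 - p_bar) * (z_alpha + z_beta) ** 2 / delta ** 2)

# Example: detect a lift from 10% to 12% with 80% power at 95% confidence
print(required_n(0.10, 0.12))
```

Raising the target from 80% to 90% power pushes the requirement up by roughly a third; that is the sample size cost of the extra reliability.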
My test had low power. Are the results useless?
Not entirely. If a low-power test finds a significant result, that result is still valid. Low power doesn't create false positives. What low power does is increase the chance of false negatives. So if your underpowered test says "no difference found," you can't be sure there wasn't one. You just didn't have enough data to see it. Think of it this way: a metal detector with low sensitivity won't create fake signals, but it might miss real ones buried deep.
What's the relationship between power and the minimum detectable effect?
Inverse. Smaller MDE needs more power (and more visitors) to detect. Larger MDE needs less. If your power is too low and you can't get more traffic, consider increasing your MDE. In practice, that means testing bigger changes. Instead of tweaking button color (small effect, hard to detect), test a completely different headline (large effect, easier to detect). The CUPED method can also help by reducing variance, effectively boosting power without more traffic.
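The inverse relationship is easy to see numerically. A minimal sketch that holds traffic fixed and varies the size of the effect; the numbers and the normal-approximation power formula are illustrative assumptions:

```python
from math import sqrt
from statistics import NormalDist

def power(p1: float, p2: float, n: int, alpha: float = 0.05) -> float:
    # Normal-approximation power of a two-sided z-test, n visitors per version.
    norm = NormalDist()
    p_bar = (p1 + p2) / 2
    se = sqrt(2 * p_bar * (1 - p_bar) / n)
    return 1 - norm.cdf(norm.inv_cdf(1 - alpha / 2) - abs(p2 - p1) / se)

# Same traffic (2,000 visitors per arm, 5% baseline), increasingly large effects
for lift in (0.05, 0.10, 0.20, 0.40):
    print(f"{lift:.0%} relative lift -> power {power(0.05, 0.05 * (1 + lift), 2000):.0%}")
```

At this traffic level only the largest change comes close to reliable detection, which is exactly why testing bold variants makes sense when traffic is scarce.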
Why don't most A/B testing tools show power?
Because low power is an uncomfortable truth. Most tools show you whether your result is significant but don't tell you whether your test was capable of detecting a difference in the first place. It's like checking if your net caught a fish without asking whether the net had holes in it. Power analysis is the "before" to significance testing's "after." Do it before you start the test using this calculator and the sample size calculator together.
Need more power? Kirro uses Bayesian statistics, which reach reliable conclusions with less traffic. You get answers sooner.