AB Test Calculator
Free A/B test calculator: two-proportion z-test, sample size, power, confidence intervals, and Bayesian P(B>A).
Why This Statistical Analysis Matters
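Why: An observed difference in conversion rate between control and variant can arise by chance. A significance test, a confidence interval, and a power calculation show whether the data support a real effect.
How: Enter visitors (n) and conversions (x) for Control (A) and Variant (B), choose α and a one- or two-sided test, and read off the z statistic, p-value, lift, confidence interval, power, and P(B>A).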
A/B Test — Statistical Significance & Power
Two-proportion z-test, sample size, power, CI, Bayesian P(B>A). Conversion rate comparison with step-by-step breakdown.
Control (A)
Variant (B)
Conversion Rate Comparison
Power Curve vs Sample Size
Power vs sample size for p₁=0.050, p₂=0.065. 80% threshold shown.
95% Confidence Interval for Difference
Red line = 0. CI includes 0 → not significant. CI excludes 0 → significant.
For educational and informational purposes only. Verify with a qualified professional.
Key Takeaways
- Two-proportion z-test: p̂_A = x_A/n_A, p̂_B = x_B/n_B. Under H₀ the pooled rate p̂ = (x_A + x_B)/(n_A + n_B) gives SE = √(p̂(1−p̂)(1/n_A + 1/n_B)), and z = (p̂_B − p̂_A) / SE.
- p-value: 2×(1−Φ(|z|)) for two-sided; 1−Φ(z) for one-sided (testing B > A). Reject H₀ if p < α.
- Relative lift: (p̂_B − p̂_A) / p̂_A × 100%. Absolute difference: p̂_B − p̂_A.
- 95% CI: (p̂_B − p̂_A) ± 1.96 × SE, where the interval typically uses the unpooled SE = √(p̂_A(1−p̂_A)/n_A + p̂_B(1−p̂_B)/n_B). If 0 is outside the CI, the difference is significant at α = 0.05.
- Sample size: n = (z_{α/2} + z_β)² × (p₁(1−p₁) + p₂(1−p₂)) / (p₂−p₁)² per group.
- Power: Probability of detecting a true effect of the specified size. Aim for 80%+.
- Bayesian: P(B > A) from Beta posteriors. Interpret as the probability that the variant beats the control.
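The z-test, p-value, lift, and confidence interval formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation: the function name and the returned keys are hypothetical, and the confidence interval is assumed to use the unpooled standard error.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(x_a, n_a, x_b, n_b, two_sided=True):
    """Two-proportion z-test (pooled SE) plus lift and a 95% Wald CI."""
    p_a, p_b = x_a / n_a, x_b / n_b
    diff = p_b - p_a

    # Pooled rate under H0 (no difference) drives the test statistic.
    pooled = (x_a + x_b) / (n_a + n_b)
    se_pooled = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = diff / se_pooled

    phi = NormalDist().cdf
    p_value = 2 * (1 - phi(abs(z))) if two_sided else 1 - phi(z)

    # Assumption: the CI uses the unpooled SE, as is conventional.
    se_unpooled = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = NormalDist().inv_cdf(0.975)  # about 1.96
    ci95 = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

    return {
        "z": z,
        "p_value": p_value,
        "abs_diff": diff,
        "rel_lift_pct": diff / p_a * 100,
        "ci95": ci95,
    }


# Example: 500/10,000 conversions in control vs 650/10,000 in variant.
result = two_proportion_ztest(x_a=500, n_a=10_000, x_b=650, n_b=10_000)
print(result)
```

For these inputs the relative lift is 30% and z is roughly 4.6, so the 95% interval for the difference lies entirely above 0.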
Expert Tips
Plan Sample Size First
Use Sample Size mode before running a test. Aim for 80% power. Stopping early with low power means you may miss real effects.
One vs Two-Sided
Use two-sided unless you would never act on B being worse than A. One-sided gives more power for that direction but cannot detect harm.
Statistical vs Practical Significance
A result can be statistically significant but practically trivial. For example, 5.00% vs 5.10% with 1M visitors per group is significant (z ≈ 3.2), but a 0.10 percentage-point lift may not justify the change.
Multiple Variants
Testing A vs B vs C inflates false positives. Use Bonferroni correction (α/k) for k comparisons.
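A minimal sketch of the Bonferroni rule: each of the k comparisons is tested against α/k instead of α. The function name and the p-values are hypothetical, chosen only for illustration.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Flag each comparison as significant at the corrected level alpha / k."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Three comparisons (e.g. B vs A, C vs A, C vs B) with made-up p-values.
flags = bonferroni_significant([0.012, 0.030, 0.200])
print(flags)  # [True, False, False]
```

With k = 3 the threshold is about 0.0167, so a p-value of 0.030 that would pass an uncorrected 0.05 test no longer counts as significant.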
This Calculator vs Google Optimize vs Manual
| Feature | This Calculator | Google Optimize | Manual (Excel/R) |
|---|---|---|---|
| Two-proportion z-test | ✅ | ✅ | ⚠️ Manual |
| Sample size estimation | ✅ | ✅ | ⚠️ Manual |
| Power analysis | ✅ | ❌ | ✅ |
| Bayesian P(B>A) | ✅ | ✅ | ⚠️ Code needed |
| Step-by-step breakdown | ✅ | ❌ | ❌ |
| CI visualization | ✅ | ❌ | ❌ |
| Copy & share results | ✅ | ❌ | ❌ |
| No platform lock-in | ✅ | ❌ | ✅ |
Frequently Asked Questions
What sample size do I need?
Use the Sample Size mode. Enter expected control rate (p1), expected variant rate (p2), α (usually 0.05), and desired power (usually 0.8).
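The per-group formula n = (z_{α/2} + z_β)² × (p₁(1−p₁) + p₂(1−p₂)) / (p₂−p₁)² can be evaluated directly. The sketch below assumes a two-sided test and equal group sizes; the function name is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.8):
    """Visitors needed in each group to detect p1 vs p2 (two-sided test)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance_sum / (p2 - p1) ** 2)


# Control rate 5.0%, expected variant rate 6.5%.
n_required = sample_size_per_group(0.05, 0.065)
print(n_required)
```

For a 5.0% control rate and a 6.5% variant rate this comes to roughly 3,800 visitors per group. Halving the detectable difference roughly quadruples the requirement, because the difference enters the formula squared.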
What does P(B > A) mean?
The posterior probability that the variant's true conversion rate exceeds the control's, given the observed data and the prior. A value of 95% is usually read as strong evidence that the variant is better.
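One way to estimate P(B > A) is Monte Carlo sampling from the two Beta posteriors. The sketch below assumes a uniform Beta(1, 1) prior on each rate, which may differ from the prior this calculator uses; the function name, draw count, and seed are hypothetical.

```python
import random


def prob_b_beats_a(x_a, n_a, x_b, n_b, draws=20_000, seed=0):
    """Monte Carlo estimate of P(rate_B > rate_A) under Beta(1, 1) priors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Posterior for a rate is Beta(1 + conversions, 1 + non-conversions).
        rate_a = rng.betavariate(1 + x_a, 1 + n_a - x_a)
        rate_b = rng.betavariate(1 + x_b, 1 + n_b - x_b)
        wins += rate_b > rate_a
    return wins / draws


p_b_better = prob_b_beats_a(x_a=500, n_a=10_000, x_b=650, n_b=10_000)
print(p_b_better)
```

The estimate carries sampling noise of about 1/√draws, so more draws are needed to tell 0.990 from 0.995 than to tell 0.60 from 0.90.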
When is a result significant?
When p-value < α (e.g., 0.05). Also when the 95% CI for the difference excludes 0.
How do I interpret relative lift?
Relative lift = (pB−pA)/pA × 100%. E.g., 30% lift means variant converts 30% more often relative to control.
What is statistical power?
Probability of correctly rejecting H₀ when there is a true effect. 80% power means 80% chance to detect the specified lift.
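Power for a given sample size can be approximated with the normal distribution. The sketch below uses the same unpooled variance as the sample size formula and ignores the negligible chance of rejecting in the wrong direction, so it is an approximation; the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test."""
    se = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_group)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    # Probability that the standardized difference exceeds the critical value.
    return NormalDist().cdf(abs(p2 - p1) / se - z_alpha)


# Power for p1 = 0.050 vs p2 = 0.065 at a few per-group sample sizes.
for n in (1_000, 2_000, 3_778, 6_000):
    print(n, round(power_two_proportions(0.05, 0.065, n), 3))
```

Evaluating the function over a range of n traces out a power curve of the kind plotted for p₁ = 0.050 and p₂ = 0.065, crossing the 80% threshold near 3,800 visitors per group.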
Can I use this for click-through rate?
Yes. Enter impressions as n and clicks as x. The two-proportion z-test works for any binary outcome.
When should I use Fisher exact test instead?
For small samples (np < 5 or n(1−p) < 5), the normal approximation is poor. Use Fisher exact test for small counts.
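For small counts, Fisher's exact test can be computed from the hypergeometric distribution using only the standard library. This is a sketch under one common two-sided convention (sum the probabilities of all tables no more likely than the observed one); the function names are hypothetical, and the rule-of-thumb check applies the np ≥ 5 and n(1−p) ≥ 5 condition to the observed counts.

```python
from math import comb


def normal_approx_ok(x, n):
    """Rule of thumb: at least 5 conversions and 5 non-conversions."""
    return min(x, n - x) >= 5


def fisher_exact_two_sided(x_a, n_a, x_b, n_b):
    """Two-sided Fisher exact p-value for a 2x2 conversion table."""
    successes = x_a + x_b
    denom = comb(n_a + n_b, successes)

    def prob(k):
        # P(group A has k conversions | fixed row and column totals).
        return comb(n_a, k) * comb(n_b, successes - k) / denom

    lo = max(0, successes - n_b)
    hi = min(n_a, successes)
    p_obs = prob(x_a)
    # Small tolerance so ties with the observed table are included.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


# Tiny test: 3 of 4 convert in A, 1 of 4 in B.
p_exact = fisher_exact_two_sided(x_a=3, n_a=4, x_b=1, n_b=4)
print(normal_approx_ok(3, 4), round(p_exact, 4))  # False 0.4857
```

Exact binomial coefficients get expensive for large n, so this approach suits the small-sample case where the z-test is unreliable.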
How long should I run an A/B test?
Run for at least 1–2 full business cycles (e.g., one or two full weeks) to capture day-of-week effects, and until the planned sample size is reached. Keep the traffic split equal throughout.
Disclaimer: This calculator provides statistical guidance. Business decisions should consider effect size, cost, and risk — not just p-values. Z-test assumes large samples (np and n(1−p) ≥ 5). For small samples, use Fisher exact test.
Related Calculators
Confidence Interval Calculator
Calculate confidence intervals for means, proportions, and differences. Z-intervals, t-intervals, and sample size planning.
Power Analysis Calculator
Compute statistical power, required sample size, or minimum detectable effect size for t-tests, proportions, ANOVA, and correlation.
Hypothesis Testing Calculator
Comprehensive hypothesis testing: one-sample z/t, two-sample t, paired t, one/two-proportion z. Test statistic, p-value, confidence interval, decision...
Margin of Error Calculator
Computes margin of error for surveys and polls. Handles proportions and means, determines required sample size, and shows confidence interval.
Cohen's D Calculator
Calculates Cohen's d effect size for comparing two group means. Includes confidence interval, interpretation (small/medium/large), and related effect sizes.
Z-test Calculator
Performs z-tests: one-sample (σ known), two-sample, one-proportion, two-proportion. CI, effect size, power.