AB Test Calculator

Free A/B test calculator: two-proportion z-test, sample size, power, confidence intervals, and Bayesian P(B>A).


A/B Test — Statistical Significance & Power

Two-proportion z-test, sample size, power, CI, Bayesian P(B>A). Conversion rate comparison with step-by-step breakdown.

$ ab_test --control=1000,50 --variant=1000,65 --alpha=0.05

Decision: NOT SIGNIFICANT
z-statistic: 1.4408
p-value: 0.1496
Relative Lift: 30.0%
Absolute Diff: 0.0150
95% CI: [-0.0054, 0.0354]
Sample Size Needed: 3778/group
Power: ≈30%
Bayesian P(B>A): ≈92%

[Figure: Conversion Rate Comparison]

[Figure: Power Curve vs Sample Size. Power vs sample size for p₁=0.050, p₂=0.065, with the 80% threshold shown.]

[Figure: 95% Confidence Interval for Difference. Lower bound -0.0054, estimate 0.0150, upper bound 0.0354. The red line marks 0: a CI that includes 0 is not significant, and a CI that excludes 0 is significant.]

Calculation Breakdown

COMPUTATION

p̂_A (Control) = x_A/n_A = 50/1000 = 0.0500
p̂_B (Variant) = x_B/n_B = 65/1000 = 0.0650
Pooled proportion p̂ = (x_A+x_B)/(n_A+n_B) = 115/2000 = 0.0575
Standard Error SE = √(p̂(1−p̂)(1/n_A+1/n_B)) = 0.0104
z-statistic = (p̂_B − p̂_A)/SE = (0.0650 − 0.0500)/SE = 1.4408
p-value = 2(1 − Φ(|z|)) for a two-sided test = 0.1496

DECISION

NOT SIGNIFICANT — Fail to reject H₀

EFFECT SIZE

Relative Lift = (p̂_B − p̂_A)/p̂_A × 100 = 30.0%
Absolute Difference = p̂_B − p̂_A = 0.0150

CONFIDENCE INTERVAL

95% CI for difference: [-0.0054, 0.0354]
Bayesian P(B > A): ≈92% (normal approximation to Beta posteriors)

For educational and informational purposes only. Verify with a qualified professional.

Key Takeaways

• Two-proportion z-test: p̂_A = x_A/n_A, p̂_B = x_B/n_B. Pooled p̂ for SE. z = (p̂_B − p̂_A) / SE.
• p-value: 2×(1−Φ(|z|)) for two-sided; 1−Φ(z) for one-sided. Reject H₀ if p < α.
• Relative lift: (p̂_B − p̂_A) / p̂_A × 100%. Absolute difference: p̂_B − p̂_A.
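• 95% CI: (p̂_B − p̂_A) ± 1.96 × SE. If 0 lies outside the CI, the difference is significant at α = 0.05.
• Sample size: n = (z_{α/2} + z_β)² × (p₁(1−p₁) + p₂(1−p₂)) / (p₂−p₁)² per group, where z_β is the standard normal quantile for the desired power (about 0.84 for 80%).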
• Power: Probability of detecting a true effect. Aim for 80%+.
• Bayesian: P(B > A) from Beta posteriors. Interpret as probability variant beats control.
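
As an illustration of the takeaways above, here is a minimal Python sketch of the two-proportion z-test. The function name `two_proportion_ztest` and its return keys are hypothetical, not the calculator's actual code. The confidence interval uses an unpooled standard error, which is a common convention assumed here rather than something the page confirms.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_ztest(x_a, n_a, x_b, n_b, alpha=0.05):
    """Two-sided two-proportion z-test (hypothetical helper, not the site's code)."""
    std_normal = NormalDist()
    p_a, p_b = x_a / n_a, x_b / n_b
    diff = p_b - p_a

    # Pooled proportion and SE for the test statistic under H0: p_A = p_B.
    p_pool = (x_a + x_b) / (n_a + n_b)
    se_pooled = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = diff / se_pooled
    p_value = 2 * (1 - std_normal.cdf(abs(z)))

    # Assumption: the CI uses the unpooled SE, a common textbook convention.
    se_unpooled = sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    ci = (diff - z_crit * se_unpooled, diff + z_crit * se_unpooled)

    return {
        "z": z,
        "p_value": p_value,
        "relative_lift_pct": diff / p_a * 100,
        "absolute_diff": diff,
        "ci": ci,
        "significant": p_value < alpha,
    }


# Example inputs from this page: 50/1000 (control) vs 65/1000 (variant).
result = two_proportion_ztest(x_a=50, n_a=1000, x_b=65, n_b=1000)
print(result)
```

With these inputs the sketch gives z ≈ 1.44 and a CI that straddles 0, so the difference is not significant at α = 0.05.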

Did You Know?

📊 Most A/B tests are underpowered. At 80% power and α=0.05 (two-sided), detecting a 5%→6.5% lift takes roughly 3,800 visitors per group. Source: Evan Miller

🔄 Sequential testing (O'Brien-Fleming) allows early stopping but requires adjusted boundaries. Source: Optimizely Stats Engine

🎯 Bayesian A/B tests give P(B beats A) directly, which is easier for stakeholders to interpret than p-values. Source: VWO Best Practices

⚠️ Peeking at results without adjustment inflates Type I error. Use sequential methods or wait for the planned sample size. Source: Kohavi et al., 2009

📈 Relative lift can be misleading when the baseline is low. 1%→2% is a 100% relative lift but a small absolute gain. Source: Google Analytics

🔬 Minimum detectable effect (MDE) is the smallest lift you can detect with a given power and sample size. Source: Evan Miller
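
The peeking warning above can be illustrated with a small simulation. The sketch below runs A/A tests (no true difference) and compares how often any of several interim looks crosses p < α with how often the single final look does. The function names, the 20% base rate, the number of looks, and the seed are all assumptions chosen for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def is_significant(x_a, n_a, x_b, n_b, alpha=0.05):
    """Two-sided pooled two-proportion z-test; returns True if p < alpha."""
    p_pool = (x_a + x_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return False
    z = (x_b / n_b - x_a / n_a) / se
    return 2 * (1 - STD_NORMAL.cdf(abs(z))) < alpha


def simulate_peeking(rate=0.2, n_per_group=1000, looks=10, simulations=400, seed=0):
    """Return (false-positive rate with peeking, false-positive rate at final look)."""
    rng = random.Random(seed)
    step = n_per_group // looks
    any_look_hits = 0
    final_look_hits = 0
    for _ in range(simulations):
        x_a = x_b = 0
        flagged = False
        for look in range(1, looks + 1):
            # Both arms share the same true rate, so any "win" is a false positive.
            x_a += sum(rng.random() < rate for _ in range(step))
            x_b += sum(rng.random() < rate for _ in range(step))
            sig = is_significant(x_a, look * step, x_b, look * step)
            flagged = flagged or sig
        any_look_hits += flagged
        final_look_hits += sig  # result of the last planned look only
    return any_look_hits / simulations, final_look_hits / simulations


peek_rate, final_rate = simulate_peeking()
print(f"stop at first significant look: {peek_rate:.3f}")
print(f"single look at planned n:       {final_rate:.3f}")
```

Because the final look is one of the interim looks, the peeking rate can never be lower than the single-look rate. In runs like this one it is typically several times higher.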

Expert Tips

Plan Sample Size First

Use Sample Size mode before running a test. Aim for 80% power. Stopping early with low power means you may miss real effects.

One vs Two-Sided

Use two-sided unless you would never act on B being worse than A. One-sided gives more power for that direction but cannot detect harm.

Statistical vs Practical Significance

A result can be statistically significant but practically trivial. For example, 5.00% vs 5.10% with 1M visitors per group is significant (z ≈ 3.2), but a 0.10-percentage-point lift may not justify the change.

Multiple Variants

Testing A vs B vs C inflates false positives. Use Bonferroni correction (α/k) for k comparisons.
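
A short sketch of why multiple comparisons matter and how the Bonferroni threshold is applied. The helper names are hypothetical, and the family-wise error formula assumes the k comparisons are independent, which is a simplification.

```python
def familywise_error_rate(alpha, k):
    """Chance of at least one false positive across k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value against the Bonferroni-adjusted threshold alpha / k."""
    threshold = alpha / len(p_values)
    return [p < threshold for p in p_values]


# Three comparisons at alpha = 0.05 without correction:
print(round(familywise_error_rate(0.05, 3), 4))

# With correction, each p-value must fall below 0.05 / 3.
print(bonferroni_significant([0.03, 0.01, 0.20]))
```

Here 0.03 would pass an uncorrected 0.05 cutoff but fails the adjusted threshold of about 0.0167.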

This Calculator vs Google Optimize vs Manual

Feature	This Calculator	Google Optimize	Manual (Excel/R)
Two-proportion z-test	✅	✅	⚠️ Manual
Sample size estimation	✅	✅	⚠️ Manual
Power analysis	✅	❌	✅
Bayesian P(B>A)	✅	✅	⚠️ Code needed
Step-by-step breakdown	✅	❌	❌
CI visualization	✅	❌	❌
Copy & share results	✅	❌	❌
No platform lock-in	✅	❌	✅

Frequently Asked Questions

What sample size do I need?

Use the Sample Size mode. Enter expected control rate (p1), expected variant rate (p2), α (usually 0.05), and desired power (usually 0.8).
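
For readers who want to reproduce the sample size by hand, here is a minimal sketch of the per-group formula n = (z_{α/2} + z_β)² × (p₁(1−p₁) + p₂(1−p₂)) / (p₂−p₁)². The function name is hypothetical, and rounding up to the next whole visitor is an assumption.

```python
from math import ceil
from statistics import NormalDist


def sample_size_per_group(p1, p2, alpha=0.05, power=0.80):
    """Per-group sample size for a two-sided two-proportion test (illustrative)."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = std_normal.inv_cdf(power)           # quantile for the desired power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p2 - p1) ** 2
    return ceil(n)


# 5% control rate vs 6.5% expected variant rate, as in this page's example.
print(sample_size_per_group(0.05, 0.065))  # 3778
```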

What does P(B > A) mean?

Bayesian probability that the variant conversion rate exceeds the control. 95% means strong evidence variant is better.
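
One way to approximate this probability is sketched below: fit a Beta posterior to each group, then treat the difference of the two posteriors as normal. The uniform Beta(1, 1) prior and the function names are assumptions for illustration. The calculator's actual prior is not stated on this page.

```python
from math import sqrt
from statistics import NormalDist


def beta_posterior_moments(conversions, visitors, prior_a=1.0, prior_b=1.0):
    """Mean and variance of the Beta posterior for a conversion rate."""
    a = prior_a + conversions
    b = prior_b + visitors - conversions
    mean = a / (a + b)
    variance = a * b / ((a + b) ** 2 * (a + b + 1))
    return mean, variance


def prob_b_beats_a(x_a, n_a, x_b, n_b):
    """P(rate_B > rate_A) via a normal approximation to the two Beta posteriors."""
    mean_a, var_a = beta_posterior_moments(x_a, n_a)
    mean_b, var_b = beta_posterior_moments(x_b, n_b)
    return NormalDist().cdf((mean_b - mean_a) / sqrt(var_a + var_b))


print(round(prob_b_beats_a(50, 1000, 65, 1000), 3))
```

The normal approximation works well when both groups have many conversions. For small counts, sampling directly from the Beta posteriors is a safer choice.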

When is a result significant?

When p-value < α (e.g., 0.05). Also when the 95% CI for the difference excludes 0.

How do I interpret relative lift?

Relative lift = (pB−pA)/pA × 100%. E.g., 30% lift means variant converts 30% more often relative to control.

What is statistical power?

Probability of correctly rejecting H₀ when there is a true effect. 80% power means 80% chance to detect the specified lift.
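
A rough sketch of how power can be approximated for a two-sided two-proportion test with equal group sizes. The function name is hypothetical, and the formula ignores the very small chance of rejecting in the wrong direction, so treat the output as an approximation.

```python
from math import sqrt
from statistics import NormalDist


def approximate_power(p1, p2, n_per_group, alpha=0.05):
    """Approximate power to detect p1 vs p2 with n_per_group visitors in each arm."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    se = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_group)
    # Power is the probability that the observed z exceeds the critical value.
    return std_normal.cdf(abs(p2 - p1) / se - z_alpha)


for n in (1000, 2000, 3778):
    print(n, round(approximate_power(0.05, 0.065, n), 3))
```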

Can I use this for click-through rate?

Yes. Enter impressions as n and clicks as x. The two-proportion z-test works for any binary outcome.

When should I use Fisher exact test instead?

For small samples (np < 5 or n(1−p) < 5), the normal approximation is poor. Use Fisher exact test for small counts.
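
For small counts, a two-sided Fisher exact test can be sketched with the standard library alone. The function name is hypothetical. The two-sided p-value here sums the probabilities of all tables that are no more likely than the observed one, which is one common definition.

```python
from math import comb


def fisher_exact_two_sided(x_a, n_a, x_b, n_b):
    """Two-sided Fisher exact p-value for conversions x_a/n_a vs x_b/n_b."""
    total_conversions = x_a + x_b
    denominator = comb(n_a + n_b, total_conversions)

    def table_probability(k):
        # Hypergeometric probability of k conversions in group A, margins fixed.
        return comb(n_a, k) * comb(n_b, total_conversions - k) / denominator

    k_min = max(0, total_conversions - n_b)
    k_max = min(n_a, total_conversions)
    observed = table_probability(x_a)
    # Small relative tolerance so ties with the observed table are included.
    p_value = sum(
        table_probability(k)
        for k in range(k_min, k_max + 1)
        if table_probability(k) <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


# Tiny example: 3 of 4 conversions in one group vs 1 of 4 in the other.
print(round(fisher_exact_two_sided(3, 4, 1, 4), 4))  # 0.4857
```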

How long should I run an A/B test?

Run for at least 1–2 full business cycles (e.g., full weeks) to capture day-of-week effects. Ensure an equal traffic split.

A/B Testing by the Numbers

~3,800: per group for 5%→6.5% at 80% power (α = 0.05, two-sided)
1.96: z* for 95% CI (two-sided)
80%: recommended minimum power
0.05: standard significance level

Official Data Sources

Evan Miller's A/B Test Calculator: classic sample size and significance calculator (updated 2024)
Optimizely Stats Engine: sequential testing and Bayesian methods (updated 2025)
Google Analytics Experiments: A/B testing in Google Analytics (updated 2025)
VWO A/B Testing Guide: conversion optimization best practices (updated 2025)

Disclaimer: This calculator provides statistical guidance. Business decisions should consider effect size, cost, and risk — not just p-values. Z-test assumes large samples (np and n(1−p) ≥ 5). For small samples, use Fisher exact test.
