Z-test Calculator
Free z-test calculator with step-by-step breakdown. One-sample, two-sample, one-proportion, two-prop
Z-Test — Hypothesis Testing with Known σ
One-sample, two-sample, proportion tests. p-value, confidence interval, effect size, power. Step-by-step breakdown with interactive visualization.
[Interactive calculator: Test Configuration and One-Sample Data (σ known) inputs; charts for the 95% Confidence Interval, Standard Normal Distribution: Rejection Region & p-value, Power vs Sample Size, and Effect Size vs Cohen's Benchmarks; Calculation Breakdown. In the confidence interval chart, the red line marks the null value (100); a CI that includes the null value means the result is not significant.]
⚠️ For educational and informational purposes only. Verify with a qualified professional.
Key Takeaways
- The z-test applies when the population standard deviation σ is known, or when the sample is large enough (a common rule of thumb is n > 30) that the sample standard deviation is a good stand-in for σ
- The p-value is the probability of observing a test statistic at least as extreme as yours if H₀ were true
- Reject H₀ when p-value < α. The confidence interval provides the range of plausible parameter values
- Effect size (Cohen's d for means, h for proportions) measures practical significance, not just statistical significance
- Power = P(reject H₀ | H₁ true). Target 80% power when designing studies, and use the power curve to plan sample size
- For proportion tests, the normal approximation requires np ≥ 5 and n(1−p) ≥ 5 to be valid
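The takeaways above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not the calculator's own implementation; the function name `one_sample_z` and the input values (sample mean 104, n = 36) are hypothetical, chosen to match the IQ-style setting with μ₀ = 100 and σ = 15.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z(xbar, mu0, sigma, n, alpha=0.05):
    """Two-tailed one-sample z-test with a known population sigma."""
    std_normal = NormalDist()            # standard normal, mean 0 and sd 1
    se = sigma / sqrt(n)                 # standard error of the mean
    z = (xbar - mu0) / se                # test statistic
    p_value = 2 * (1 - std_normal.cdf(abs(z)))
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    ci = (xbar - z_crit * se, xbar + z_crit * se)
    cohens_d = (xbar - mu0) / sigma      # effect size for means
    return {
        "z": z,
        "p_value": p_value,
        "ci": ci,
        "cohens_d": cohens_d,
        "reject_h0": p_value < alpha,
    }


# Hypothetical data: sample mean 104 from n = 36, tested against mu0 = 100.
result = one_sample_z(xbar=104, mu0=100, sigma=15, n=36)
print(result)
```

With these inputs the test statistic is z = 1.6, the interval contains 100, and H₀ is not rejected at α = 0.05, which shows the p-value rule and the confidence-interval rule agreeing for a two-tailed test.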
Expert Tips
z vs t: The Decision Rule
Use z when σ is known from specifications, historical data, or standardized tests (IQ, SAT). Use t when σ is estimated from your sample. For n > 120, the difference is negligible.
One-Tailed vs Two-Tailed
Only use one-tailed tests when the direction was specified BEFORE seeing data. Post-hoc switching from two-tailed to one-tailed inflates Type I error and is considered p-hacking.
Interpreting Non-Significance
"Fail to reject H₀" does NOT mean H₀ is true. Check your power — if power is low, you may simply lack the sample size to detect the effect. The CI width reveals precision.
Multiple Testing Correction
Running multiple z-tests inflates the familywise error rate. For k tests, the Bonferroni correction uses α/k. See our Bonferroni Correction Calculator.
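As a rough sketch of the correction described above, the snippet below applies the α/k threshold to a list of p-values and also shows how quickly the familywise error rate grows without correction. The helper names and the three p-values are hypothetical, and the error-rate formula assumes the tests are independent.

```python
def bonferroni(p_values, alpha=0.05):
    """Return the per-test threshold alpha/k and a reject decision per p-value."""
    k = len(p_values)
    threshold = alpha / k
    return threshold, [p < threshold for p in p_values]


def familywise_error_rate(alpha, k):
    """P(at least one false positive) across k independent tests at level alpha."""
    return 1 - (1 - alpha) ** k


p_values = [0.004, 0.020, 0.049]   # hypothetical results from three z-tests
threshold, decisions = bonferroni(p_values)

print(f"per-test threshold: {threshold:.4f}")
print(f"decisions: {decisions}")
print(f"uncorrected familywise error rate: {familywise_error_rate(0.05, 3):.4f}")
```

All three p-values fall below 0.05 on their own, but only the first survives the corrected threshold of about 0.0167.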
When to Use Each Test Type
| Scenario | Test Type | Example |
|---|---|---|
| Mean vs known value, σ known | One-sample z | IQ scores vs μ=100 (σ=15) |
| Two means, both σ known | Two-sample z | Drug vs placebo BP (large trial) |
| Proportion vs target | One-proportion z | Election poll: >50% support? |
| Two proportions | Two-proportion z | A/B test conversion rates |
| Mean vs known value, σ unknown | Use t-test instead | Small sample customer satisfaction |
| Non-normal data, small n | Use Mann-Whitney U | Ordinal or skewed data |
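For the A/B-test row in the table, a two-proportion z-test might look like the sketch below. It assumes the common pooled-proportion form of the standard error under H₀: p₁ = p₂ and a two-tailed alternative; the function name and the conversion counts are hypothetical, and the calculator itself may use a different variant.

```python
from math import asin, sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Two-tailed two-proportion z-test using the pooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)                       # pooled estimate under H0
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    cohens_h = 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))   # effect size for proportions
    return z, p_value, cohens_h


# Hypothetical A/B test: 150 of 1000 conversions vs 120 of 1000.
z, p_value, h = two_proportion_z(x1=150, n1=1000, x2=120, n2=1000)
print(f"z = {z:.3f}, p = {p_value:.4f}, h = {h:.3f}")
```

Reporting Cohen's h next to the p-value helps separate a borderline-significant result from one that is large enough to matter.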
Why Use This Calculator vs. Other Tools?
| Feature | This Calculator | R / Python | Excel |
|---|---|---|---|
| One-sample, two-sample, proportion z-tests | ✅ | ✅ | ⚠️ Manual |
| Interactive normal curve visualization | ✅ | ⚠️ Code needed | ❌ |
| Power analysis & sample size curve | ✅ | ✅ (pwr) | ❌ |
| Effect size with benchmarks | ✅ | ⚠️ Manual | ❌ |
| Step-by-step calculation breakdown | ✅ | ❌ | ❌ |
| Copy & share results | ✅ | ❌ | ❌ |
| AI-powered interpretation | ✅ | ❌ | ❌ |
| No installation / no coding required | ✅ | ❌ | ✅ |
Frequently Asked Questions
When should I use a z-test instead of a t-test?
Use the z-test when the population standard deviation σ is known (e.g., from standardized tests, manufacturing specs, or historical data). Use the t-test when σ is unknown and estimated from your sample. For very large samples (n > 120), the results are virtually identical.
What does the p-value actually mean?
The p-value is the probability of observing a test statistic as extreme as (or more extreme than) yours, assuming H₀ is true. A small p-value (< α) means the data is unlikely under H₀, providing evidence against it. It does NOT measure the probability that H₀ is true.
What is the difference between statistical and practical significance?
Statistical significance (p < α) means the observed data would be unlikely if H₀ were true. Practical significance means the effect is large enough to matter in the real world. A huge sample can make a tiny, meaningless effect statistically significant. Always report effect sizes alongside p-values.
How do I interpret the confidence interval?
A 95% CI means: if we repeated this study many times, 95% of the intervals would contain the true parameter. If the CI for a difference excludes zero (or the null value), the test is significant at that α level. Wider CIs indicate less precision.
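The repeated-sampling interpretation can be checked with a small simulation. The sketch below draws many samples from a normal population with known σ, builds a z-interval from each, and counts how often the interval contains the true mean. The function name, population parameters, sample size, trial count, and seed are all arbitrary choices for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist, fmean


def ci_coverage(mu=100.0, sigma=15.0, n=25, trials=2000, level=0.95, seed=0):
    """Fraction of simulated z-intervals that contain the true mean mu."""
    rng = random.Random(seed)                      # seeded for reproducibility
    z_crit = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z_crit * sigma / sqrt(n)          # same for every sample, sigma known
    hits = 0
    for _ in range(trials):
        xbar = fmean(rng.gauss(mu, sigma) for _ in range(n))
        if xbar - half_width <= mu <= xbar + half_width:
            hits += 1
    return hits / trials


coverage = ci_coverage()
print(f"empirical coverage: {coverage:.3f}")
```

The empirical coverage should land close to 0.95, with some run-to-run variation that shrinks as `trials` grows.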
What is statistical power and why does it matter?
Power is the probability of correctly rejecting H₀ when H₁ is true. Low power (< 80%) means you might miss real effects (Type II error). Use the power curve to plan your sample size before collecting data.
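A power curve of this kind can be approximated directly from the normal distribution. The sketch below covers a two-tailed one-sample z-test with the effect expressed as Cohen's d; the function names are hypothetical, and the sample-size formula uses the usual approximation that ignores the negligible rejection probability in the opposite tail.

```python
from math import ceil, sqrt
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def power_one_sample(d, n, alpha=0.05):
    """Power of a two-tailed one-sample z-test for standardized effect d."""
    z_crit = _STD.inv_cdf(1 - alpha / 2)
    shift = d * sqrt(n)                  # mean of the z statistic under H1
    return _STD.cdf(shift - z_crit) + _STD.cdf(-shift - z_crit)


def required_n(d, alpha=0.05, power=0.80):
    """Smallest n from n = ((z_{alpha/2} + z_beta) / d)^2, rounded up."""
    z_alpha = _STD.inv_cdf(1 - alpha / 2)
    z_beta = _STD.inv_cdf(power)
    return ceil(((z_alpha + z_beta) / d) ** 2)


n_needed = required_n(d=0.5)
print(f"n for 80% power at d = 0.5: {n_needed}")
print(f"power at n = 10: {power_one_sample(0.5, 10):.3f}")
print(f"power at n = {n_needed}: {power_one_sample(0.5, n_needed):.3f}")
```

Evaluating `power_one_sample` over a range of n values reproduces the shape of a power-versus-sample-size curve for a chosen effect size.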
Can I use the z-test for small samples?
The z-test requires that the sampling distribution is approximately normal. For means, this holds when the population is normal or n is large (CLT, typically n ≥ 30). For proportions, you need np ≥ 5 and n(1−p) ≥ 5. For small samples with unknown σ, use the t-test.
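The proportion condition above is easy to enforce in code before running the test. The sketch below checks np₀ ≥ 5 and n(1−p₀) ≥ 5 and then runs a right-tailed one-proportion z-test, in the spirit of the "more than 50% support?" poll example. The function name and the poll numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_proportion_z(successes, n, p0, min_count=5):
    """Right-tailed one-proportion z-test (H1: p > p0) with an assumption check."""
    if n * p0 < min_count or n * (1 - p0) < min_count:
        raise ValueError(
            "normal approximation not justified: "
            "need n*p0 and n*(1-p0) to be at least min_count"
        )
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)         # standard error under H0
    z = (p_hat - p0) / se
    p_value = 1 - NormalDist().cdf(z)    # upper-tail probability
    return z, p_value


# Hypothetical poll: 540 of 1000 respondents in favor, tested against p0 = 0.5.
z_stat, p_val = one_proportion_z(540, 1000, 0.5)
print(f"z = {z_stat:.3f}, one-tailed p = {p_val:.4f}")
```

Calling the function with a very small sample, such as 3 successes out of 8, raises an error instead of returning a p-value that the normal approximation cannot support.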
What are Type I and Type II errors?
Type I error (α): rejecting H₀ when it's actually true (false positive). Type II error (β): failing to reject H₀ when H₁ is true (false negative). Power = 1 − β. You control α by choosing your significance level; for a fixed α and effect size, you reduce β by increasing sample size.
Should I use one-tailed or two-tailed?
Use two-tailed unless you have a strong, pre-specified directional hypothesis. One-tailed tests have more power for that direction but cannot detect effects in the opposite direction. Switching from two-tailed to one-tailed after seeing data is considered p-hacking.
Disclaimer: This calculator is for educational and research planning purposes. It uses the Abramowitz & Stegun normal CDF approximation (accuracy ≈ 7.5×10⁻⁸). For publishable research, verify results with established statistical software (R, Python scipy, SAS, SPSS). Always check assumptions: known σ, independence, normality (or large n via CLT), and adequate np for proportion tests. Not professional statistical consulting advice.
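For readers who want to see what such an approximation looks like, here is a sketch of the polynomial formula 26.2.17 from Abramowitz & Stegun, whose stated absolute error bound is 7.5×10⁻⁸. It is an assumption that this is the exact formula the calculator uses; the function name `norm_cdf_as` is hypothetical. The snippet compares the result against Python's `statistics.NormalDist` as a reference.

```python
from math import exp, pi, sqrt
from statistics import NormalDist

# Constants from Abramowitz & Stegun, formula 26.2.17.
_P = 0.2316419
_B = (0.319381530, -0.356563782, 1.781477937, -1.821255978, 1.330274429)


def norm_cdf_as(x):
    """Standard normal CDF via the A&S 26.2.17 polynomial approximation."""
    t = 1.0 / (1.0 + _P * abs(x))
    poly = 0.0
    for b in reversed(_B):               # Horner form of b1*t + b2*t^2 + ... + b5*t^5
        poly = (poly + b) * t
    pdf = exp(-0.5 * x * x) / sqrt(2 * pi)
    upper_tail = pdf * poly              # approximates 1 - Phi(|x|)
    return 1.0 - upper_tail if x >= 0 else upper_tail


# Largest deviation from the library CDF on a grid from -6 to 6.
reference = NormalDist()
max_err = max(
    abs(norm_cdf_as(i / 100) - reference.cdf(i / 100)) for i in range(-600, 601)
)
print(f"max abs error on [-6, 6]: {max_err:.2e}")
```

An error of this size is negligible for typical p-values, but it can matter in the far tails, which is one reason to confirm extreme results with established software.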