
Pearson Correlation — Linear Association

Compute r, R², t-test, 95% CI (Fisher z). Scatter plot, residuals, step-by-step.

Concept Fundamentals

  • Pearson r = Σ(x−x̄)(y−ȳ)/√(SS_x·SS_y) — measures linear correlation
  • Range: −1 to +1 — sign gives direction, magnitude gives strength
  • r² — coefficient of determination, the percentage of shared variance
  • Assumptions: linearity & normality — Pearson r is a parametric measure

Why This Statistical Analysis Matters

Why: Pearson r quantifies linear association. Essential for regression, hypothesis testing.

How: Enter (x,y) pairs. Get r, R², t-test, Fisher z CI, scatter plot.

  • r ∈ [-1,1]
  • R² = r²
  • Fisher z for CI
Example Results (n = 5)

  • Pearson r: 0.9948 — strong positive correlation
  • R²: 0.9897
  • t-statistic: 17.0000 (df = 3)
  • p-value: ≈ 4.4×10⁻⁴
  • 95% CI for r (Fisher z): [0.9207, 0.9997]
  • Regression line: ŷ = 0.3000 + 1.7000x

Scatter Plot with Regression Line

Residuals Plot

Calculation Breakdown

DATA
Sample size n
5
Number of paired observations
COMPUTATION
Mean x̄
3.0000
Σx/n = 15.00/5
Mean ȳ
5.4000
Σy/n = 27.00/5
Pearson r
0.9948
Σ(xᵢ−x̄)(yᵢ−ȳ) / √(Σ(xᵢ−x̄)² × Σ(yᵢ−ȳ)²)
R²
0.9897
r² = coefficient of determination
HYPOTHESIS TEST
t-statistic
17.0000
t = r√(n−2)/√(1−r²) = 0.9948√3/√(1−0.9897)
df
3
n - 2
p-value
≈ 4.4×10⁻⁴
H₀: ρ = 0
95% CI (FISHER Z)
Fisher z
2.9796
z = 0.5·ln((1+r)/(1−r))
SE(z)
0.7071
1/√(n−3) = 1/√2
95% CI for r
[0.9207, 0.9997]
Back-transform z ± 1.96·SE(z)
REGRESSION
Regression line
ŷ = 0.3000 + 1.7000x
Least-squares fit

Step-by-Step Computation Table

i | xᵢ | yᵢ | xᵢ−x̄ | yᵢ−ȳ | (xᵢ−x̄)(yᵢ−ȳ) | (xᵢ−x̄)² | (yᵢ−ȳ)²
1 | 1.00 | 2.00 | −2.0000 | −3.4000 | 6.8000 | 4.0000 | 11.5600
2 | 2.00 | 4.00 | −1.0000 | −1.4000 | 1.4000 | 1.0000 | 1.9600
3 | 3.00 | 5.00 | 0.0000 | −0.4000 | 0.0000 | 0.0000 | 0.1600
4 | 4.00 | 7.00 | 1.0000 | 1.6000 | 1.6000 | 1.0000 | 2.5600
5 | 5.00 | 9.00 | 2.0000 | 3.6000 | 7.2000 | 4.0000 | 12.9600

r = Σ(xᵢ−x̄)(yᵢ−ȳ) / √(Σ(xᵢ−x̄)² × Σ(yᵢ−ȳ)²) = 17.0000 / √(10.0000 × 29.2000) = 0.9948
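As a sanity check, the column sums and the final r from the table can be reproduced with a few lines of Python (a minimal sketch, not the calculator's own code; all variable names here are my own):

```python
from math import sqrt

# Sanity-check the worked example above (same data as the table).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 7.0, 9.0]
n = len(x)
mx, my = sum(x) / n, sum(y) / n                       # x̄ = 3.0, ȳ = 5.4

sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))  # Σ(xᵢ−x̄)(yᵢ−ȳ) = 17.0
sxx = sum((a - mx) ** 2 for a in x)                   # Σ(xᵢ−x̄)² = 10.0
syy = sum((b - my) ** 2 for b in y)                   # Σ(yᵢ−ȳ)² = 29.2

r = sxy / sqrt(sxx * syy)
print(round(r, 4))  # 0.9948
```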

For educational and informational purposes only. Verify with a qualified professional.

📈 Statistical Insights

  • r — correlation coefficient, range [−1, 1]
  • R² — shared variance
  • t — tests H₀: ρ = 0

Key Takeaways

  • Pearson r: Measures linear correlation. r ∈ [−1, 1]. r = ±1 means perfect linear relationship.
  • Formula: r = [nΣxᵢyᵢ − ΣxᵢΣyᵢ] / √[(nΣxᵢ²−(Σxᵢ)²)(nΣyᵢ²−(Σyᵢ)²)]
  • R² = r²: Coefficient of determination — fraction of variance in y explained by x.
  • t-test: t = r√(n−2)/√(1−r²), df = n−2. Tests H₀: ρ = 0.
  • 95% CI: Fisher z-transform, then back-transform for r.
  • Regression line: y = a + bx where b = r×(sy/sx), a = ȳ − b×x̄.

Did You Know?

📊 Karl Pearson developed the Pearson correlation in 1895. It's the most widely used measure of linear association in statistics. (Source: Statistical History)
📈 R² tells you how much of the variation in y is 'explained' by x. R² = 0.81 means 81% of variance is explained by the linear model. (Source: Regression Interpretation)
🔬 The p-value tests H₀: ρ = 0. A small p-value suggests the correlation is statistically significant (unlikely due to chance). (Source: Hypothesis Testing)
📐 Pearson r assumes a linear relationship. For curved relationships, consider Spearman or polynomial regression. (Source: Assumptions)
🧪 Correlation does not imply causation. Both variables could be influenced by a third factor. (Source: Causation Principle)
📉 Residuals = observed − predicted. A good fit has residuals scattered randomly around zero with no pattern. (Source: Model Fit)

How Pearson r is Computed

Step 1: Compute means x̄ and ȳ.

Step 2: For each pair, compute (xᵢ − x̄)(yᵢ − ȳ). Sum to get covariance numerator.

Step 3: Compute Σ(xᵢ − x̄)² and Σ(yᵢ − ȳ)². Multiply and take square root for denominator.

Step 4: r = numerator / denominator. Always between −1 and 1.

Alternative (raw-score) formula: r = [nΣxᵢyᵢ − ΣxᵢΣyᵢ] / √[(nΣxᵢ²−(Σxᵢ)²)(nΣyᵢ²−(Σyᵢ)²)] — algebraically equivalent, and convenient for hand computation because it avoids computing deviations from the means.
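The four steps above translate directly into a small Python function (an illustrative sketch under my own naming, not the calculator's implementation):

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation via the deviation-score steps above."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length samples with n >= 2")
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n                       # Step 1: means
    num = sum((a - mx) * (b - my) for a, b in zip(x, y))  # Step 2: covariance numerator
    den = sqrt(sum((a - mx) ** 2 for a in x) *
               sum((b - my) ** 2 for b in y))             # Step 3: denominator
    return num / den                                      # Step 4: always in [-1, 1]

print(round(pearson_r([1, 2, 3, 4, 5], [2, 4, 5, 7, 9]), 4))  # 0.9948
```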

Hypothesis Test and Confidence Interval

t-test for H₀: ρ = 0

t = r√(n−2)/√(1−r²), df = n−2. p-value from t-distribution. Reject H₀ if p < α.

Fisher z-transform for CI

z = 0.5·ln((1+r)/(1−r)). 95% CI: z ± 1.96/√(n−3). Back-transform to get r interval.
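Both the t statistic and the Fisher-z interval need only elementary functions, so they can be sketched with the standard library alone (names are my own; the exact p-value is omitted because it requires a t-distribution CDF, e.g. scipy.stats.t.sf, which the standard library does not provide):

```python
from math import sqrt, log, tanh

def correlation_inference(r, n, z_crit=1.96):
    """t statistic for H0: rho = 0, plus a 95% Fisher-z CI for r."""
    t = r * sqrt(n - 2) / sqrt(1 - r * r)      # df = n - 2
    z = 0.5 * log((1 + r) / (1 - r))           # Fisher z-transform
    se = 1 / sqrt(n - 3)                       # SE of z
    lo = tanh(z - z_crit * se)                 # back-transform: tanh inverts
    hi = tanh(z + z_crit * se)                 #   the Fisher z-transform
    return t, (lo, hi)

r_example = 17 / sqrt(292)                     # r from the worked example
t, ci = correlation_inference(r_example, 5)
print(round(t, 2), round(ci[0], 4), round(ci[1], 4))  # 17.0 0.9207 0.9997
```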

Regression Line

The regression line is ŷ = a + bx. Slope b = r×(sy/sx) where sy, sx are standard deviations. Intercept a = ȳ − b×x̄. The line minimizes the sum of squared residuals.
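A short sketch of the least-squares fit (my own function names; b is computed as Sxy/Sxx, which is algebraically the same as r×(sy/sx)):

```python
def least_squares_line(x, y):
    """Slope b and intercept a of the least-squares line y-hat = a + b*x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    b = sxy / sxx            # equivalent to r * (sy / sx)
    a = my - b * mx          # line passes through (x̄, ȳ)
    return a, b

a, b = least_squares_line([1, 2, 3, 4, 5], [2, 4, 5, 7, 9])
print(f"y-hat = {a:.4f} + {b:.4f}x")  # y-hat = 0.3000 + 1.7000x
```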

Step-by-Step: Computing r

Step 1: Enter your paired (x, y) data. Compute x̄ = Σx/n and ȳ = Σy/n.

Step 2: For each pair, compute deviations: (xᵢ − x̄) and (yᵢ − ȳ).

Step 3: Compute products (xᵢ−x̄)(yᵢ−ȳ) and sum them for the covariance numerator.

Step 4: Compute Σ(xᵢ−x̄)² and Σ(yᵢ−ȳ)². Multiply and take √ for the denominator.

Step 5: r = numerator / denominator. Check: −1 ≤ r ≤ 1 always.

Residuals and Model Fit

Residuals = observed y − predicted ŷ. A good linear fit has residuals scattered randomly around zero with no pattern. If residuals show a curve (e.g., U-shape), the relationship may be nonlinear — consider transforming variables or using a different model. The sum of squared residuals is minimized by the least-squares regression line.
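For the worked example, the residuals can be listed directly (a small illustration using the fitted line ŷ = 0.3 + 1.7x from above; note they sum to zero, as least-squares residuals always do):

```python
# Residuals e_i = y_i - (a + b * x_i) for the worked example.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 7, 9]
a, b = 0.3, 1.7  # fitted intercept and slope from the example
residuals = [yi - (a + b * xi) for xi, yi in zip(x, y)]
print([round(e, 2) for e in residuals])  # [0.0, 0.3, -0.4, -0.1, 0.2]
```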

Interpretation Guide

|r| | Strength
0.0 – 0.3 | Weak
0.3 – 0.7 | Moderate
0.7 – 1.0 | Strong

Frequently Asked Questions

What does a negative correlation mean?

r < 0 means as x increases, y tends to decrease. The relationship is inverse.

How do I interpret the p-value?

p < 0.05 typically means the correlation is statistically significant — unlikely to have occurred by chance if the true correlation were zero.

What is the confidence interval for r?

The 95% CI gives a range of plausible values for the true population correlation. If it includes 0, the correlation may not be significant.

Does correlation imply causation?

No. A high correlation can be due to a third variable, coincidence, or reverse causation. Always consider the study design.

When is Pearson r inappropriate?

When the relationship is nonlinear, data has outliers, or variables are ordinal. Consider Spearman or Kendall.

What is R²?

R² = r². It is the proportion of variance in y explained by the linear relationship with x. R² = 0.64 means 64% explained.

Formulas Reference

r = [nΣxᵢyᵢ − ΣxᵢΣyᵢ] / √[(nΣxᵢ²−(Σxᵢ)²)(nΣyᵢ²−(Σyᵢ)²)]

R² = r²

t = r√(n−2)/√(1−r²), df = n−2

Fisher z: z = 0.5·ln((1+r)/(1−r))

95% CI: z ± 1.96/√(n−3), back-transform

Regression: ŷ = a + bx, b = r×(sy/sx), a = ȳ − b×x̄
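The raw-score form of the Pearson formula listed first above can also be coded directly (an illustrative sketch with my own names; it gives the same r as the deviation-score form):

```python
from math import sqrt

def pearson_r_raw(x, y):
    """Computational (raw-score) form of the Pearson formula."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    sum_xy = sum(a * b for a, b in zip(x, y))
    sum_x2 = sum(a * a for a in x)
    sum_y2 = sum(b * b for b in y)
    num = n * sum_xy - sum_x * sum_y
    den = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return num / den

print(round(pearson_r_raw([1, 2, 3, 4, 5], [2, 4, 5, 7, 9]), 4))  # 0.9948
```

In exact arithmetic this is identical to the deviation form, but with large raw values it is more prone to floating-point cancellation, so the deviation form is usually preferred in software.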

When to Use Pearson vs Other Correlations

Measure | Use When | Assumptions
Pearson r | Linear relationship, interval/ratio data | Linearity, normality, homoscedasticity
Spearman ρ | Monotonic relationship, ordinal data, robust to outliers | None (rank-based)
Kendall τ | Small samples, many ties | None (rank-based)

Disclaimer: This calculator provides Pearson correlation analysis for educational purposes. Correlation does not imply causation. Verify results for research or professional use. Uses Abramowitz & Stegun normal CDF approximation (accuracy ≈ 7.5×10⁻⁸).
